Smart Paper Notes

A model of semantic completion in generative episodic memory

Zahra Fayyaz , Aya Altamimi , Sen Cheng , Laurenz Wiskott

Categories: Computer Vision | Machine Learning

2021-11-26

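Many different studies have suggested that episodic memory is a generative process, but most computational models adopt a storage view. In this work, we propose a computational model for generative episodic memory. It is based on the central hypothesis that the hippocampus stores and retrieves selected aspects of an episode as a memory trace, which is necessarily incomplete. At recall, the neocortex reasonably fills in the missing information based on general semantic information, in a process we call semantic completion. As episodes, we use images of digits (MNIST) augmented by different backgrounds representing context. Our model is based on a VQ-VAE, which generates a compressed latent representation in the form of an index matrix that still has some spatial resolution. We assume that attention selects some parts of the index matrix while others are discarded; the selected parts then represent the gist of the episode and are stored as a memory trace. At recall, the missing parts are filled in by a PixelCNN, modeling semantic completion, and the completed index matrix is then decoded into a full image by the VQ-VAE. The model is able to complete missing parts of a memory trace in a semantically plausible way, up to the point where it can generate plausible images from scratch. Due to the combinatorics in the index matrix, the model generalizes well to images it was not trained on. Compression as well as semantic completion contribute to a strong reduction in memory requirements and to robustness to noise. Finally, we also model an episodic memory experiment and can reproduce that semantically congruent contexts are always recalled better than incongruent ones, that high attention levels improve memory accuracy in both cases, and that contexts that are not remembered correctly are more often remembered semantically congruently than completely wrong.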
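The store-and-recall pipeline described in this abstract can be illustrated with a minimal sketch. It covers only the index-matrix step: an attention mask keeps part of the matrix as the memory trace, and the missing entries are filled in raster order. Everything here is an assumption for illustration: the `MISSING` marker, the function names, the toy 4x4 matrix, and above all `copy_left_neighbor`, which is a trivial stand-in for the PixelCNN prior. VQ-VAE encoding and decoding are omitted.

```python
import numpy as np

MISSING = -1  # hypothetical marker for index-matrix entries that were discarded


def make_memory_trace(index_matrix, keep_mask):
    """Keep the attended indices and mark everything else as missing."""
    return np.where(keep_mask, index_matrix, MISSING)


def complete_trace(trace, fill_fn):
    """Fill missing entries in raster order, conditioning on known entries.

    fill_fn(completed, row, col) stands in for an autoregressive prior
    such as a PixelCNN and must return a codebook index.
    """
    completed = trace.copy()
    rows, cols = completed.shape
    for r in range(rows):
        for c in range(cols):
            if completed[r, c] == MISSING:
                completed[r, c] = fill_fn(completed, r, c)
    return completed


def copy_left_neighbor(completed, r, c):
    """Toy stand-in prior: repeat the left neighbor, or use 0 at a row start."""
    if c > 0:
        return completed[r, c - 1]
    return 0


# Toy index matrix, as a VQ-VAE encoder might produce (values are made up).
index_matrix = np.array([[3, 3, 5, 5],
                         [3, 7, 7, 5],
                         [1, 7, 7, 2],
                         [1, 1, 2, 2]])

# Attention keeps only the central 2x2 block.
keep_mask = np.zeros_like(index_matrix, dtype=bool)
keep_mask[1:3, 1:3] = True

trace = make_memory_trace(index_matrix, keep_mask)    # incomplete memory trace
recalled = complete_trace(trace, copy_left_neighbor)  # completed index matrix
```

Stored entries survive recall unchanged, while discarded entries are regenerated from the prior, so the trace needs to hold only 4 of the 16 indices in this toy case.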

Translated by Google Translate

Falsification of Learning-Based Controllers through Multi-Fidelity Bayesian Optimization

Zahra Shahrooei , Mykel J. Kochenderfer , Ali Baheri

Categories: Machine Learning

2022-12-28

Simulation-based falsification is a practical testing method to increase confidence that the system will meet safety requirements. Because full-fidelity simulations can be computationally demanding, we investigate the use of simulators with different levels of fidelity. As a first step, we express the overall safety specification in terms of environmental parameters and structure this safety specification as an optimization problem. We propose a multi-fidelity falsification framework using Bayesian optimization, which is able to determine at which level of fidelity we should conduct a safety evaluation in addition to finding possible instances from the environment that cause the system to fail. This method allows us to automatically switch between inexpensive, inaccurate information from a low-fidelity simulator and expensive, accurate information from a high-fidelity simulator in a cost-effective way. Our experiments on various environments in simulation demonstrate that multi-fidelity Bayesian optimization has falsification performance comparable to single-fidelity Bayesian optimization but with much lower cost.
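To make the idea of falsification as an optimization problem concrete, here is a minimal sketch under stated assumptions. It does not implement Bayesian optimization or the adaptive fidelity selection described in the abstract; it substitutes a fixed two-stage screening, where a cheap low-fidelity margin ranks candidates and only the most suspicious ones are checked at high fidelity. The two margin functions, the costs, and all names are hypothetical.

```python
def high_fidelity_robustness(x):
    """Toy safety margin; negative values mean the requirement is violated."""
    return (x - 0.7) ** 2 - 0.02


def low_fidelity_robustness(x):
    """Cheap, biased approximation of the same margin."""
    return (x - 0.65) ** 2 - 0.005


def falsify(candidates, top_k, low_cost=1.0, high_cost=10.0):
    """Screen all candidates cheaply, then confirm the top_k at high fidelity."""
    ranked = sorted(candidates, key=low_fidelity_robustness)
    cost = low_cost * len(candidates)
    counterexamples = []
    for x in ranked[:top_k]:
        cost += high_cost
        if high_fidelity_robustness(x) < 0:
            counterexamples.append(x)
    return counterexamples, cost


# Environment parameter values to test: 0.0, 0.05, ..., 1.0
candidates = [i / 20 for i in range(21)]
counterexamples, cost = falsify(candidates, top_k=3)

# Cost of evaluating every candidate at high fidelity, for comparison.
full_cost = 10.0 * len(candidates)
```

In this toy setup the screening finds confirmed counterexamples at a fraction of `full_cost`; how well that carries over depends on how strongly the low-fidelity margin correlates with the high-fidelity one.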


SynCLay: Interactive Synthesis of Histology Images from Bespoke Cellular Layouts

Srijay Deshpande , Muhammad Dawood , Fayyaz Minhas , Nasir Rajpoot

Categories: Computer Vision | Machine Learning

2022-12-28

Automated synthesis of histology images has several potential applications in computational pathology. However, no existing method can generate realistic tissue images with a bespoke cellular layout or user-defined histology parameters. In this work, we propose a novel framework called SynCLay (Synthesis from Cellular Layouts) that can construct realistic and high-quality histology images from user-defined cellular layouts along with annotated cellular boundaries. Tissue image generation based on bespoke cellular layouts through the proposed framework allows users to generate different histological patterns from arbitrary topological arrangements of different types of cells. SynCLay-generated synthetic images can be helpful in studying the role of different types of cells present in the tumor microenvironment. Additionally, they can assist in balancing the distribution of cellular counts in tissue images for designing accurate cellular composition predictors by minimizing the effects of data imbalance. We train SynCLay in an adversarial manner and integrate a nuclear segmentation and classification model in its training to refine nuclear structures and generate nuclear masks in conjunction with synthetic images. During inference, we combine the model with another parametric model for generating colon images and associated cellular counts as annotations, given the grade of differentiation and cell densities of different cells. We assess the generated images quantitatively and report on feedback from trained pathologists who assigned realism scores to a set of images generated by the framework. The average realism score across all pathologists for synthetic images was as high as that for the real images. We also show that augmenting limited real data with the synthetic data generated by our framework can significantly boost prediction performance on the cellular composition prediction task.


Adaptive Blind Watermarking Using Psychovisual Image Features

Arezoo PariZanganeh , Ghazaleh Ghorbanzadeh , Zahra Nabizadeh ShahreBabak , Nader Karimi , Shadrokh Samavi

Categories: Computer Vision

2022-12-25

With the growth of image editing and sharing over the internet, the importance of protecting image authorship has increased. Robust watermarking is a known approach to maintaining copyright protection. Robustness and imperceptibility are the two properties that watermarking tries to maximize, and there is usually a trade-off between them: increasing the robustness lessens the imperceptibility of the watermark. This paper proposes an adaptive method that determines the strength of the watermark embedding in different parts of the cover image according to its texture and brightness. Adaptive embedding increases the robustness while preserving the quality of the watermarked image. Experimental results also show that the proposed method can effectively reconstruct the embedded payload under different kinds of common watermarking attacks. Our proposed method has shown good performance compared to a recent technique.


COVID-19 Classification Using Deep Learning Two-Stage Approach

Mostapha Alsaidi , Ali Saleem Altaher , Muhammad Tanveer Jan , Ahmed Altaher , Zahra Salekshahrezaee

Categories: Computer Vision | Machine Learning

2022-11-28

In this paper, deep-learning-based approaches, namely fine-tuning of pretrained convolutional neural networks (VGG16 and VGG19) and end-to-end training of a developed CNN model, have been used to classify X-ray images into four classes: COVID-19, normal, opacity, and pneumonia. A dataset containing more than 20,000 X-ray scans was retrieved from Kaggle and used in this experiment. A two-stage classification approach was implemented and compared to the one-shot classification approach. Our hypothesis was that a two-stage model would achieve better performance than a one-shot model. Our results show otherwise, as VGG16 achieved 95% accuracy using the one-shot approach over 5 folds of training. Future work will focus on a more robust implementation of the two-stage classification model Covid-TSC. The main improvement will be allowing data to flow from the output of stage-1 to the input of stage-2, where the stage-1 and stage-2 models are VGG16 models fine-tuned on the COVID-19 dataset.


Automated Deep Aberration Detection from Chromosome Karyotype Images

Zahra Shamsi , Drew Bryant , Jacob Wilson , Xiaoyu Qu , Avinava Dubey , Konik Kothari , Mostafa Dehghani , Mariya Chavarha , Valerii Likhosherstov , Brian Williams

Categories: Computer Vision | Machine Learning

2022-11-20

Chromosome analysis is essential for diagnosing genetic disorders. For hematologic malignancies, identification of somatic clonal aberrations by karyotype analysis remains the standard of care. However, karyotyping is costly and time-consuming because of the largely manual process and the expertise required in identifying and annotating aberrations. Efforts to automate karyotype analysis have so far fallen short in aberration detection. Using a training set of ~10k patient specimens and ~50k karyograms spanning over 5 years from the Fred Hutchinson Cancer Center, we created a labeled set of images representing individual chromosomes. These individual chromosomes were used to train and assess deep learning models for classifying the 24 human chromosomes and identifying chromosomal aberrations. The top-accuracy models utilized the recently introduced Topological Vision Transformers (TopViTs) with 2-level-block-Toeplitz masking to incorporate structural inductive bias. TopViT outperformed CNN (Inception) models with >99.3% accuracy for chromosome identification, and exhibited accuracies >99% for aberration detection in most aberrations. Notably, we were able to show high-quality performance even in "few shot" learning scenarios. Incorporating the definition of clonality substantially improved both precision and recall (sensitivity). When applied to "zero shot" scenarios, the model captured aberrations without training, with perfect precision at >50% recall. Together, these results show that modern deep learning models can approach expert-level performance for chromosome aberration detection. To our knowledge, this is the first study demonstrating the downstream effectiveness of TopViTs. These results open up exciting opportunities not only for expediting patient results but also for providing a scalable technology for early screening of low-abundance chromosomal lesions.


BERT on a Data Diet: Finding Important Examples by Gradient-Based Pruning

Mohsen Fayyaz , Ehsan Aghazadeh , Ali Modarressi , Mohammad Taher Pilehvar , Yadollah Yaghoobzadeh , Samira Ebrahimi Kahou

Categories: Natural Language Processing

2022-11-10

Current pre-trained language models rely on large datasets for achieving state-of-the-art performance. However, past research has shown that not all examples in a dataset are equally important during training. In fact, it is sometimes possible to prune a considerable fraction of the training set while maintaining the test performance. Established on standard vision benchmarks, two gradient-based scoring metrics for finding important examples are GraNd and its estimated version, EL2N. In this work, we employ these two metrics for the first time in NLP. We demonstrate that these metrics need to be computed after at least one epoch of fine-tuning and they are not reliable in early steps. Furthermore, we show that by pruning a small portion of the examples with the highest GraNd/EL2N scores, we can not only preserve the test accuracy, but also surpass it. This paper details adjustments and implementation choices which enable GraNd and EL2N to be applied to NLP.
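As a rough illustration of score-based pruning, the sketch below computes EL2N in its usual form, the L2 norm of the difference between the softmax output and the one-hot label, and then drops the highest-scoring fraction of examples. The logits and labels are made-up toy values; in practice they would come from a model fine-tuned for at least one epoch, as the abstract recommends. The function names and the 25% pruning fraction are assumptions, and the paper's specific adjustments for NLP are not reproduced here.

```python
import numpy as np


def el2n_scores(logits, labels):
    """EL2N per example: L2 norm of (softmax probabilities - one-hot label)."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(shifted)
    probs /= probs.sum(axis=1, keepdims=True)
    one_hot = np.eye(logits.shape[1])[labels]
    return np.linalg.norm(probs - one_hot, axis=1)


def prune_highest(scores, fraction):
    """Return sorted indices kept after dropping the top-scoring fraction."""
    n_drop = int(len(scores) * fraction)
    order = np.argsort(scores)  # ascending: lowest scores first
    kept = order[: len(scores) - n_drop]
    return np.sort(kept)


# Toy logits for four examples and three classes (values are made up).
logits = np.array([[4.0, 0.0, 0.0],   # confident and correct
                   [0.0, 4.0, 0.0],   # confident but wrong: highest score
                   [1.0, 1.0, 1.0],   # uncertain
                   [0.0, 0.0, 3.0]])  # confident and correct
labels = np.array([0, 0, 0, 2])

scores = el2n_scores(logits, labels)
kept = prune_highest(scores, fraction=0.25)  # drops one example out of four
```

The confidently misclassified example receives the largest score and is the one removed, which matches the intuition that the very highest scores can flag noisy or mislabeled data.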


Deep Learning for Time Series Anomaly Detection: A Survey

Zahra Zamanzadeh Darban , Geoffrey I. Webb , Shirui Pan , Charu C. Aggarwal , Mahsa Salehi

Categories: Machine Learning | Artificial Intelligence

2022-11-09

Time series anomaly detection has applications in a wide range of research fields and industries, including manufacturing and healthcare. The presence of anomalies can indicate novel or unexpected events, such as production faults, system defects, or heart fluttering, and is therefore of particular interest. The large size and complex patterns of time series have led researchers to develop specialised deep learning models for detecting anomalous patterns. This survey provides a structured and comprehensive overview of state-of-the-art deep learning models for time series anomaly detection. It provides a taxonomy based on the factors that divide anomaly detection models into different categories. Aside from describing the basic anomaly detection technique for each category, the advantages and limitations are also discussed. Furthermore, this study includes examples of deep anomaly detection in time series across various application domains in recent years. It finally summarises open issues in research and challenges faced while adopting deep anomaly detection models.


Multi-Fidelity Cost-Aware Bayesian Optimization

Zahra Zanjani Foumani , Mehdi Shishehbor , Amin Yousefpour , Ramin Bostanabad

Categories: Machine Learning (Statistics)

2022-11-04

Bayesian optimization (BO) is increasingly employed in critical applications such as materials design and drug discovery. An increasingly popular strategy in BO is to forgo the sole reliance on high-fidelity data and instead use an ensemble of information sources which provide inexpensive low-fidelity data. The overall premise of this strategy is to reduce the overall sampling costs by querying inexpensive low-fidelity sources whose data are correlated with high-fidelity samples. Here, we propose a multi-fidelity cost-aware BO framework that dramatically outperforms the state-of-the-art technologies in terms of efficiency, consistency, and robustness. We demonstrate the advantages of our framework on analytic and engineering problems and argue that these benefits stem from our two main contributions: (1) we develop a novel acquisition function for multi-fidelity cost-aware BO that safeguards the convergence against the biases of low-fidelity data, and (2) we tailor a newly developed emulator for multi-fidelity BO which enables us to not only simultaneously learn from an ensemble of multi-fidelity datasets, but also identify the severely biased low-fidelity sources that should be excluded from BO.


Machine Learning Challenges of Biological Factors in Insect Image Data

Nicholas Pellegrino , Zahra Gharaee , Paul Fieguth

Categories: Computer Vision

2022-11-04

The BIOSCAN project, led by the International Barcode of Life Consortium, seeks to study changes in biodiversity on a global scale. One component of the project is focused on studying the species interactions and dynamics of all insects. In addition to genetically barcoding insects, over 1.5 million images per year will be collected, each needing taxonomic classification. With the immense volume of incoming images, relying solely on expert taxonomists to label the images would be impossible; however, artificial intelligence and computer vision technology may offer a viable high-throughput solution. Additional tasks, including manually weighing individual insects to determine biomass, remain tedious and costly. Here again, computer vision may offer an efficient and compelling alternative. While the use of computer vision methods is appealing for addressing these problems, significant challenges arise from biological factors. This paper formulates these challenges in the context of machine learning.
