Smart Paper Notes

Demo: Untethered Haptic Teleoperation for Nuclear Decommissioning using a Low-Power Wireless Control Technology

Joseph Bolarinwa , Alex Smith , Adnan Aijaz , Aleksandar Stanoev , Manuel Giuliani

Category: Robotics

2022-06-27

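Haptic teleoperation is typically realized over wired networking technologies (e.g., Ethernet), which guarantee the performance of control loops closed over the communication medium, particularly in terms of latency, jitter, and reliability. This demo shows the capability of conducting haptic teleoperation over a novel low-power wireless control technology called Gallop in a nuclear decommissioning use case. It shows the viability of Gallop for meeting the latency, timeliness, and safety requirements of haptic teleoperation. The evaluation conducted as part of the demo reveals that Gallop, implemented on an off-the-shelf Bluetooth 5.0 chipset, can replace a conventional wired TCP/IP connection and outperforms a WiFi-based wireless solution in the same use case.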

Translated by Google Translate

SpaceEdit: Learning a Unified Editing Space for Open-Domain Image Editing

Jing Shi , Ning Xu , Haitian Zheng , Alex Smith , Jiebo Luo , Chenliang Xu

Category: Computer Vision

2021-11-30

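Recently, large pretrained models (e.g., BERT, StyleGAN, CLIP) have shown strong knowledge transfer and generalization on a variety of downstream tasks within their domains. Inspired by these efforts, in this paper we propose a unified model for open-domain image editing, focusing on color and tone adjustment of open-domain images while preserving their original content and structure. Our model learns a unified editing space that is more semantic, intuitive, and easy to manipulate than the operation space (e.g., contrast, brightness, color curves) used in many existing photo editing programs. Our model belongs to the image-to-image translation framework, consists of an image encoder and decoder, and is trained on pairs of before- and after-images to produce multimodal outputs. We show that by inverting image pairs into latent codes of the learned editing space, our model can be leveraged for various downstream editing tasks such as language-guided image editing, personalized editing, editing-style clustering, and retrieval. We extensively study the unique properties of the editing space in experiments and demonstrate superior performance on the aforementioned tasks.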

Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation

Yonghui Wu , Mike Schuster , Zhifeng Chen , Quoc V. Le , Mohammad Norouzi , Wolfgang Macherey , Maxim Krikun , Yuan Cao , Qin Gao , Klaus Macherey

Category:

2016-09-26

Neural Machine Translation (NMT) is an end-to-end learning approach for automated translation, with the potential to overcome many of the weaknesses of conventional phrase-based translation systems. Unfortunately, NMT systems are known to be computationally expensive both in training and in translation inference, sometimes prohibitively so in the case of very large data sets and large models. Several authors have also charged that NMT systems lack robustness, particularly when input sentences contain rare words. These issues have hindered NMT's use in practical deployments and services, where both accuracy and speed are essential. In this work, we present GNMT, Google's Neural Machine Translation system, which attempts to address many of these issues. Our model consists of a deep LSTM network with 8 encoder and 8 decoder layers using residual connections as well as attention connections from the decoder network to the encoder. To improve parallelism and therefore decrease training time, our attention mechanism connects the bottom layer of the decoder to the top layer of the encoder. To accelerate the final translation speed, we employ low-precision arithmetic during inference computations. To improve handling of rare words, we divide words into a limited set of common sub-word units ("wordpieces") for both input and output. This method provides a good balance between the flexibility of "character"-delimited models and the efficiency of "word"-delimited models, naturally handles translation of rare words, and ultimately improves the overall accuracy of the system. Our beam search technique employs a length-normalization procedure and uses a coverage penalty, which encourages generation of an output sentence that is most likely to cover all the words in the source sentence. To directly optimize the translation BLEU scores, we consider refining the models by using reinforcement learning, but we found that the improvement in the BLEU scores was not reflected in the human evaluation. On the WMT'14 English-to-French and English-to-German benchmarks, GNMT achieves results competitive with the state of the art. Using a human side-by-side evaluation on a set of isolated simple sentences, it reduces translation errors by an average of 60% compared to Google's phrase-based production system.
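
The length-normalization and coverage-penalty idea from the beam search can be sketched as a rescoring function. This is a minimal sketch, not the production system: it assumes the scoring form s(Y, X) = log P(Y|X) / lp(Y) + cp(X; Y), with lp(Y) = ((5 + |Y|) / 6)^alpha and cp(X; Y) = beta * sum_i log(min(sum_j p_ij, 1)), none of which is stated in the abstract. The function names, the default alpha and beta values, and the small `eps` clamp are illustrative assumptions.

```python
import math


def length_penalty(length, alpha=0.2):
    """Assumed normalizer lp(Y) = ((5 + |Y|) / 6) ** alpha; equals 1 when alpha is 0."""
    return ((5.0 + length) / 6.0) ** alpha


def coverage_penalty(attention, beta=0.2, eps=1e-9):
    """Penalize source positions that received little total attention.

    attention[j][i] is the attention weight of target step j on source word i.
    eps (an assumption of this sketch) avoids log(0) for uncovered words.
    """
    num_source = len(attention[0])
    total = 0.0
    for i in range(num_source):
        covered = sum(step[i] for step in attention)
        total += math.log(max(min(covered, 1.0), eps))
    return beta * total


def beam_score(log_prob, attention, alpha=0.2, beta=0.2):
    """Length-normalized log-probability plus coverage penalty for one hypothesis."""
    target_length = len(attention)
    return log_prob / length_penalty(target_length, alpha) + coverage_penalty(attention, beta)
```

A hypothesis whose attention fully covers every source word incurs no coverage penalty, while one that leaves a source word with less than unit total attention is scored lower, which is the behavior the abstract describes.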

Causal Deep Learning: Causal Capsules and Tensor Transformers

M. Alex O. Vasilescu

Category: Machine Learning | Computer Vision

2023-01-01

We derive a set of causal deep neural networks whose architectures are a consequence of tensor (multilinear) factor analysis. Forward causal questions are addressed with a neural network architecture composed of causal capsules and a tensor transformer. The former estimate a set of latent variables that represent the causal factors, and the latter governs their interaction. Causal capsules and tensor transformers may be implemented using shallow autoencoders, but for a scalable architecture we employ block algebra and derive a deep neural network composed of a hierarchy of autoencoders. An interleaved kernel hierarchy preprocesses the data resulting in a hierarchy of kernel tensor factor models. Inverse causal questions are addressed with a neural network that implements multilinear projection and estimates the causes of effects. As an alternative to aggressive bottleneck dimension reduction or regularized regression that may camouflage an inherently underdetermined inverse problem, we prescribe modeling different aspects of the mechanism of data formation with piecewise tensor models whose multilinear projections are well-defined and produce multiple candidate solutions. Our forward and inverse neural network architectures are suitable for asynchronous parallel computation.

Skeletal Video Anomaly Detection using Deep Learning: Survey, Challenges and Future Directions

Pratik K. Mishra , Alex Mihailidis , Shehroz S. Khan

Category: Computer Vision

2022-12-31

The existing methods for video anomaly detection mostly utilize videos containing identifiable facial and appearance-based features. The use of videos with identifiable faces raises privacy concerns, especially when used in a hospital or community-based setting. Appearance-based features can also be sensitive to pixel-based noise, straining the anomaly detection methods to model the changes in the background and making it difficult to focus on the actions of humans in the foreground. Structural information in the form of skeletons describing the human motion in the videos is privacy-protecting and can overcome some of the problems posed by appearance-based features. In this paper, we present a survey of privacy-protecting deep learning anomaly detection methods using skeletons extracted from videos. We present a novel taxonomy of algorithms based on the various learning approaches. We conclude that skeleton-based approaches for anomaly detection can be a plausible privacy-protecting alternative for video anomaly detection. Lastly, we identify major open research questions and provide guidelines to address them.

Representation Learning in Deep RL via Discrete Information Bottleneck

Riashat Islam , Hongyu Zang , Manan Tomar , Aniket Didolkar , Md Mofijul Islam , Samin Yeasar Arnob , Tariq Iqbal , Xin Li , Anirudh Goyal , Nicolas Heess

Category: Machine Learning

2022-12-28

Several self-supervised representation learning methods have been proposed for reinforcement learning (RL) with rich observations. For real-world applications of RL, recovering underlying latent states is crucial, particularly when sensory inputs contain irrelevant and exogenous information. In this work, we study how information bottlenecks can be used to construct latent states efficiently in the presence of task-irrelevant information. We propose architectures that utilize variational and discrete information bottlenecks, coined as RepDIB, to learn structured factorized representations. Exploiting the expressiveness bought by factorized representations, we introduce a simple, yet effective, bottleneck that can be integrated with any existing self-supervised objective for RL. We demonstrate this across several online and offline RL benchmarks, along with a real robot arm task, where we find that compressed representations with RepDIB can lead to strong performance improvements, as the learned bottlenecks help predict only the relevant state while ignoring irrelevant information.
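
What a factorized discrete bottleneck does to a latent vector can be illustrated with a small sketch. It assumes the bottleneck is realized by nearest-neighbor vector quantization with one codebook per factor, which the abstract does not specify; `quantize_factorized` and the toy codebooks are hypothetical names and values, and no training loss is shown.

```python
def quantize_factorized(z, codebooks):
    """Split z into len(codebooks) equal groups and snap each group to its nearest codeword.

    Returns the chosen code index per group and the concatenated quantized vector.
    """
    num_groups = len(codebooks)
    if len(z) % num_groups != 0:
        raise ValueError("len(z) must be divisible by the number of groups")
    size = len(z) // num_groups

    indices, quantized = [], []
    for g, codebook in enumerate(codebooks):
        chunk = z[g * size:(g + 1) * size]

        def sq_dist(code):
            # Squared Euclidean distance between this group and a codeword.
            return sum((a - b) ** 2 for a, b in zip(chunk, code))

        best = min(range(len(codebook)), key=lambda k: sq_dist(codebook[k]))
        indices.append(best)
        quantized.extend(codebook[best])
    return indices, quantized


# Two factors, each with a two-entry codebook of 2-D codewords (toy values).
codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],
    [[0.0, 0.0], [1.0, 1.0]],
]
indices, z_q = quantize_factorized([0.9, 1.2, 0.1, -0.2], codebooks)
```

Because each group can only take one of a few codewords, small perturbations of the input that do not change the nearest codeword are discarded, which is one way a bottleneck can drop irrelevant detail.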

Metadata-guided Consistency Learning for High Content Images

Johan Fredin Haslum , Christos Matsoukas , Karl-Johan Leuchowius , Erik Müllers , Kevin Smith

Category: Computer Vision

2022-12-22

High content imaging assays can capture rich phenotypic response data for large sets of compound treatments, aiding in the characterization and discovery of novel drugs. However, extracting representative features from high content images that can capture subtle nuances in phenotypes remains challenging. The lack of high-quality labels makes it difficult to achieve satisfactory results with supervised deep learning. Self-supervised learning methods, which learn from automatically generated labels and have shown great success on natural images, offer an attractive alternative for microscopy images as well. However, we find that self-supervised learning techniques underperform on high content imaging assays. One challenge is the undesirable domain shifts present in the data, known as batch effects, which may be caused by biological noise or uncontrolled experimental conditions. To this end, we introduce Cross-Domain Consistency Learning (CDCL), a novel approach that is able to learn in the presence of batch effects. CDCL enforces the learning of biological similarities while disregarding undesirable batch-specific signals, which leads to more useful and versatile representations. These features are organised according to their morphological changes and are more useful for downstream tasks, such as distinguishing treatments and modes of action.

Forecasting West Nile Virus with Graph Neural Networks: Harnessing Spatial Dependence in Irregularly Sampled Geospatial Data

Adam Tonks , Trevor Harris , Bo Li , William Brown , Rebecca Smith

Category: Machine Learning

2022-12-21

Machine learning methods have seen increased application to geospatial environmental problems, such as precipitation nowcasting, haze forecasting, and crop yield prediction. However, many of the machine learning methods applied to mosquito population and disease forecasting do not inherently take into account the underlying spatial structure of the given data. In our work, we apply a spatially aware graph neural network model consisting of GraphSAGE layers to forecast the presence of West Nile virus in Illinois, to aid mosquito surveillance and abatement efforts within the state. More generally, we show that graph neural networks applied to irregularly sampled geospatial data can exceed the performance of a range of baseline methods including logistic regression, XGBoost, and fully-connected neural networks.
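
How a GraphSAGE-style layer uses spatial neighbors can be sketched in a few lines. This is a minimal sketch of a mean-aggregator layer, h_v' = ReLU(W_self h_v + W_neigh mean(h_u for u in N(v))), in plain Python; the node names, features, weights, and the way neighbors are chosen are illustrative assumptions, not the model or data from the paper.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def sage_mean_layer(features, neighbors, w_self, w_neigh):
    """One GraphSAGE-style layer with a mean aggregator and ReLU activation.

    features:  dict mapping node -> feature list
    neighbors: dict mapping node -> list of neighboring nodes
    Nodes without neighbors aggregate a zero vector (a choice made for this sketch).
    """
    out = {}
    for node, h in features.items():
        nbrs = neighbors.get(node, [])
        if nbrs:
            mean = [sum(features[u][d] for u in nbrs) / len(nbrs) for d in range(len(h))]
        else:
            mean = [0.0] * len(h)
        combined = [a + b for a, b in zip(matvec(w_self, h), matvec(w_neigh, mean))]
        out[node] = [max(0.0, c) for c in combined]
    return out


# Toy graph: three hypothetical sampling sites with 2-D features.
features = {"a": [1.0, 2.0], "b": [3.0, 4.0], "c": [5.0, 0.0]}
neighbors = {"a": ["b", "c"], "b": ["a"], "c": []}
identity = [[1.0, 0.0], [0.0, 1.0]]
hidden = sage_mean_layer(features, neighbors, identity, identity)
```

Because the aggregation runs over an arbitrary neighbor list rather than a fixed grid, the same layer applies to irregularly placed sampling sites, which is the property the abstract emphasizes.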

Reconstruction Probing

Najoung Kim , Jatin Khilnani , Alex Warstadt , Abed Qaddoumi

Category: Natural Language Processing

2022-12-21

We propose reconstruction probing, a new analysis method for contextualized representations based on reconstruction probabilities in masked language models (MLMs). This method relies on comparing the reconstruction probabilities of tokens in a given sequence when conditioned on the representation of a single token that has been fully contextualized and when conditioned on only the decontextualized lexical prior of the model. This comparison can be understood as quantifying the contribution of contextualization towards reconstruction: the difference in the reconstruction probabilities can only be attributed to the representational change of the single token induced by contextualization. We apply this analysis to three MLMs and find that contextualization boosts reconstructability of tokens that are close to the token being reconstructed in terms of linear and syntactic distance. Furthermore, we extend our analysis to finer-grained decomposition of contextualized representations, and we find that these boosts are largely attributable to static and positional embeddings at the input layer.

Hierarchically branched diffusion models for efficient and interpretable multi-class conditional generation

Alex M. Tseng , Tommaso Biancalani , Max Shen , Gabriele Scalia

Category: Machine Learning | Artificial Intelligence

2022-12-21

Diffusion models have achieved justifiable popularity by attaining state-of-the-art performance in generating realistic objects from seemingly arbitrarily complex data distributions, including when conditioning generation on labels. Unfortunately, however, their iterative nature renders them very computationally inefficient during the sampling process. For the multi-class conditional generation problem, we propose a novel, structurally unique framework of diffusion models which are hierarchically branched according to the inherent relationships between classes. In this work, we demonstrate that branched diffusion models offer major improvements in efficiently generating samples from multiple classes. We also showcase several other advantages of branched diffusion models, including ease of extension to novel classes in a continual-learning setting, and a unique interpretability that offers insight into these generative models. Branched diffusion models represent an alternative paradigm to their traditional linear counterparts, and can have large impacts in how we use diffusion models for efficient generation, online learning, and scientific discovery.
