Smart Paper Notes

Towards Ground Truth for Single Image Deraining

Yunhao Ba, Howard Zhang, Ethan Yang, Akira Suzuki, Arnold Pfahnl, Chethan Chinder Chandrappa, Celso de Melo, Suya You, Stefano Soatto, Alex Wong

Categories: Computer Vision

2022-06-22

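We propose a large-scale dataset of real-world rainy and clean image pairs, together with a method to remove rain-induced degradations from the image. Because no real-world dataset exists for deraining, current state-of-the-art methods rely on synthetic data and are therefore limited by the sim2real domain gap. Moreover, rigorous evaluation remains a challenge in the absence of a real paired dataset. We fill this gap by collecting the first real paired deraining dataset through meticulous control of non-rain variations. Our dataset enables paired training and quantitative evaluation for diverse real-world rain phenomena (e.g. rain streaks and rain accumulation). To learn a representation invariant to rain phenomena, we propose a deep neural network that reconstructs the underlying scene by minimizing a rain-invariant loss between rainy and clean images. Extensive experiments demonstrate that the proposed dataset benefits existing derainers, and that our model can outperform state-of-the-art deraining methods on real rainy images under various conditions.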

Translated by Google Translate

Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso

Categories: Natural Language Processing | Artificial Intelligence | Machine Learning | Machine Learning (Statistics)

2022-06-09

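Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. To inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-future capabilities and limitations of language models. To address this challenge, we introduce the Beyond the Imitation Game benchmark (BIG-bench). BIG-bench currently consists of 204 tasks, contributed by 442 authors across 132 institutions. Task topics are diverse, drawing problems from linguistics, childhood development, math, common-sense reasoning, biology, physics, social bias, software development, and beyond. BIG-bench focuses on tasks that are believed to be beyond the capabilities of current language models. We evaluate the behavior of OpenAI's GPT models, Google-internal dense transformer architectures, and Switch-style sparse transformers on BIG-bench, across model sizes spanning millions to hundreds of billions of parameters. In addition, a team of human expert raters performed all tasks in order to provide a strong baseline. Findings include: model performance and calibration both improve with scale, but are poor in absolute terms (and when compared with rater performance); performance is remarkably similar across model classes, though with benefits from sparsity; tasks that improve gradually and predictably commonly involve a large knowledge or memorization component, whereas tasks that exhibit "breakthrough" behavior at a critical scale often involve multiple steps or components, or brittle metrics; social bias typically increases with scale in settings with ambiguous context, but this can be improved with prompting.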

Evaluating Multimodal Interactive Agents

Josh Abramson, Arun Ahuja, Federico Carnevale, Petko Georgiev, Alex Goldin, Alden Hung, Jessica Landon, Timothy Lillicrap, Alistair Muldal, Blake Richards

Categories: Machine Learning | Artificial Intelligence

2022-05-26

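Creating agents that can interact naturally with humans is a common goal in artificial intelligence (AI) research. However, evaluating these interactions is challenging: collecting online human-agent interactions is slow and expensive, yet faster proxy metrics often do not correlate well with interactive evaluation. In this paper, we assess the merits of these existing evaluation metrics and present a novel approach to evaluation called the Standardised Test Suite (STS). The STS uses behavioural scenarios mined from real human interaction data. Agents see replayed scenario context, receive an instruction, and are then given control to complete the interaction offline. These agent continuations are recorded and sent to human annotators to mark as success or failure, and agents are ranked according to the proportion of continuations in which they succeed. The resulting STS is fast, controlled, interpretable, and representative of naturalistic interactions. Altogether, the STS consolidates much of what is desirable across many of our standard evaluation metrics, allowing us to accelerate research progress towards producing agents that can interact naturally with humans. A video can be found at https://youtu.be/yr1tnggorgq.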
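
The ranking step described in this abstract (agents ordered by the proportion of continuations that annotators mark as successful) can be illustrated with a minimal sketch. The tuple layout, the agent names, and the alphabetical tie-break are assumptions made for this example and are not taken from the paper.

```python
from collections import defaultdict


def rank_agents(annotations):
    """Rank agents by the fraction of continuations annotated as successful.

    `annotations` is an iterable of (agent_name, scenario_id, success) tuples,
    where `success` is the human annotator's binary label (hypothetical layout).
    """
    totals = defaultdict(int)
    wins = defaultdict(int)
    for agent, _scenario, success in annotations:
        totals[agent] += 1
        wins[agent] += int(bool(success))
    rates = {agent: wins[agent] / totals[agent] for agent in totals}
    # Highest success rate first; ties broken alphabetically for determinism.
    return sorted(rates.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical labels for two agents on the same three scenarios.
labels = [
    ("agent_a", "s1", True), ("agent_a", "s2", False), ("agent_a", "s3", True),
    ("agent_b", "s1", False), ("agent_b", "s2", False), ("agent_b", "s3", True),
]
ranking = rank_agents(labels)
print(ranking)
```

Because every agent is scored on the same replayed scenarios, the success rates are directly comparable across agents.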

Stereoscopic Universal Perturbations across Different Architectures and Datasets

Zachary Berger, Parth Agrawal, Tian Yu Liu, Stefano Soatto, Alex Wong

Categories: Computer Vision | Artificial Intelligence | Machine Learning

2021-12-12

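We study the effect of adversarial perturbations of images on deep stereo matching networks for the disparity estimation task. We present a method to craft a single set of perturbations that, when added to any stereo image pair in a dataset, can fool a stereo network into significantly altering the perceived scene geometry. Our perturbation images are "universal" in that they not only corrupt the estimates of the network on the dataset they are optimized for, but also generalize to stereo networks with different architectures trained on different datasets. We evaluate our approach on multiple public benchmark datasets and show that our perturbations can increase the D1-error (akin to fooling rate) of state-of-the-art stereo networks from 1% to as much as 87%. We investigate the effect of perturbations on the estimated scene geometry and identify the object classes that are most vulnerable. Our analysis of the activations of registered points between left and right images led us to find that certain architectural components, namely deformable convolution and explicit matching, can increase robustness against adversaries. We demonstrate that by simply designing networks with such components, one can reduce the effect of adversaries by up to 60.5%, which rivals the robustness of networks fine-tuned with costly adversarial data augmentation.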
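
For readers unfamiliar with the D1-error quoted in this abstract, the sketch below computes it under the assumption that the widely used KITTI 2015 convention applies: a pixel counts as an outlier when its disparity error exceeds both 3 pixels and 5% of the ground-truth disparity. The function name, the thresholds as parameters, and the flat-list input are choices made for this illustration, not details from the abstract.

```python
def d1_error(pred, gt, abs_thresh=3.0, rel_thresh=0.05):
    """Percentage of valid pixels whose predicted disparity is an outlier.

    Assumed convention (KITTI 2015 style): a pixel is an outlier when its
    absolute error exceeds `abs_thresh` pixels AND `rel_thresh` times the
    ground-truth disparity. Pixels with ground truth <= 0 are skipped as invalid.
    """
    outliers = 0
    valid = 0
    for p, g in zip(pred, gt):
        if g <= 0:
            continue
        valid += 1
        err = abs(p - g)
        if err > abs_thresh and err > rel_thresh * g:
            outliers += 1
    return 100.0 * outliers / valid if valid else 0.0


# Hypothetical flattened disparity maps (in pixels); the last pixel has no ground truth.
gt = [10.0, 20.0, 100.0, 0.0]
pred = [10.5, 26.0, 104.0, 50.0]
score = d1_error(pred, gt)
print(round(score, 2))
```

In this toy input only the second pixel is an outlier: the third has an error of 4 pixels, which is above the absolute threshold but below 5% of its ground-truth disparity.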

Creating Multimodal Interactive Agents with Imitation and Self-Supervised Learning

DeepMind Interactive Agents Team, Josh Abramson, Arun Ahuja, Arthur Brussee, Federico Carnevale, Mary Cassin, Felix Fischer, Petko Georgiev, Alex Goldin, Tim Harley

Categories: Machine Learning

2021-12-07

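A common vision from science fiction is that robots will one day inhabit our physical spaces, sense the world as we do, assist our physical labours, and communicate with us through natural language. Here we study how to design artificial agents that can interact naturally with humans, using the simplification of a virtual environment. We show that imitation learning of human-human interactions in a simulated world, in conjunction with self-supervised learning, is sufficient to produce a multimodal interactive agent, which we call MIA, that successfully interacts with non-adversarial humans 75% of the time. We further identify architectural and algorithmic techniques that improve performance, such as hierarchical action selection. Altogether, our results demonstrate that imitation of multimodal, real-time human behaviour may provide a straightforward and surprisingly effective means of imbuing agents with a rich behavioural prior, from which agents might then be fine-tuned for specific purposes, thus laying a foundation for training capable agents for interactive robots or digital assistants. A video of MIA's behaviour can be found at https://youtu.be/zfgrif7my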

Causal Deep Learning: Causal Capsules and Tensor Transformers

M. Alex O. Vasilescu

Categories: Machine Learning | Computer Vision

2023-01-01

We derive a set of causal deep neural networks whose architectures are a consequence of tensor (multilinear) factor analysis. Forward causal questions are addressed with a neural network architecture composed of causal capsules and a tensor transformer. The former estimate a set of latent variables that represent the causal factors, and the latter governs their interaction. Causal capsules and tensor transformers may be implemented using shallow autoencoders, but for a scalable architecture we employ block algebra and derive a deep neural network composed of a hierarchy of autoencoders. An interleaved kernel hierarchy preprocesses the data resulting in a hierarchy of kernel tensor factor models. Inverse causal questions are addressed with a neural network that implements multilinear projection and estimates the causes of effects. As an alternative to aggressive bottleneck dimension reduction or regularized regression that may camouflage an inherently underdetermined inverse problem, we prescribe modeling different aspects of the mechanism of data formation with piecewise tensor models whose multilinear projections are well-defined and produce multiple candidate solutions. Our forward and inverse neural network architectures are suitable for asynchronous parallel computation.

Skeletal Video Anomaly Detection using Deep Learning: Survey, Challenges and Future Directions

Pratik K. Mishra, Alex Mihailidis, Shehroz S. Khan

Categories: Computer Vision

2022-12-31

The existing methods for video anomaly detection mostly utilize videos containing identifiable facial and appearance-based features. The use of videos with identifiable faces raises privacy concerns, especially when used in a hospital or community-based setting. Appearance-based features can also be sensitive to pixel-based noise, straining the anomaly detection methods to model the changes in the background and making it difficult to focus on the actions of humans in the foreground. Structural information in the form of skeletons describing the human motion in the videos is privacy-protecting and can overcome some of the problems posed by appearance-based features. In this paper, we present a survey of privacy-protecting deep learning anomaly detection methods using skeletons extracted from videos. We present a novel taxonomy of algorithms based on the various learning approaches. We conclude that skeleton-based approaches for anomaly detection can be a plausible privacy-protecting alternative for video anomaly detection. Lastly, we identify major open research questions and provide guidelines to address them.

Robust machine learning pipelines for trading market-neutral stock portfolios

Thomas Wong, Mauricio Barahona

Categories: Machine Learning

2022-12-30

The application of deep learning algorithms to financial data is difficult due to heavy non-stationarities which can lead to over-fitted models that underperform under regime changes. Using the Numerai tournament data set as a motivating example, we propose a machine learning pipeline for trading market-neutral stock portfolios based on tabular data which is robust under changes in market conditions. We evaluate various machine-learning models, including Gradient Boosting Decision Trees (GBDTs) and Neural Networks with and without simple feature engineering, as the building blocks for the pipeline. We find that GBDT models with dropout display high performance, robustness and generalisability with relatively low complexity and reduced computational cost. We then show that online learning techniques can be used in post-prediction processing to enhance the results. In particular, dynamic feature neutralisation, an efficient procedure that requires no retraining of models and can be applied post-prediction to any machine learning model, improves robustness by reducing drawdown in volatile market conditions. Furthermore, we demonstrate that the creation of model ensembles through dynamic model selection based on recent model performance leads to improved performance over baseline by improving the Sharpe and Calmar ratios. We also evaluate the robustness of our pipeline across different data splits and random seeds with good reproducibility of results.
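
To make the post-prediction step in this abstract more concrete, here is a minimal sketch of plain feature neutralisation against a single feature: the predictions are regressed on the feature and a chosen proportion of the fitted linear component is subtracted, with no model retraining. The abstract does not specify how the authors' dynamic variant works, so the static single-feature form, the function names, and the `proportion` parameter are all assumptions of this example.

```python
from statistics import fmean


def neutralise(predictions, feature, proportion=1.0):
    """Subtract a fraction of the linear exposure of predictions to one feature."""
    p_mean = fmean(predictions)
    f_mean = fmean(feature)
    centred = [f - f_mean for f in feature]
    var_f = sum(f * f for f in centred)
    if var_f == 0:
        return list(predictions)  # constant feature: nothing to remove
    cov = sum((p - p_mean) * f for p, f in zip(predictions, centred))
    beta = cov / var_f  # least-squares slope of predictions on the feature
    return [p - proportion * beta * f for p, f in zip(predictions, centred)]


def exposure(predictions, feature):
    """Population covariance between predictions and the feature."""
    p_mean, f_mean = fmean(predictions), fmean(feature)
    n = len(predictions)
    return sum((p - p_mean) * (f - f_mean) for p, f in zip(predictions, feature)) / n


# Hypothetical model outputs and one feature column for four stocks.
feature = [0.1, 0.4, 0.5, 0.9]
preds = [0.2, 0.5, 0.4, 0.8]
neutral = neutralise(preds, feature, proportion=1.0)
print(exposure(preds, feature), exposure(neutral, feature))
```

With `proportion=1.0` the remaining covariance with the feature is zero up to floating-point error; smaller values remove only part of the exposure.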

Representation Learning in Deep RL via Discrete Information Bottleneck

Riashat Islam, Hongyu Zang, Manan Tomar, Aniket Didolkar, Md Mofijul Islam, Samin Yeasar Arnob, Tariq Iqbal, Xin Li, Anirudh Goyal, Nicolas Heess

Categories: Machine Learning

2022-12-28

Several self-supervised representation learning methods have been proposed for reinforcement learning (RL) with rich observations. For real-world applications of RL, recovering underlying latent states is crucial, particularly when sensory inputs contain irrelevant and exogenous information. In this work, we study how information bottlenecks can be used to construct latent states efficiently in the presence of task-irrelevant information. We propose architectures that utilize variational and discrete information bottlenecks, coined as RepDIB, to learn structured factorized representations. Exploiting the expressiveness bought by factorized representations, we introduce a simple, yet effective, bottleneck that can be integrated with any existing self-supervised objective for RL. We demonstrate this across several online and offline RL benchmarks, along with a real robot arm task, where we find that compressed representations with RepDIB can lead to strong performance improvements, as the learned bottlenecks help predict only the relevant state while ignoring irrelevant information.
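
One common way to realise a discrete bottleneck over a factorized representation is nearest-neighbour vector quantisation, with each factor (a contiguous chunk of the embedding) snapped to the closest entry of its own codebook. The sketch below shows only that lookup step. Whether RepDIB uses exactly this scheme is not stated in the abstract, so the function, the codebooks, and the chunking are illustrative assumptions.

```python
def quantise(embedding, codebooks):
    """Map each contiguous chunk of `embedding` to its nearest code vector.

    `codebooks[i]` is the list of code vectors for factor i; all codes of a
    factor share that chunk's dimensionality. Returns (indices, quantised).
    """
    indices, quantised = [], []
    start = 0
    for codes in codebooks:
        dim = len(codes[0])
        chunk = embedding[start:start + dim]
        # Squared Euclidean distance from the chunk to every code of this factor.
        dists = [sum((c - x) ** 2 for c, x in zip(code, chunk)) for code in codes]
        best = min(range(len(codes)), key=dists.__getitem__)
        indices.append(best)
        quantised.extend(codes[best])
        start += dim
    return indices, quantised


# Hypothetical codebooks: two factors of dimension 2 with 2 and 3 codes.
codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],
    [[-1.0, 0.0], [0.0, 1.0], [1.0, 0.0]],
]
indices, quantised = quantise([0.9, 0.8, 0.1, 0.7], codebooks)
print(indices, quantised)
```

The discrete code is the tuple of indices, so the downstream model only sees one of 2 x 3 = 6 possible latent states in this toy setup, which is what limits how much task-irrelevant detail can pass through.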

AER: Auto-Encoder with Regression for Time Series Anomaly Detection

Lawrence Wong, Dongyu Liu, Laure Berti-Equille, Sarah Alnegheimish, Kalyan Veeramachaneni

Categories: Machine Learning | Machine Learning (Statistics)

2022-12-27

Anomaly detection on time series data is increasingly common across various industrial domains that monitor metrics in order to prevent potential accidents and economic losses. However, a scarcity of labeled data and ambiguous definitions of anomalies can complicate these efforts. Recent unsupervised machine learning methods have made remarkable progress in tackling this problem using either single-timestamp predictions or time series reconstructions. While traditionally considered separately, these methods are not mutually exclusive and can offer complementary perspectives on anomaly detection. This paper first highlights the successes and limitations of prediction-based and reconstruction-based methods with visualized time series signals and anomaly scores. We then propose AER (Auto-encoder with Regression), a joint model that combines a vanilla auto-encoder and an LSTM regressor to incorporate the successes and address the limitations of each method. Our model can produce bi-directional predictions while simultaneously reconstructing the original time series by optimizing a joint objective function. Furthermore, we propose several ways of combining the prediction and reconstruction errors through a series of ablation studies. Finally, we compare the performance of the AER architecture against two prediction-based methods and three reconstruction-based methods on 12 well-known univariate time series datasets from NASA, Yahoo, Numenta, and UCR. The results show that AER has the highest averaged F1 score across all datasets (a 23.5% improvement compared to ARIMA) while retaining a runtime similar to its vanilla auto-encoder and regressor components. Our model is available in Orion, an open-source benchmarking tool for time series anomaly detection.
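
The abstract mentions combining prediction and reconstruction errors into one anomaly score without listing the exact formulas. As a rough sketch of how such a combination might look, the code below min-max normalises both error series and merges them either by a convex combination or by a pointwise product. The function names, the `weight` parameter, and the two modes are assumptions for illustration and should not be read as the paper's definitions.

```python
def min_max(values):
    """Scale values to [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def combine_scores(pred_err, rec_err, weight=0.5, mode="sum"):
    """Merge per-timestamp prediction and reconstruction errors into one score."""
    p, r = min_max(pred_err), min_max(rec_err)
    if mode == "sum":
        # Convex combination: `weight` on prediction error, the rest on reconstruction.
        return [weight * a + (1.0 - weight) * b for a, b in zip(p, r)]
    if mode == "product":
        # Pointwise product: high only where both errors are high.
        return [a * b for a, b in zip(p, r)]
    raise ValueError(f"unknown mode: {mode!r}")


# Hypothetical error series with a shared spike at index 2.
pred_err = [0.1, 0.2, 0.9, 0.1]
rec_err = [1.0, 1.5, 5.0, 1.0]
scores = combine_scores(pred_err, rec_err)
print(scores.index(max(scores)))
```

Normalising first keeps one error type from dominating purely because of its scale; the product mode is the stricter choice because it flags a timestamp only when both views agree.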
