We consider the problem of building a visual anomaly detection system for a mobile robot. Standard anomaly detection models are trained on large datasets composed only of non-anomalous data. However, in robotics applications, it is often the case that (potentially very few) examples of anomalies are available. We tackle the problem of exploiting these few examples to improve the performance of a Real-NVP anomaly detection model, by minimizing an auxiliary outlier exposure loss jointly with the Real-NVP loss. We perform quantitative experiments on a novel dataset (provided as supplementary material) designed for anomaly detection in an indoor patrolling scenario. On a disjoint test set, our approach outperforms the alternatives and shows that even a handful of anomalous frames yields significant performance improvements.
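Below is a minimal sketch, under our own assumptions, of how a normalizing-flow negative log-likelihood could be combined with an auxiliary outlier-exposure term when a few anomalous frames are available; the margin, weighting, and hinge form are illustrative choices, not the authors' exact loss.

```python
# Minimal sketch (assumptions, not the authors' code): joint training objective
# combining a flow's negative log-likelihood on normal frames with an
# outlier-exposure hinge on the few available anomalous frames.
import torch

def joint_anomaly_loss(logp_normal: torch.Tensor,
                       logp_anomalous: torch.Tensor,
                       margin: float = 0.0,
                       weight: float = 1.0) -> torch.Tensor:
    """logp_* are per-sample log-likelihoods produced by a flow such as Real-NVP."""
    nll = -logp_normal.mean()                       # standard flow objective
    # Penalize anomalous frames whose likelihood exceeds the margin.
    outlier_exposure = torch.relu(logp_anomalous - margin).mean()
    return nll + weight * outlier_exposure

# Toy usage with dummy log-likelihoods standing in for flow outputs.
loss = joint_anomaly_loss(torch.randn(32) + 5.0, torch.randn(4) + 4.0)
print(float(loss))
```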
We consider the task of detecting anomalies for autonomous mobile robots based on vision. We categorize relevant types of visual anomalies and discuss how they can be detected by unsupervised deep learning methods. We propose a novel dataset built specifically for this task, on which we test a state-of-the-art approach. We finally discuss deployment in real scenarios.
Recent advances in machine learning have created powerful new tools for solving problems in the life sciences. The aim of this paper is to discuss the potential advantages of gene recommendation performed by artificial intelligence (AI). Indeed, a gene recommendation engine tries to solve the following problem: if a user is interested in a set of genes, which other genes are likely to be related to the starting set and should be investigated? This task is addressed by a custom deep learning recommendation engine, DeepProphet2 (DP2), which is freely available to researchers worldwide at www.generecommender.com. Hereafter, the insights behind the algorithm and its practical applications are illustrated. The gene recommendation problem can be tackled by mapping genes into a metric space in which a distance can be defined to represent the real semantic distance between them. To this end, a transformer-based model was trained on a freely curated corpus of papers from PubMed. The paper describes the multiple optimization procedures used to obtain the best bias-variance trade-off, focusing on embedding size and network depth. In this context, the model's ability to discover gene sets related to diseases and pathways was assessed through cross-validation. A simple hypothesis guided the procedure: the network had no direct knowledge of pathways and diseases, but learned genes' similarities and the interactions among them. Moreover, to further investigate the space in which the neural network represents genes, the dimensionality of the embeddings was reduced and the results were projected onto a human-interpretable space. In conclusion, a set of use cases illustrates the algorithm's potential applications in real-world settings.
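As an illustration of the metric-space view described above, the hedged sketch below recommends genes by nearest-neighbour search over unit-normalized embedding vectors using cosine similarity; the gene symbols, embedding size, and averaging of the query set are assumptions for the example, not details of DeepProphet2.

```python
# Illustrative sketch (not the DeepProphet2 implementation): recommend genes by
# nearest-neighbour search in an embedding space, with cosine similarity
# standing in for the learned semantic distance.
import numpy as np

def recommend(query_genes, embeddings, top_k=10):
    """embeddings: dict mapping gene symbol -> unit-normalized vector."""
    query = np.mean([embeddings[g] for g in query_genes], axis=0)
    query /= np.linalg.norm(query)
    scores = {g: float(v @ query)                   # cosine similarity
              for g, v in embeddings.items() if g not in query_genes}
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

# Toy usage: random vectors stand in for transformer-derived gene embeddings.
rng = np.random.default_rng(0)
emb = {}
for g in ["TP53", "BRCA1", "EGFR", "MYC", "KRAS"]:
    v = rng.normal(size=64)
    emb[g] = v / np.linalg.norm(v)
print(recommend(["TP53", "BRCA1"], emb, top_k=3))
```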
Language models demonstrate both quantitative improvements and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-future capabilities and limitations of language models. To address this challenge, we introduce the Beyond the Imitation Game benchmark (BIG-bench). BIG-bench currently consists of 204 tasks, contributed by 442 authors across 132 institutions. Task topics are diverse, drawing from linguistics, childhood development, math, common-sense reasoning, biology, physics, social bias, software development, and beyond. BIG-bench focuses on tasks that are believed to be beyond the capabilities of current language models. We evaluate the behavior of OpenAI's GPT models, Google-internal dense transformer architectures, and Switch-style sparse transformers on BIG-bench, across model sizes spanning millions to billions of parameters. In addition, a team of human expert raters performed all tasks in order to provide a strong baseline. Findings include: model performance and calibration both improve with scale, but are poor in absolute terms (and when compared with rater performance); performance is remarkably similar across model classes, though with benefits from sparsity; tasks that improve gradually and predictably commonly involve a large knowledge or memorization component, whereas tasks that exhibit "breakthrough" behavior at a critical scale often involve multiple steps or components, or brittle metrics; and social bias typically increases with scale in settings with ambiguous context, but this can be improved with prompting.
We consider the problem of detecting, in the visual sensing data streams of an autonomous mobile robot, semantic patterns that are unusual (i.e., anomalous) with respect to the robot's previous experience in similar environments. These anomalies might indicate unforeseen hazards and, in scenarios where failure is costly, can be used to trigger avoidance behaviors. We contribute three novel image-based datasets acquired in robot exploration scenarios, comprising more than 200k labeled frames and spanning various types of anomalies. On these datasets, we study the performance of anomaly detection methods based on autoencoders operating at different scales.
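A minimal sketch of one plausible realization follows, assuming a standard convolutional autoencoder whose per-frame reconstruction error is used as the anomaly score; the depth, channel counts, and the use of downsampling steps as a proxy for "scale" are our assumptions, not the paper's exact models.

```python
# Minimal sketch (our assumptions, not the paper's models): a convolutional
# autoencoder whose per-frame mean squared reconstruction error is the anomaly
# score; the number of downsampling steps stands in here for "scale".
import torch
import torch.nn as nn

class ConvAE(nn.Module):
    def __init__(self, downsampling_steps: int = 3, base_channels: int = 16):
        super().__init__()
        enc, dec, c_in = [], [], 3
        for i in range(downsampling_steps):
            c_out = base_channels * 2 ** i
            enc += [nn.Conv2d(c_in, c_out, 4, stride=2, padding=1), nn.ReLU()]
            dec = [nn.ConvTranspose2d(c_out, c_in, 4, stride=2, padding=1),
                   nn.ReLU() if i > 0 else nn.Sigmoid()] + dec
            c_in = c_out
        self.encoder, self.decoder = nn.Sequential(*enc), nn.Sequential(*dec)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.decoder(self.encoder(x))

def anomaly_score(model: ConvAE, frames: torch.Tensor) -> torch.Tensor:
    """Mean squared reconstruction error per frame (higher means more anomalous)."""
    with torch.no_grad():
        recon = model(frames)
    return ((frames - recon) ** 2).flatten(1).mean(dim=1)

print(anomaly_score(ConvAE(), torch.rand(4, 3, 64, 64)))   # toy frames
```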
This paper presents a sampling-based motion planner that integrates RRT* (Rapidly-exploring Random Tree star) with a database of pre-computed motion primitives, so as to alleviate its computational load and allow for motion planning in dynamic or partially known environments. The database is built by considering a set of initial and final state pairs in some gridded space and determining, for each pair, an optimal trajectory that is compatible with the system dynamics and constraints while minimizing a cost. Nodes are progressively added to the RRT* tree of feasible trajectories by extracting samples in the gridded state space and selecting the best obstacle-free motion primitive in the database that connects them to the existing nodes. The tree is rewired whenever some node can be reached from the newly sampled state at a lower cost through an obstacle-free motion primitive. The computationally more intensive part of motion planning is thus moved to the preliminary offline phase of database construction (at the cost of some performance degradation caused by gridding). The grid resolution can be tuned so as to trade off between the optimality and the size of the database. The planner is shown to be asymptotically optimal as the grid resolution goes to zero and the number of sampled states grows to infinity.
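The schematic sketch below illustrates the main loop of such a planner under simplifying assumptions of our own: connections are restricted to primitives looked up from a database keyed by (start cell, end cell), and cost propagation to descendants after rewiring is omitted; this is not the paper's implementation.

```python
# Schematic sketch (our assumptions, not the paper's planner): RRT*-style loop
# where node connections are restricted to motion primitives looked up from a
# precomputed database keyed by (start cell, end cell).
import math

def plan(start, goal, database, sample_cell, collision_free, iterations=2000):
    """database[(c1, c2)] -> (trajectory, cost), or the key is missing; cells are grid states."""
    nodes = {start: 0.0}                    # cell -> cost-to-come
    parent = {start: None}
    for _ in range(iterations):
        s = sample_cell()
        # Cheapest collision-free primitive reaching the sampled cell.
        best, best_cost = None, math.inf
        for n, c in nodes.items():
            prim = database.get((n, s))
            if prim and collision_free(prim[0]) and c + prim[1] < best_cost:
                best, best_cost = n, c + prim[1]
        if best is None or (s in nodes and best_cost >= nodes[s]):
            continue
        nodes[s], parent[s] = best_cost, best
        # Rewire: existing nodes that become cheaper when reached via s.
        # (Cost propagation to their descendants is omitted for brevity.)
        for n in list(nodes):
            prim = database.get((s, n))
            if prim and collision_free(prim[0]) and best_cost + prim[1] < nodes[n]:
                nodes[n], parent[n] = best_cost + prim[1], s
    return nodes.get(goal), parent
```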
Many challenging reinforcement learning (RL) problems require designing a distribution of tasks that can be applied to train effective policies. This distribution of tasks can be specified by a curriculum. A curriculum is meant to improve the results of learning and accelerate it. We introduce Success Induced Task Prioritization (SITP), a framework for automatic curriculum learning, where a task sequence is created based on the success rate of each task. In this setting, each task is an algorithmically created environment instance with a unique configuration. The algorithm selects the order of tasks that provides the fastest learning for agents. The probability of selecting any of the tasks for the next stage of learning is determined by evaluating its performance score in previous stages. Experiments were carried out in the Partially Observable Grid Environment for Multiple Agents (POGEMA) and the Procgen benchmark. We demonstrate that SITP matches or surpasses the results of other curriculum design methods. Our method can be implemented with a handful of minor modifications to any standard RL framework and provides useful prioritization with minimal computational overhead.
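The sketch below shows, under assumptions of our own, one way a success-rate-driven prioritizer could be wired up: tasks whose success rate changed most in the last stage are sampled more often. The scoring rule and softmax temperature are illustrative, not the exact SITP formulation.

```python
# Sketch under our own assumptions (not the exact SITP rule): prioritize tasks
# whose success rate changed the most in the previous stage, i.e. where the
# agent currently appears to be learning fastest.
import numpy as np

class TaskPrioritizer:
    def __init__(self, num_tasks: int, temperature: float = 0.1):
        self.success = np.zeros(num_tasks)     # latest success rate per task
        self.prev = np.zeros(num_tasks)        # success rate from the stage before
        self.temperature = temperature

    def update(self, task_id: int, success_rate: float) -> None:
        self.prev[task_id] = self.success[task_id]
        self.success[task_id] = success_rate

    def sample_task(self, rng: np.random.Generator) -> int:
        score = np.abs(self.success - self.prev)          # recent progress signal
        probs = np.exp(score / self.temperature)
        probs /= probs.sum()
        return int(rng.choice(len(probs), p=probs))

# Toy usage: task 2 improved in the last stage, so it is sampled more often.
prioritizer = TaskPrioritizer(num_tasks=5)
prioritizer.update(2, 0.4)
print(prioritizer.sample_task(np.random.default_rng(0)))
```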
This paper presents a solution to the GenChal 2022 shared task dedicated to feedback comment generation for writing learning. In this task, given a text with an error and the span of the error, a system generates an explanatory note that helps the writer (a language learner) improve their writing skills. Our solution is based on fine-tuning the T5 model on the initial dataset, augmented according to the syntactic dependencies of the words located within the indicated error span. The solution of our team "nigula" obtained second place according to manual evaluation by the organizers.
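A hedged sketch follows of how an input with a marked error span might be formatted and passed to a T5 model via the Hugging Face transformers API; the `<err>`/`</err>` markers, the prompt prefix, and the `t5-base` checkpoint name are placeholders for illustration, not the team's exact setup.

```python
# Hedged sketch: formatting an input with a marked error span and generating a
# feedback comment with T5 via Hugging Face transformers. The markers, prompt
# prefix, and checkpoint name are placeholders; a fine-tuned model would be
# used in practice, with <err>/</err> added as special tokens.
from transformers import T5ForConditionalGeneration, T5Tokenizer

def build_input(text: str, start: int, end: int) -> str:
    # Wrap the erroneous span in explicit markers so the model can attend to it.
    return ("generate feedback: "
            + text[:start] + "<err> " + text[start:end] + " </err>" + text[end:])

model_name = "t5-base"                       # placeholder checkpoint
tokenizer = T5Tokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)

source = build_input("He go to school every day.", 3, 5)   # span covers "go"
inputs = tokenizer(source, return_tensors="pt")
output_ids = model.generate(**inputs, max_length=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```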
The task of reconstructing 3D human motion has wide-ranging applications. The gold-standard motion capture (MoCap) systems are accurate but inaccessible to the general public due to their cost, hardware and space constraints. In contrast, monocular human mesh recovery (HMR) methods are much more accessible than MoCap as they take single-view videos as inputs. Replacing the multi-view MoCap systems with a monocular HMR method would break the current barriers to collecting accurate 3D motion, thus making exciting applications like motion analysis and motion-driven animation accessible to the general public. However, the performance of existing HMR methods degrades when the video contains challenging and dynamic motion that is not in the existing MoCap datasets used for training. This reduces their appeal, as dynamic motion is frequently the target of 3D motion recovery in the aforementioned applications. Our study aims to bridge the gap between monocular HMR and multi-view MoCap systems by leveraging information shared across multiple video instances of the same action. We introduce the Neural Motion (NeMo) field. It is optimized to represent the underlying 3D motions across a set of videos of the same action. Empirically, we show that NeMo can recover 3D motion in sports using videos from the Penn Action dataset, where NeMo outperforms existing HMR methods in terms of 2D keypoint detection. To further validate NeMo using 3D metrics, we collected a small MoCap dataset mimicking actions in Penn Action, and show that NeMo achieves better 3D reconstruction compared to various baselines.
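A conceptual sketch, heavy on assumptions, of a shared motion field: a small MLP maps a normalized time phase plus a per-video latent code to pose parameters and would be fit jointly to all videos of an action via a 2D reprojection loss. This simplification is not the NeMo architecture itself.

```python
# Conceptual sketch, heavy on assumptions (not the NeMo architecture): an MLP
# maps a normalized time phase and a per-video latent code to pose parameters,
# and would be fit jointly to all videos of the same action.
import torch
import torch.nn as nn

class MotionField(nn.Module):
    def __init__(self, num_videos: int, latent_dim: int = 32, pose_dim: int = 72):
        super().__init__()
        self.codes = nn.Embedding(num_videos, latent_dim)   # one code per video
        self.mlp = nn.Sequential(
            nn.Linear(1 + latent_dim, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, pose_dim))

    def forward(self, t: torch.Tensor, video_ids: torch.Tensor) -> torch.Tensor:
        """t: (B, 1) phase in [0, 1]; returns per-frame pose parameters."""
        return self.mlp(torch.cat([t, self.codes(video_ids)], dim=-1))

field = MotionField(num_videos=8)
poses = field(torch.rand(4, 1), torch.tensor([0, 1, 2, 3]))
# Training would render/project these poses and compare against per-frame
# 2D keypoint detections from each video (a reprojection loss).
```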
Model calibration, which is concerned with how frequently the model predicts correctly, not only plays a vital part in statistical model design, but also has substantial practical applications, such as optimal decision-making in the real world. However, it has been discovered that modern deep neural networks are generally poorly calibrated due to the overestimation (or underestimation) of predictive confidence, which is closely related to overfitting. In this paper, we propose Annealing Double-Head, a simple-to-implement but highly effective architecture for calibrating the DNN during training. To be precise, we construct an additional calibration head (a shallow neural network that typically has one latent layer) on top of the last latent layer in the normal model to map the logits to the aligned confidence. Furthermore, a simple annealing technique that dynamically scales the logits by the calibration head during training is developed to improve its performance. Under both in-distribution and distributional-shift circumstances, we exhaustively evaluate our Annealing Double-Head architecture on multiple pairs of contemporary DNN architectures and vision and speech datasets. We demonstrate that our method achieves state-of-the-art model calibration performance without post-processing, while providing predictive accuracy comparable to other recently proposed calibration methods on a range of learning tasks.
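A minimal sketch, under our own assumptions, of a double-head arrangement: a shallow calibration head with one hidden layer sits on top of the backbone's last latent layer and produces a per-sample temperature that rescales the classification logits, blended in by an annealing coefficient. The exact scaling rule and schedule here are illustrative, not the paper's.

```python
# Minimal sketch (our assumptions, not the paper's exact rule): a shallow
# calibration head on top of the backbone's last latent layer predicts a
# per-sample temperature that rescales the logits, blended in by `anneal`.
import torch
import torch.nn as nn

class DoubleHead(nn.Module):
    def __init__(self, backbone: nn.Module, feat_dim: int, num_classes: int):
        super().__init__()
        self.backbone = backbone                          # outputs latent features
        self.cls_head = nn.Linear(feat_dim, num_classes)
        self.calib_head = nn.Sequential(                  # one hidden layer
            nn.Linear(feat_dim, 64), nn.ReLU(), nn.Linear(64, 1), nn.Softplus())

    def forward(self, x: torch.Tensor, anneal: float = 1.0) -> torch.Tensor:
        feats = self.backbone(x)
        logits = self.cls_head(feats)
        scale = self.calib_head(feats)                    # per-sample temperature >= 0
        # anneal=0 leaves the logits untouched; anneal=1 applies the full rescaling.
        return logits / (1.0 + anneal * scale)

# Toy usage with a trivial backbone standing in for a CNN's penultimate layer.
backbone = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 128), nn.ReLU())
model = DoubleHead(backbone, feat_dim=128, num_classes=10)
calibrated_logits = model(torch.rand(2, 3, 32, 32), anneal=0.5)
```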