Smart Paper Notes

Arena-Bench: A Benchmarking Suite for Obstacle Avoidance Approaches in Highly Dynamic Environments

Linh Kästner, Teham Bhuiyan, Tuan Anh Le, Elias Treis, Johannes Cox, Boris Meinardus, Jacek Kmiecik, Reyk Carstens, Duc Pichel, Bassel Fatloun

Category: Robotics

2022-06-12

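The ability to navigate autonomously and safely, especially in dynamic environments, is essential for mobile robots. In recent years, DRL approaches have shown superior performance in dynamic obstacle avoidance. However, these learning-based approaches are usually developed in specially designed simulation environments and are hard to test against conventional planning approaches. Furthermore, the integration and deployment of these approaches on real robotic platforms is not yet fully solved. In this paper, we present Arena-Bench, a benchmark suite for training, testing, and evaluating navigation planners on different robotic platforms in 3D environments. It provides tools to design and generate highly dynamic evaluation worlds, scenarios, and tasks for autonomous navigation, and it is fully integrated into the Robot Operating System. To demonstrate the functionality of our suite, we trained a DRL agent on our platform and compared it against a variety of existing model-based and learning-based navigation approaches on a range of relevant metrics. Finally, we deployed the approaches on real robots and demonstrated the reproducibility of the results. The code is publicly available at github.com/ignc-research/arena-bench.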

Front-door Adjustment via Style Transfer for Out-of-distribution Generalisation

Toan Nguyen, Kien Do, Duc Thanh Nguyen, Bao Duong, Thin Nguyen

Categories: Computer Vision | Artificial Intelligence

2022-12-06

Out-of-distribution (OOD) generalisation aims to build a model that can generalise its learnt knowledge well from source domains to an unseen target domain. However, current image classification models often perform poorly in the OOD setting because they learn statistically spurious correlations during training. From a causality-based perspective, we formulate the data generation process in OOD image classification using a causal graph. On this graph, we show that the prediction P(Y|X) of a label Y given an image X in statistical learning is formed by both the causal effect P(Y|do(X)) and spurious effects caused by confounding features (e.g., background). Since the spurious features are domain-variant, the prediction P(Y|X) becomes unstable on unseen domains. In this paper, we propose to mitigate the spurious effect of confounders using front-door adjustment. In our method, the mediator variable is hypothesized to be the semantic features that are essential for determining the label of an image. Inspired by the capability of style transfer in image generation, we interpret the combination of the mediator variable with different generated images in the front-door formula and propose novel algorithms to estimate it. Extensive experimental results on widely used benchmark datasets verify the effectiveness of our method.
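
For reference, the textbook front-door adjustment formula that the abstract builds on can be written as below. The symbol Z for the mediator (the semantic features) and x' for the images being summed over are notation assumed here, not taken from the paper; the paper's own estimator may differ in form.

```latex
P(Y \mid do(X = x)) = \sum_{z} P(Z = z \mid X = x) \sum_{x'} P(Y \mid Z = z, X = x') \, P(X = x')
```

The inner sum over x' is the part the abstract describes as combining the mediator variable with different generated images.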

Massively Multilingual ASR on 70 Languages: Tokenization, Architecture, and Generalization Capabilities

Andros Tjandra, Nayan Singhal, David Zhang, Ozlem Kalinli, Abdelrahman Mohamed, Duc Le, Michael L. Seltzer

Category: Natural Language Processing

2022-11-10

End-to-end multilingual ASR has become more appealing for several reasons, such as a simpler training and deployment process and positive performance transfer from high-resource to low-resource languages. However, scaling up the number of languages, total hours, and number of unique tokens is not a trivial task. This paper explores large-scale multilingual ASR models on 70 languages. We inspect two architectures: (1) a shared embedding and output model and (2) a multiple embedding and output model. In the shared model experiments, we show the importance of the tokenization strategy across different languages. We then use our optimal tokenization strategy to train the multiple embedding and output model to further improve our results. Our multilingual ASR achieves a 13.9%-15.6% average relative WER improvement compared to monolingual models. We show that our multilingual ASR generalizes well to an unseen dataset and domain, achieving 9.5% and 7.5% WER on Multilingual Librispeech (MLS) with zero-shot and finetuning, respectively.

SG-Shuffle: Multi-aspect Shuffle Transformer for Scene Graph Generation

Anh Duc Bui, Soyeon Caren Han, Josiah Poon

Category: Computer Vision

2022-11-09

Scene Graph Generation (SGG) provides a comprehensive representation of images for human understanding as well as for visual understanding tasks. Due to the long-tail bias of the object and predicate labels in the available annotated data, the scene graphs generated by current methods can be biased toward common, non-informative relationship labels. Relationships are sometimes not mutually exclusive and can be described from multiple perspectives, such as geometric or semantic relationships, which makes it even more challenging to predict the most suitable relationship label. In this work, we propose the SG-Shuffle pipeline for scene graph generation with 3 components: 1) Parallel Transformer Encoder, which learns to predict object relationships in a more exclusive manner by grouping relationship labels into groups of similar purpose; 2) Shuffle Transformer, which learns to select the final relationship labels from the category-specific features generated in the previous step; and 3) Weighted CE loss, used to alleviate the training bias caused by the imbalanced dataset.
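
The third component, a weighted cross-entropy loss, can be illustrated with a minimal sketch. The abstract does not say how the class weights are chosen; the inverse-frequency scheme and the function names `class_weights` and `weighted_cross_entropy` below are assumptions for illustration only.

```python
import math


def class_weights(label_counts):
    """Assumed weighting scheme: total / (n_classes * count), so rare classes weigh more."""
    total = sum(label_counts)
    n_classes = len(label_counts)
    return [total / (n_classes * count) for count in label_counts]


def weighted_cross_entropy(logits, target, weights):
    """Weighted CE for a single sample: -weights[target] * log_softmax(logits)[target]."""
    peak = max(logits)  # subtract the max for numerical stability
    log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
    log_prob = logits[target] - log_sum_exp
    return -weights[target] * log_prob


# Toy example: a frequent predicate (90 samples) and a rare one (10 samples).
weights = class_weights([90, 10])
loss_common = weighted_cross_entropy([0.0, 0.0], 0, weights)
loss_rare = weighted_cross_entropy([0.0, 0.0], 1, weights)
```

With identical logits, a mistake on the rare label costs more than one on the frequent label, which is how the weighting counteracts long-tail bias.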

Online pseudo labeling for polyp segmentation with momentum networks

Toan Pham Van, Linh Bao Doan, Thanh Tung Nguyen, Duc Trung Tran, Quan Van Nguyen, Dinh Viet Sang

Category: Computer Vision

2022-09-29

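Semantic segmentation is an essential task in developing medical image diagnosis systems. However, building an annotated medical dataset is expensive, so semi-supervised methods are important in this setting. In semi-supervised learning, the quality of the labels plays a crucial role in model performance. In this work, we present a new pseudo-labeling strategy that improves the quality of the pseudo labels used to train the student network. We follow the multi-stage semi-supervised training approach, which trains a teacher model on a labeled dataset and then uses the trained teacher to render pseudo labels for student training. By doing so, the pseudo labels are updated and become more precise as training progresses. The key difference between previous methods and ours is that we update the teacher model during student training, so the quality of the pseudo labels improves over the course of student training. We also propose a simple but effective strategy to enhance pseudo-label quality using a momentum model, a slowly updated copy of the original model during training. By applying the momentum model combined with re-rendering pseudo labels during student training, we achieve an average Dice score of 84.1% on five datasets (i.e., Kvasir, CVC-ClinicDB, ETIS-LaribPolypDB, CVC-ColonDB, and CVC-300) with only 20% of the dataset used as labeled data. Our results surpass common practice by 3% and even reach fully supervised results on some datasets. Our source code and pre-trained models are available at https://github.com/sun-asterisk-research/online学习SSL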
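
A momentum model of this kind is commonly maintained as an exponential moving average (EMA) of the student's parameters. The sketch below assumes that formulation; the function name `ema_update`, the flat parameter list, and the momentum value are illustrative choices, not details from the paper.

```python
def ema_update(momentum_params, student_params, momentum=0.99):
    """Return updated momentum-model parameters: m * old + (1 - m) * student."""
    return [
        momentum * old + (1.0 - momentum) * new
        for old, new in zip(momentum_params, student_params)
    ]


# Toy example: the momentum model starts at 0 and slowly tracks a student fixed at 1.
momentum_model = [0.0]
student = [1.0]
history = []
for _ in range(3):
    momentum_model = ema_update(momentum_model, student, momentum=0.9)
    history.append(momentum_model[0])
```

Because the momentum model moves only a small step toward the student at each update, its predictions change smoothly, which is the usual argument for using it to render more stable pseudo labels.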

FedToken: Tokenized Incentives for Data Contribution in Federated Learning

Shashi Raj Pandey, Lam Duc Nguyen, Petar Popovski

Category: Machine Learning

2022-09-20

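Incentives that compensate for the costs involved in the decentralized training of a federated learning (FL) model are a key stimulus for clients' long-term participation. However, it is challenging to convince clients to participate with quality in FL because of the absence of: (i) full information about the quality and properties of clients' data; (ii) the value of clients' data contributions; and (iii) a trusted mechanism for offering monetary incentives. This often leads to poor training and communication efficiency. Although several works focus on strategic incentive design and client selection to overcome this problem, there is a major knowledge gap regarding an overall design that is tailored to the foreseen digital economy, including Web 3.0, while simultaneously meeting the learning objectives. To address this gap, we propose a contribution-based tokenized incentive scheme, namely FedToken, backed by blockchain technology, which ensures a fair allocation of tokens among clients that corresponds to the valuation of their data during model training. Leveraging an engineered Shapley-based scheme, we first approximate the contribution of local models during model aggregation, then strategically schedule clients to lower the number of communication rounds needed for convergence, and anchor ways to allocate affordable tokens under a constrained monetary budget. Extensive simulations demonstrate the efficacy of our proposed method.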
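
To make the Shapley-based idea concrete, here is a minimal sketch that computes exact Shapley values for a handful of clients and splits a token budget in proportion to them. It is not the paper's approximation scheme: the additive toy utility, the proportional allocation rule, and the names `shapley_values`, `allocate_tokens`, and `toy_utility` are all assumptions for illustration.

```python
from itertools import permutations


def shapley_values(clients, utility):
    """Exact Shapley values: average marginal gain of each client over all join orders."""
    values = {client: 0.0 for client in clients}
    orderings = list(permutations(clients))
    for order in orderings:
        coalition = frozenset()
        for client in order:
            joined = coalition | {client}
            values[client] += utility(joined) - utility(coalition)
            coalition = joined
    return {client: total / len(orderings) for client, total in values.items()}


def allocate_tokens(values, budget):
    """Assumed rule: split the budget in proportion to non-negative contributions."""
    positive = {client: max(value, 0.0) for client, value in values.items()}
    total = sum(positive.values())
    if total == 0:
        return {client: 0.0 for client in values}
    return {client: budget * value / total for client, value in positive.items()}


# Hypothetical utility: each client adds a fixed amount of model quality.
quality = {"a": 1.0, "b": 2.0, "c": 3.0}


def toy_utility(coalition):
    return sum(quality[client] for client in coalition)


contributions = shapley_values(list(quality), toy_utility)
tokens = allocate_tokens(contributions, budget=60.0)
```

Exact enumeration grows factorially with the number of clients, which is why practical schemes, including the one the abstract describes, rely on approximations.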

Learning ASR pathways: A sparse multilingual ASR model

Mu Yang, Andros Tjandra, Chunxi Liu, David Zhang, Duc Le, John H. L. Hansen, Ozlem Kalinli

Category: Natural Language Processing

2022-09-13

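Neural network pruning can be used effectively to compress automatic speech recognition (ASR) models. However, in multilingual ASR, language-agnostic pruning may cause severe performance degradation for some languages, because a language-agnostic pruning mask may not fit all languages and may discard important language-specific parameters. In this work, we present ASR pathways, a sparse multilingual ASR model that activates language-specific sub-networks ("pathways"), so that the parameters for each language are learned explicitly. With overlapping sub-networks, the shared parameters can also enable knowledge transfer to lower-resource languages through joint multilingual training. We propose a novel algorithm to learn ASR pathways and evaluate the proposed method on 4 languages with a streaming RNN-T model. Our proposed ASR pathways outperform both the dense models (-5.0% on average) and the language-agnostically pruned model (-21.4% on average), and they perform better on low-resource languages than monolingual sparse models.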
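
The notion of language-specific sub-networks with overlap can be pictured as one binary mask per language applied to a shared weight vector. The sketch below only illustrates that picture; the mask values, the flat weight list, and the names `apply_pathway` and `shared_positions` are hypothetical and say nothing about how the paper learns its masks.

```python
def apply_pathway(shared_weights, mask):
    """Keep only the weights that the language's binary mask activates."""
    return [weight * bit for weight, bit in zip(shared_weights, mask)]


def shared_positions(masks):
    """Indices that are active in every language's mask (the overlapping parameters)."""
    return [
        index
        for index, bits in enumerate(zip(*masks.values()))
        if all(bits)
    ]


# Hypothetical shared weights and per-language masks.
shared_weights = [0.5, -1.0, 2.0, 0.3]
masks = {"en": [1, 1, 0, 1], "fr": [1, 0, 1, 1]}

english_subnetwork = apply_pathway(shared_weights, masks["en"])
overlap = shared_positions(masks)
```

Parameters at the overlapping indices receive updates from every language during joint training, which is where the cross-lingual transfer described in the abstract would come from.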

Rare but Severe Neural Machine Translation Errors Induced by Minimal Deletion: An Empirical Study on Chinese and English

Ruikang Shi, Alvin Grissom II, Duc Minh Trinh

Categories: Natural Language Processing | Artificial Intelligence | Machine Learning

2022-09-05

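We examine the inducement of rare but severe errors in English-Chinese and Chinese-English in-domain neural machine translation by making minimal deletions to the source text with character-based models. By deleting a single character, we find that we can induce severe errors in the translation. We categorize these errors and compare the results of deleting single characters and single words. We also examine the effect of training data size on the number and types of pathological cases induced by these minimal perturbations, finding significant variation.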
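
The two perturbation types compared in the abstract can be generated mechanically. This is a minimal sketch with hypothetical helper names; it splits words on whitespace, which works for English but is only an assumption for Chinese, where a word segmenter would be needed.

```python
def single_char_deletions(text):
    """All variants of `text` with exactly one character removed."""
    return [text[:i] + text[i + 1:] for i in range(len(text))]


def single_word_deletions(sentence):
    """All variants of `sentence` with exactly one whitespace-separated word removed."""
    words = sentence.split()
    return [" ".join(words[:i] + words[i + 1:]) for i in range(len(words))]


char_variants = single_char_deletions("cat")
word_variants = single_word_deletions("a b c")
```

Each variant would then be translated and compared with the translation of the unperturbed source to find the rare cases where a tiny deletion causes a severe error.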

Orthogonal Gated Recurrent Unit with Neumann-Cayley Transformation

Edison Mucllari, Vasily Zadorozhnyy, Cole Pospisil, Duc Nguyen, Qiang Ye

Category: Machine Learning

2022-08-12

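In recent years, using orthogonal matrices has been shown to be a promising approach for improving the training, stability, and convergence of recurrent neural networks (RNNs), particularly for controlling gradients. By using a variety of gates and memory cells, the Gated Recurrent Unit (GRU) and Long Short-Term Memory (LSTM) architectures address the vanishing gradient problem, but they are still prone to the exploding gradient problem. In this work, we analyze the gradients in GRU and propose the use of orthogonal matrices to prevent exploding gradients and enhance long-term memory. We study where to use orthogonal matrices and propose a Neumann series-based scaled Cayley transformation for training orthogonal matrices in GRU, which we call Neumann-Cayley Orthogonal GRU, or simply NC-GRU. We present detailed experiments with our model on several synthetic and real-world tasks, which show that NC-GRU significantly outperforms GRU as well as several other RNNs.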
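
One plausible reading of a "Neumann series-based scaled Cayley transformation" is sketched below: the scaled Cayley transform W = (I + A)^{-1} (I - A) D maps a skew-symmetric A and a diagonal D with entries of +1 or -1 to an orthogonal W, and the inverse is approximated by the truncated Neumann series of powers of -A. This is an assumption about the construction, not the paper's exact algorithm; the series converges only when A is small enough (spectral radius below 1), and all function names are illustrative.

```python
def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def matmul(X, Y):
    return [
        [sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
        for i in range(len(X))
    ]


def neumann_inverse_i_plus_a(A, terms):
    """Approximate (I + A)^-1 by I + (-A) + (-A)^2 + ... using `terms` terms."""
    n = len(A)
    neg_a = [[-entry for entry in row] for row in A]
    total = identity(n)
    power = identity(n)
    for _ in range(1, terms):
        power = matmul(power, neg_a)
        total = [[total[i][j] + power[i][j] for j in range(n)] for i in range(n)]
    return total


def scaled_cayley(A, d, terms=30):
    """W = (I + A)^-1 (I - A) D, with D = diag(d) and every d[j] equal to +1 or -1."""
    n = len(A)
    inverse = neumann_inverse_i_plus_a(A, terms)
    i_minus_a = [[(1.0 if i == j else 0.0) - A[i][j] for j in range(n)] for i in range(n)]
    W = matmul(inverse, i_minus_a)
    # Right-multiplying by a diagonal matrix scales column j by d[j].
    return [[W[i][j] * d[j] for j in range(n)] for i in range(n)]


# Small skew-symmetric example (A^T = -A) with entries well below 1.
A = [[0.0, 0.3], [-0.3, 0.0]]
W = scaled_cayley(A, [1.0, -1.0])
```

Replacing the exact inverse with a short series avoids a matrix inversion at every training step, at the cost of being only approximately orthogonal when few terms are used.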

FedDRL: Deep Reinforcement Learning-based Adaptive Aggregation for Non-IID Data in Federated Learning

Nang Hung Nguyen, Phi Le Nguyen, Duc Long Nguyen, Trung Thanh Nguyen, Thuy Dung Nguyen, Huy Hieu Pham, Truong Thao Nguyen

Categories: Machine Learning | Computer Vision

2022-08-04

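The uneven distribution of local data across different edge devices (clients) leads to slow model training and reduced accuracy in federated learning. The naive federated learning (FL) strategy and most alternative solutions attempt to achieve more fairness by weighted aggregation of deep learning models across clients. This work introduces a novel non-IID type encountered in real-world datasets, namely cluster-skew, in which groups of clients have local data with similar distributions, causing the global model to converge to an over-fitted solution. To deal with non-IID data, particularly cluster-skewed data, we propose FedDRL, a novel FL model that employs deep reinforcement learning to adaptively determine each client's impact factor (which is used as the weight in the aggregation process). Extensive experiments on a suite of federated datasets confirm that the proposed FedDRL compares favorably with the FedAvg and FedProx methods, e.g., improving by up to 4.05% and 2.17% on average on the CIFAR-100 dataset, respectively.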
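
The aggregation step that the impact factors feed into can be sketched as a normalized weighted average of client parameters. In FedDRL the factors would come from the reinforcement learning agent; here they are supplied by hand, and the function name `aggregate` and the flat parameter lists are illustrative assumptions.

```python
def aggregate(client_models, impact_factors):
    """Weighted average of client parameter vectors, with weights normalized to sum to 1."""
    total = sum(impact_factors)
    weights = [factor / total for factor in impact_factors]
    n_params = len(client_models[0])
    return [
        sum(weight * model[p] for weight, model in zip(weights, client_models))
        for p in range(n_params)
    ]


# Two hypothetical clients with two parameters each.
client_models = [[1.0, 2.0], [3.0, 4.0]]
uniform_model = aggregate(client_models, [1.0, 1.0])
skewed_model = aggregate(client_models, [1.0, 3.0])
```

With equal factors this reduces to a plain average of the clients, while unequal factors let the server down-weight clients whose data belong to an over-represented cluster.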
