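In this paper, we propose a novel benchmark called the StarCraft Multi-Agent Challenge+ (SMAC+), in which agents learn to perform multi-stage tasks and to exploit environmental factors without precise reward functions. The previous challenge (SMAC), regarded as a standard benchmark for multi-agent reinforcement learning (MARL), is mainly concerned with ensuring that all agents cooperatively eliminate approaching adversaries through fine manipulation alone, under obvious reward functions. This challenge, by contrast, targets the exploration capability of MARL algorithms: their ability to efficiently learn implicit multi-stage tasks and environmental factors in addition to micro-control. The study covers both offensive and defensive scenarios. In the offensive scenarios, agents must learn to first find opponents and then eliminate them. The defensive scenarios require agents to use topographic features; for example, agents need to position themselves behind protective structures to make it harder for enemies to attack. We investigate MARL algorithms under SMAC+ and observe that recent approaches work well in settings similar to the previous challenge but perform poorly in the offensive scenarios. We also observe that an enhanced exploration approach has a positive effect on performance but cannot completely solve all scenarios. This study proposes new directions for future research.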
translated by 谷歌翻译
Intonation plays an important role in conveying the intention of a speaker. However, current end-to-end TTS systems often fail to model proper intonation. To alleviate this problem, we propose a novel, intuitive method to synthesize speech in different intonations using predefined intonation templates. Prior to TTS model training, speech data are grouped into intonation templates in an unsupervised manner. Two proposed modules are added to the end-to-end TTS framework: an intonation predictor and an intonation encoder. The intonation predictor recommends a suitable intonation template for the given text. The intonation encoder, attached to the text encoder output, synthesizes speech that abides by the requested intonation template. The main contributions of our paper are: (a) an easy-to-use intonation control system covering a wide range of users; (b) better performance in wrapping speech in a requested intonation, with improved objective and subjective evaluation; and (c) the incorporation of a pre-trained language model for intonation modelling. Audio samples are available at https://srtts.github.io/IntoTTS.
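Here, we present a new approach for generating smooth control sequences in Model Predictive Path Integral control (MPPI) tasks without any additional smoothing algorithm. Our method effectively alleviates chattering in sampling, while the information-theoretic derivation of MPPI remains the same. We demonstrate the proposed method on a challenging autonomous driving task, with a quantitative evaluation against different algorithms. A neural network vehicle model for estimating system dynamics under varying road friction conditions is also presented. Our video can be found at https://youtu.be/o3nmi0ujfqg.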
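In this paper, we propose a self-supervised speaker representation learning strategy that comprises bootstrap equilibrium speaker representation learning in the front-end and uncertainty-aware probabilistic speaker embedding training in the back-end. In the front-end stage, we learn speaker representations via a bootstrap training scheme with a uniformity regularization term. In the back-end stage, probabilistic speaker embeddings are estimated by maximizing the mutual likelihood score between speech samples belonging to the same speaker, which provides not only speaker representations but also data uncertainty. Experimental results show that the proposed bootstrap equilibrium training strategy effectively helps learn speaker representations and outperforms conventional methods based on contrastive learning. We also show that the integrated two-stage framework further improves speaker verification performance on the VoxCeleb1 test set in terms of EER and MinDCF.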
Traversability estimation for mobile robots in off-road environments requires more than the conventional semantic segmentation used in constrained environments such as on-road conditions. Recently, approaches that learn traversability estimation from past driving experiences in a self-supervised manner have been emerging, as they can significantly reduce human labeling costs and labeling errors. However, self-supervised data only provide supervision for the regions actually traversed, inducing epistemic uncertainty due to the scarcity of negative information. Negative data are rarely harvested because the system can be severely damaged while logging them. To mitigate this uncertainty, we introduce a deep metric learning-based method that incorporates unlabeled data together with a few positive and negative prototypes, jointly learning semantic segmentation and traversability regression. To rigorously evaluate the proposed framework, we introduce a new evaluation metric that comprehensively evaluates both the segmentation and the regression. Additionally, we construct a driving dataset, 'Dtrail', in off-road environments with a mobile robot platform; it contains a wide variety of negative data. We examine our method on Dtrail as well as on the publicly available SemanticKITTI dataset.
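Federated learning is a privacy-preserving approach that avoids revealing private data by transferring models, rather than personal and private data, from local client devices. For the global model, however, it is crucial to identify whether each client's local data are normal. This paper proposes a method that separates normal locals from abnormal locals by clustering, according to Euclidean similarity, the vectors extracted by feeding dummy data into the local models. In a federated classification model, the method divides the locals into normal and abnormal.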
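The separation step described above can be illustrated with a minimal sketch. It is not the paper's implementation: the toy linear "local models", the dummy input, the single-linkage clustering into two groups, and the rule that the larger group is the normal one are all assumptions made for illustration.

```python
import math
from typing import Callable, List, Sequence

Vector = List[float]


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def split_normal_abnormal(outputs: Sequence[Vector]) -> List[bool]:
    """Cluster client output vectors into two groups (single linkage).

    Returns True for clients in the larger group. Treating the larger
    group as "normal" is an assumption of this sketch.
    """
    clusters = [[i] for i in range(len(outputs))]
    while len(clusters) > 2:
        best = None  # (distance, index_a, index_b) of the closest pair
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(
                    euclidean(outputs[i], outputs[j])
                    for i in clusters[a]
                    for j in clusters[b]
                )
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] += clusters[b]
        del clusters[b]
    normal = max(clusters, key=len)
    return [i in normal for i in range(len(outputs))]


def make_linear_model(weights: List[Vector]) -> Callable[[Vector], Vector]:
    """Hypothetical stand-in for a trained local model."""
    def model(x: Vector) -> Vector:
        return [sum(w * xi for w, xi in zip(row, x)) for row in weights]
    return model


# The same dummy input is fed to every local model (hypothetical values).
dummy = [1.0, 1.0]
models = [
    make_linear_model([[1.0, 0.0], [0.0, 1.0]]),
    make_linear_model([[1.1, 0.0], [0.0, 0.9]]),
    make_linear_model([[0.9, 0.1], [0.1, 1.0]]),
    make_linear_model([[-5.0, 4.0], [6.0, -3.0]]),  # deviating client
]
outputs = [m(dummy) for m in models]
flags = split_normal_abnormal(outputs)
```

Because every model sees the same dummy input, differences between the output vectors reflect differences between the models themselves, so no private data has to leave the clients for this comparison.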
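Distributional reinforcement learning achieves state-of-the-art performance in continuous and discrete control settings and offers the properties of variance and risk, which can be used for exploration. However, although many exploration methods in distributional RL employ the variance of the return distribution per action, exploration methods that employ the risk property are hard to find. In this paper, we present risk scheduling approaches that explore risk levels and optimistic behaviors from a risk perspective. Through comprehensive experiments, we demonstrate the performance improvement of the DMIX algorithm when risk scheduling is used in a multi-agent setting.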
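One way to picture exploring through a scheduled risk level is the sketch below. The abstract does not specify the schedule or the risk measure, so everything here is an assumption: a linear schedule, an upper-tail mean over sampled return quantiles as the optimistic risk measure, and the function names `risk_level`, `upper_tail_mean`, and `select_action` are all hypothetical.

```python
import math
from typing import List, Sequence


def risk_level(step: int, total_steps: int,
               start: float = 0.25, end: float = 1.0) -> float:
    """Linearly anneal the risk level from `start` to `end` (assumed schedule)."""
    frac = min(max(step / total_steps, 0.0), 1.0)
    return start + frac * (end - start)


def upper_tail_mean(quantiles: Sequence[float], alpha: float) -> float:
    """Mean of the top `alpha` fraction of return quantiles.

    alpha < 1 is optimistic (risk-seeking); alpha = 1 is the plain mean.
    """
    ordered = sorted(quantiles, reverse=True)
    k = max(1, math.ceil(alpha * len(ordered)))
    top = ordered[:k]
    return sum(top) / len(top)


def select_action(quantiles_per_action: Sequence[Sequence[float]],
                  alpha: float) -> int:
    """Pick the action with the highest risk-adjusted value."""
    values: List[float] = [upper_tail_mean(q, alpha) for q in quantiles_per_action]
    return max(range(len(values)), key=values.__getitem__)


# Toy return distributions: action 0 is safe, action 1 is risky.
quantiles = [
    [1.0, 1.0, 1.0, 1.0],
    [-2.0, 0.0, 0.0, 4.0],
]
early_action = select_action(quantiles, risk_level(0, 100))    # optimistic phase
late_action = select_action(quantiles, risk_level(100, 100))   # risk-neutral phase
```

In this toy case the optimistic early phase favors the risky action because of its high upper tail, while the risk-neutral late phase falls back to the action with the higher mean.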
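Graph neural networks (GNNs) have significantly improved the representation power for graph-structured data. Despite the recent success of GNNs, the graph convolution in most GNNs has two limitations. First, since graph convolution is performed in a small local neighborhood on the input graph, it is inherently incapable of capturing long-range dependencies between distant nodes. Second, when a node has neighbors that belong to different classes (i.e., heterophily), the messages aggregated from them often harm representation learning. To address these two common problems of graph convolution, we propose Deformable Graph Convolutional Networks (Deformable GCNs), which adaptively perform convolution in multiple latent spaces and capture short- and long-range dependencies between nodes. Separately from the node representations (features), our framework simultaneously learns node positional embeddings (coordinates) to determine the relations between nodes in an end-to-end fashion. Depending on node position, the convolution kernels are deformed by deformation vectors and apply different transformations to neighboring nodes. Our extensive experiments demonstrate that Deformable GCNs handle heterophily flexibly and achieve the best performance in node classification tasks on six heterophilic graph datasets.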
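We consider the class-incremental learning (CIL) problem, in which a learning agent continually learns new classes from incrementally arriving batches of training data and aims to predict well on all classes learned so far. The main challenge of the problem is catastrophic forgetting. For exemplar-memory-based CIL methods, it is generally known that forgetting is commonly caused by a classification score bias that is injected by the data imbalance between the new classes and the old classes (held in the exemplar memory). Although several methods have been proposed to correct this score bias through additional post-processing (e.g., score re-scaling or balanced fine-tuning), no systematic analysis of the root cause of the bias has been carried out. To that end, we analyze how computing the softmax probabilities by combining the output scores of all old and new classes could be the main cause of the bias. We then propose a new method, dubbed Separated Softmax for Incremental Learning (SS-IL), which consists of a separated softmax (SS) output layer combined with task-wise knowledge distillation (TKD) to resolve this bias. Through extensive experiments on several large-scale CIL benchmark datasets, we show that SS-IL achieves strong state-of-the-art accuracy by attaining much more balanced prediction scores, without any additional post-processing.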
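The contrast between a joint softmax and a separated one can be sketched as follows. This is only an illustration of the idea of normalizing old-class and new-class scores separately; the function names and toy logits are hypothetical, and the paper's actual loss and the task-wise distillation term are not reproduced here.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over one group of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def separated_softmax(logits: Sequence[float], num_old: int) -> List[float]:
    """Normalize old-class and new-class logits independently.

    The first `num_old` entries are old classes, the rest are new classes.
    Each group sums to 1 on its own, so large new-class scores cannot
    push down the old-class probabilities.
    """
    if not 0 < num_old < len(logits):
        raise ValueError("num_old must split the logits into two non-empty groups")
    return softmax(logits[:num_old]) + softmax(logits[num_old:])


# Toy logits: two old classes followed by two new classes whose scores
# are inflated, as can happen under data imbalance.
logits = [2.0, 1.0, 5.0, 4.0]
joint = softmax(logits)
separated = separated_softmax(logits, num_old=2)
```

With the joint softmax, the inflated new-class scores dominate the normalizer and the old classes receive very little probability mass; with the separated version, the old-class probabilities depend only on the old-class scores.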
3D-aware image synthesis focuses on preserving spatial consistency in addition to generating high-resolution images with fine details. Recently, Neural Radiance Field (NeRF) has been introduced for synthesizing novel views with low computational cost and superior performance. While several works investigate generative NeRFs and show remarkable results, they cannot handle conditional and continuous feature manipulation in the generation procedure. In this work, we introduce a novel model, called Class-Continuous Conditional Generative NeRF ($\text{C}^{3}$G-NeRF), which can synthesize conditionally manipulated, photorealistic, 3D-consistent images by projecting conditional features to the generator and the discriminator. The proposed $\text{C}^{3}$G-NeRF is evaluated on three image datasets: AFHQ, CelebA, and Cars. Our model shows strong 3D consistency with fine details and smooth interpolation in conditional feature manipulation. For instance, $\text{C}^{3}$G-NeRF achieves a Fr\'echet Inception Distance (FID) of 7.64 in 3D-aware face image synthesis at a $128^{2}$ resolution. Additionally, since $\text{C}^{3}$G-NeRF can synthesize class-conditional images, we report FIDs of the generated 3D-aware images for each class of the datasets.