The advent of machine learning models that surpass human decision-making ability in complex domains has initiated a movement toward building AI systems that interact with humans. Many building blocks are essential to this activity, a central one being the algorithmic characterization of human behavior. While much of the existing work focuses on aggregate human behavior, an important long-range goal is to develop behavioral models that specialize to individual people and can distinguish among them. To formalize this process, we study the problem of behavioral stylometry, in which the task is to identify a decision-maker from their decisions alone. We present a transformer-based approach to behavioral stylometry in the context of chess, where one attempts to identify the player who played a given set of games. Our method operates in a few-shot classification framework and can correctly identify a player from among thousands of candidates with 98% accuracy given only 100 labeled games. Even though it is trained on amateur games, our method generalizes to out-of-distribution samples of Grandmaster players, despite the substantial differences between amateur and world-class play. Finally, we consider more broadly what our resulting embeddings reveal about human style in chess, as well as the potential ethical implications of powerful methods for identifying individuals in behavioral data.
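As a rough illustration of the few-shot identification setting described above (not the paper's actual transformer pipeline), a player can be identified by comparing the centroid of the query games' embeddings against per-player reference centroids. The embedding vectors here are abstract placeholders; in the paper they would come from the learned game encoder.

```python
import math

def centroid(vectors):
    """Average a list of equal-length embedding vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def identify(query_games, reference_sets):
    """Return the candidate player whose reference centroid is closest
    (by cosine similarity) to the centroid of the query game embeddings."""
    q = centroid(query_games)
    return max(reference_sets,
               key=lambda player: cosine(q, centroid(reference_sets[player])))
```

With a strong game encoder, this nearest-centroid rule is one simple way the "identify the player from 100 labeled games" task could be scored.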
An emerging theme in artificial intelligence research is the creation of models that simulate the decisions and behavior of specific people, in domains including game playing, text generation, and artistic expression. These models go beyond earlier approaches in the way they are tailored to individuals, and in the way they are designed for interaction rather than simply the reproduction of fixed, pre-computed behaviors. We refer to these as mimetic models, and in this paper we develop a framework for characterizing the ethical and social issues raised by their growing availability. Our framework includes a number of distinct scenarios for the use of such models, and considers the impacts on a range of different participants, including the target being modeled, the operator who deploys the model, and the entities that interact with it.
AI systems that can capture human-like behavior are becoming increasingly useful in situations where humans may want to learn from these systems, collaborate with them, or engage with them as partners. To develop human-oriented AI systems, the problem of predicting human actions, as opposed to predicting optimal actions, has received considerable attention. Existing work has focused on capturing human behavior in an aggregate sense, which potentially limits the benefit any particular individual can gain from interacting with these systems. We extend this line of work by developing highly accurate predictive models of individual human behavior in chess. Chess is a rich domain for exploring human-AI interaction because it combines a unique set of properties: AI systems achieved superhuman performance many years ago, yet humans still interact with them closely, both as opponents and as preparation tools, and there is an enormous corpus of recorded data on individual players' games. Starting with Maia, a version of AlphaZero trained on a population of human players, we demonstrate that we can significantly improve prediction accuracy for a particular player's moves by applying a series of fine-tuning methods. Furthermore, our personalized models can be used to perform stylometry, that is, to predict who made a given set of moves, indicating that they capture human decision-making at the individual level. Our work demonstrates a way to bring AI systems into better alignment with the behavior of individual people, which could lead to substantial improvements in human-AI interaction.
While prior work has established that the use of parallel data is conducive to cross-lingual learning, it is unclear whether the improvements come from the data itself, or whether it is the modeling of parallel interactions that matters. To explore this, we examine the use of unsupervised machine translation to generate synthetic parallel data, and compare it to supervised machine translation and gold parallel data. We find that even model-generated parallel data can be useful for downstream tasks, in both a general setting (continued pretraining) and a task-specific setting (translate-train), although our best results are still obtained using real parallel data. Our findings suggest that existing multilingual models do not exploit the full potential of monolingual data, and prompt the community to reconsider the traditional categorization of cross-lingual learning approaches.
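The two data-construction settings mentioned above can be sketched in a few lines. This is a schematic, not the paper's code: `translate` stands in for any MT system (supervised, unsupervised, or an identity "gold" lookup), and the dataset shapes are hypothetical.

```python
def translate_train(train_set, translate):
    """Translate-train: build a target-language task dataset by
    machine-translating the inputs of a labeled source-language dataset."""
    return [(translate(text), label) for text, label in train_set]

def continued_pretrain_corpus(mono_corpus, translate):
    """Continued pretraining: pair each monolingual sentence with its
    (possibly synthetic) translation, yielding parallel text."""
    return [(src, translate(src)) for src in mono_corpus]
```

The paper's question is then which of these pipelines, fed with synthetic versus gold translations, accounts for the downstream gains.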
Logical reasoning over text is an important ability that requires understanding the information present in the text, its interconnections, and then reasoning through it to infer new conclusions. Prior work on improving the logical reasoning ability of language models requires complex processing of training data (e.g., aligning symbolic knowledge to text), yielding task-specific data-augmentation solutions that restrict the learning of general logical reasoning skills. In this work, we propose APOLLO, an adaptively pretrained language model with improved logical reasoning abilities. We select a subset of Wikipedia, based on a set of logical inference keywords, for continued pretraining of a language model. We use two self-supervised loss functions: a modified masked language modeling loss in which only specific parts-of-speech words, which would likely require more reasoning than basic language understanding, are masked, and a sentence-level classification loss that teaches the model to distinguish between entailment and contradiction types of sentences. The proposed training paradigm is both simple and independent of task formats. We demonstrate the effectiveness of APOLLO by comparing it with prior baselines on two logical reasoning datasets. APOLLO performs comparably on ReClor and outperforms baselines on LogiQA.
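The selective masking idea can be illustrated as follows. This is a sketch, not APOLLO's implementation: the particular set of "reasoning-bearing" tags (here Universal POS tags for verbs, adverbs, and conjunctions) and the pre-tagged input format are assumptions for illustration; a real pipeline would run a POS tagger first.

```python
# Assumed (hypothetical) set of POS tags treated as reasoning-bearing.
REASONING_TAGS = {"VERB", "ADV", "ADP", "CCONJ", "SCONJ"}

def selective_mask(tagged_tokens, mask_token="[MASK]"):
    """Mask only tokens whose part of speech is likely to carry reasoning
    (verbs, adverbs, conjunctions, ...), leaving other words intact."""
    return [mask_token if tag in REASONING_TAGS else tok
            for tok, tag in tagged_tokens]
```

Predicting the masked slots then forces the model to recover connective and action words rather than easy content nouns.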
Autonomous vehicles are being deployed with a spectrum of capability, extending from driver assistance features for the highway in personal vehicles (SAE Level 2+) to fully autonomous fleet ride sharing services operating in complex city environments (SAE Level 4+). This spectrum of autonomy often operates in different physical environments with different degrees of assumed driver in-the-loop oversight, and hence has very different system and subsystem requirements. At the heart of SAE Level 2 to 5 systems is localization and mapping, which ranges from road determination for feature geofencing or high-level routing, through lane determination for advanced driver assistance, to where-in-lane positioning for full vehicle control. We assess localization and mapping requirements for different levels of autonomy and supported features. This work provides a framework for system decomposition, including the level of redundancy needed to achieve the target level of safety. We examine several representative autonomous and assistance features and make recommendations on positioning requirements as well as map georeferencing and information integrity.
Self-supervised monocular depth estimation has shown impressive results in static scenes. However, it relies on the multi-view consistency assumption for training networks, which is violated in dynamic object regions and occlusions. Consequently, existing methods show poor accuracy in dynamic scenes, and the estimated depth map is blurred at object boundaries because these are usually occluded in other training views. In this paper, we propose SC-DepthV3 to address these challenges. Specifically, we introduce an external pretrained monocular depth estimation model to generate a single-image depth prior, namely pseudo-depth, based on which we propose novel losses to boost self-supervised training. As a result, our model can predict sharp and accurate depth maps, even when training from monocular videos of highly dynamic scenes. We demonstrate the significantly superior performance of our method over previous methods on six challenging datasets, and we provide detailed ablation studies for the proposed terms. Source code and data will be released at https://github.com/JiawangBian/sc_depth_pl
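One common way a pseudo-depth prior can supervise a self-supervised network, sketched here only as an illustration (the paper's actual loss terms may differ), is a pairwise ranking loss: at sampled pixel pairs, the predicted depth ordering must agree with the ordering given by the pseudo-depth.

```python
def ranking_loss(pred, pseudo, pairs, margin=0.0):
    """Penalise predicted depths whose ordering at sampled pixel pairs
    disagrees with the ordering given by the pseudo-depth prior.
    `pred` and `pseudo` are flat lists of per-pixel depths; `pairs`
    holds (i, j) index pairs sampled from the image."""
    loss = 0.0
    for i, j in pairs:
        sign = 1.0 if pseudo[i] > pseudo[j] else -1.0  # ordering from the prior
        # hinge: we want sign * (pred[i] - pred[j]) to exceed the margin
        loss += max(0.0, margin - sign * (pred[i] - pred[j]))
    return loss / len(pairs)
```

Because only relative order is constrained, the (scale-ambiguous) pseudo-depth can guide the network without fixing its absolute scale.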
Systems for person re-identification (ReID) can achieve a high accuracy when trained on large fully-labeled image datasets. However, the domain shift typically associated with diverse operational capture conditions (e.g., camera viewpoints and lighting) may translate to a significant decline in performance. This paper focuses on unsupervised domain adaptation (UDA) for video-based ReID - a relevant scenario that is less explored in the literature. In this scenario, the ReID model must adapt to a complex target domain defined by a network of diverse video cameras based on tracklet information. State-of-the-art methods cluster unlabeled target data, yet domain shifts across target cameras (sub-domains) can lead to poor initialization of clustering methods that propagates noise across epochs, thus preventing the ReID model from accurately associating samples of the same identity. In this paper, a UDA method is introduced for video person ReID that leverages knowledge of video tracklets, and of the distribution of frames captured over target cameras, to improve the performance of CNN backbones trained using pseudo-labels. Our method relies on an adversarial approach, where a camera-discriminator network is introduced to extract discriminant camera-independent representations, facilitating the subsequent clustering. In addition, a weighted contrastive loss is proposed to leverage the confidence of clusters and mitigate the risk of incorrect identity associations. Experimental results obtained on three challenging video-based person ReID datasets - PRID2011, iLIDS-VID, and MARS - indicate that our proposed method can outperform related state-of-the-art methods. Our code is available at: \url{https://github.com/dmekhazni/CAWCL-ReID}
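The confidence-weighting idea can be sketched as an InfoNCE-style loss scaled by a per-cluster confidence weight. This is an illustrative form, not the paper's exact loss; the weight `w` would come from some cluster-quality measure, which is an assumption here.

```python
import math

def weighted_contrastive(anchor, positive, negatives, w, tau=0.1):
    """InfoNCE-style contrastive loss where the term is scaled by the
    confidence w of the pseudo-label cluster the positive pair comes from."""
    def sim(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        nu = math.sqrt(sum(a * a for a in u))
        nv = math.sqrt(sum(b * b for b in v))
        return dot / (nu * nv)

    pos = math.exp(sim(anchor, positive) / tau)
    denom = pos + sum(math.exp(sim(anchor, n) / tau) for n in negatives)
    return -w * math.log(pos / denom)
```

Low-confidence clusters thus contribute less gradient, limiting the damage from noisy pseudo-label associations.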
Human and robot partners increasingly need to work together to perform tasks as a team. Robots designed for such collaboration must reason about how their task-completion strategies interplay with the behavior and skills of their human team members as they coordinate on achieving joint goals. Our goal in this work is to develop a computational framework for robot adaptation to human partners in human-robot team collaborations. We first present an algorithm for autonomously recognizing available task-completion strategies by observing human-human teams performing a collaborative task. By transforming team actions into low-dimensional representations using hidden Markov models, we can identify strategies without prior knowledge. Robot policies are learned on each of the identified strategies to construct a Mixture-of-Experts model that adapts to the task strategies of unseen human partners. We evaluate our model on a collaborative cooking task using an Overcooked simulator. Results of an online user study with 125 participants demonstrate that our framework improves the task performance and collaborative fluency of human-agent teams, as compared to state-of-the-art reinforcement learning methods.
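At inference time, a Mixture-of-Experts policy of the kind described can blend per-strategy experts by the inferred probability of each strategy. The sketch below abstracts away the HMM (its posterior over strategies is taken as given) and uses hypothetical expert policies mapping a state to action scores; it is an illustration, not the paper's code.

```python
def moe_action(posterior, experts, state):
    """Blend per-strategy expert policies by the inferred probability of
    each team strategy, then return the highest-scoring action.
    `posterior` maps strategy -> probability (e.g. from an HMM);
    `experts` maps strategy -> policy(state) -> {action: score}."""
    scores = {}
    for strategy, p in posterior.items():
        for action, s in experts[strategy](state).items():
            scores[action] = scores.get(action, 0.0) + p * s
    return max(scores, key=scores.get)
```

As the partner's behavior shifts the posterior between strategies, the blended policy shifts with it, which is what lets the robot adapt to unseen partners.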
Event sensing is a major component of bio-inspired flight guidance and control systems. We explore the use of event cameras for estimating time-to-contact (TTC) with the surface during ventral landing. This is achieved by estimating divergence (inverse TTC), i.e., the rate of radial optic flow, from the event stream generated during the landing. Our core contributions are a novel contrast maximisation formulation for event-based divergence estimation, and a branch-and-bound algorithm that exactly maximises the contrast and finds the optimal divergence value. GPU acceleration is employed to speed up the global algorithm. A further contribution is a new dataset containing real event streams from ventral landings, which we use to test and benchmark our method. Owing to the global optimisation, our algorithm is much more capable of recovering the true divergence than other heuristic divergence estimators or event-based optic flow methods. With GPU acceleration, our method also achieves competitive runtimes.
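The contrast maximisation principle can be demonstrated with a toy example: warp each event back to a reference time under a candidate divergence, accumulate the warped events into an image, and score the candidate by the image's variance; the true divergence collapses the events and maximises contrast. The sketch below substitutes a plain grid search for the paper's exact branch-and-bound, and the exponential radial warp model is an illustrative assumption.

```python
import math

def contrast(events, rho, size=33):
    """Warp events (x, y, t), taken relative to the image centre, back to
    t = 0 under candidate divergence rho, and return the variance
    (contrast) of the resulting event-count image."""
    c = size // 2
    img = [[0] * size for _ in range(size)]
    for x, y, t in events:
        s = math.exp(-rho * t)  # undo radial expansion exp(rho * t)
        xi = c + int(round(x * s))
        yi = c + int(round(y * s))
        if 0 <= xi < size and 0 <= yi < size:
            img[yi][xi] += 1
    n = size * size
    mean = sum(map(sum, img)) / n
    return sum((v - mean) ** 2 for row in img for v in row) / n

def best_divergence(events, candidates):
    """Grid-search stand-in for the paper's exact branch-and-bound."""
    return max(candidates, key=lambda rho: contrast(events, rho))
```

On synthetic events generated by an expanding pattern, the candidate matching the true divergence yields the sharpest warped image.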