Many underwater tasks, such as cable and wreckage inspection and search-and-rescue, benefit from robust human-robot interaction (HRI) capabilities. With recent advances in vision-based underwater HRI methods, autonomous underwater vehicles (AUVs) can communicate with their human partners even during a mission. However, these interactions usually require active participation, especially from the human (e.g., the human must keep looking at the robot throughout the interaction). Therefore, an AUV must know when to initiate an interaction with its human partner, i.e., whether the human is paying attention to the AUV. In this paper, we present a diver attention estimation framework for AUVs to autonomously detect the attentiveness of a diver and then (if needed) navigate and reposition itself with respect to the diver to initiate an interaction. The core element of the framework is a deep neural network (called DATT-Net) that exploits the geometric relations among 10 facial keypoints of a diver to determine their head orientation. Our experimental evaluations (using unseen data) demonstrate that the proposed DATT-Net architecture can determine the attentiveness of human divers with promising accuracy. Our real-world experiments also confirm the efficacy of DATT-Net: it performs inference in real time and enables the AUV to position itself for AUV-diver interaction.
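The keypoint-geometry idea can be sketched in a few lines: pairwise distances among facial keypoints form a scale-normalized, pose-sensitive feature vector that a classifier could map to head orientation. This is a minimal illustration only, not the actual DATT-Net; the helper name and the toy keypoint coordinates are hypothetical.

```python
import numpy as np

def keypoint_features(kps):
    """Flatten the pairwise distances among facial keypoints into a
    pose-sensitive feature vector (a hypothetical stand-in for the
    learned geometric reasoning described in the abstract)."""
    kps = np.asarray(kps, dtype=float)          # shape (10, 2): x, y per keypoint
    diffs = kps[:, None, :] - kps[None, :, :]   # (10, 10, 2) pairwise offsets
    dists = np.linalg.norm(diffs, axis=-1)      # (10, 10) pairwise distances
    iu = np.triu_indices(len(kps), k=1)         # upper triangle, no duplicates
    feats = dists[iu]
    return feats / (feats.max() + 1e-8)         # scale-normalize

# A frontal face is roughly left-right symmetric; a turned head is not,
# which is why these distances carry orientation information.
frontal = [(0, 0), (4, 0), (2, 1), (1, 2), (3, 2),
           (0, 3), (4, 3), (2, 3), (1, 4), (3, 4)]
feats = keypoint_features(frontal)
print(feats.shape)   # (45,) -- C(10, 2) pairwise distances
```

In a full pipeline, a vector like this (or the raw keypoints) would be fed to a trained network that outputs an attentive/inattentive decision.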
translated by Google Translate
In this paper, we present a motion-based robotic communication framework that enables non-verbal communication among autonomous underwater vehicles (AUVs) and human divers. We design a gesture language for AUV-to-AUV communication that, unlike typical radio-frequency, light-, or audio-based AUV communication, can also be easily understood by divers observing the conversation. To enable an AUV to visually understand a gesture from another AUV, we propose a deep network (RRCommNet) that exploits a self-attention mechanism to learn to recognize each message by extracting maximally discriminative spatio-temporal features. We train this network on diverse simulated and real-world data. Our experimental evaluations, both in simulation and in closed-water robot trials, demonstrate that the proposed RRCommNet architecture can decipher gesture-based messages with an average accuracy of 88-94% on simulated as well as real-world data, depending on the version of the model used. Furthermore, through a message-transcription study with human participants, we show that the proposed language can be understood by humans, with an overall transcription accuracy of 88%. Finally, we discuss the inference runtime of RRCommNet on embedded GPU hardware for real-time use on board AUVs in the field.
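The self-attention mechanism mentioned in the abstract can be sketched as single-head scaled dot-product attention over per-frame features of a gesture clip. This is a toy numpy sketch, not the RRCommNet architecture; the dimensions, weights, and pooling choice are all illustrative assumptions.

```python
import numpy as np

def self_attention(x, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention over a gesture clip.
    x: (T, d) per-frame features; Wq, Wk, Wv: (d, d) projection matrices."""
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(x.shape[1])        # (T, T) frame-to-frame relevance
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=-1, keepdims=True)      # softmax over frames
    return attn @ v                               # (T, d) attended features

rng = np.random.default_rng(0)
T, d = 16, 8                          # 16 frames, 8-dim features (toy sizes)
x = rng.normal(size=(T, d))           # stand-in for per-frame gesture features
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
attended = self_attention(x, Wq, Wk, Wv)
clip_embedding = attended.mean(axis=0)   # pool frames into one clip embedding
print(clip_embedding.shape)
```

A message classifier would then score this pooled embedding against the known gesture vocabulary; in the real network the projections are learned, not random.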
Diffractive deep neural networks (D2NNs) define an all-optical computing framework composed of spatially engineered passive surfaces that collectively process optical input information by modulating the amplitude and/or the phase of the propagating light. Diffractive optical networks complete their computational tasks at the speed of light propagation through a thin diffractive volume, without any external computing power, while exploiting the massive parallelism of optics. Diffractive networks have been demonstrated to achieve all-optical classification of objects and to perform universal linear transformations. Here we demonstrate, for the first time, a "time-lapse" image classification scheme using a diffractive network, significantly improving its classification accuracy and generalization performance on complex input objects by using the lateral movements of the input objects and/or the diffractive network relative to each other. In a different context, such relative movements of the objects and/or the camera are routinely used for image super-resolution applications; inspired by their success, we designed a time-lapse diffractive network to benefit from the complementary information content created by controlled or random lateral shifts. We numerically explored the design space and performance limits of time-lapse diffractive networks, revealing a blind testing accuracy of 62.03% on the optical classification of objects from the CIFAR-10 dataset. This constitutes the highest inference accuracy achieved so far using a single diffractive network on the CIFAR-10 dataset. Time-lapse diffractive networks will be broadly useful for the spatio-temporal analysis of input signals using all-optical processors.
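The core of the time-lapse idea is that a fixed passive model sees several laterally shifted "exposures" of the same object and its outputs are fused across them. The sketch below captures only that fusion logic with a toy random linear readout standing in for a trained diffractive network; the function name, shift set, and shapes are illustrative assumptions.

```python
import numpy as np

def time_lapse_scores(img, model, shifts):
    """Average a fixed (passive) model's class scores over laterally
    shifted copies of the input -- the time-lapse scheme in miniature.
    'model' here is a toy linear readout, not a real diffractive network."""
    scores = []
    for dx, dy in shifts:
        shifted = np.roll(img, shift=(dy, dx), axis=(0, 1))  # lateral shift
        scores.append(model @ shifted.ravel())               # one 'exposure'
    return np.mean(scores, axis=0)                           # fuse exposures

rng = np.random.default_rng(1)
img = rng.random((8, 8))                 # toy input object
model = rng.normal(size=(10, 64))        # toy 10-class linear readout
shifts = [(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1)]
fused = time_lapse_scores(img, model, shifts)
print(fused.shape)
```

In the optical setting the shifts are physical object/network displacements and the fused quantity is detected optical intensity, but the complementary-information principle is the same.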
Flexible task planning continues to pose a difficult challenge for robots: a robot cannot creatively adapt its task plans to new or unseen problems, largely because its knowledge of its actions and the world is limited. Motivated by the human ability to adapt, we explore how task plans obtained from a knowledge graph known as the functional object-oriented network (FOON) can be generated for novel problems that require concepts not readily available to the robot in its knowledge base. Knowledge from 140 cooking recipes is structured in a FOON knowledge graph, which is used to acquire task plan sequences called task trees. A task tree can be modified to replicate a recipe in the FOON knowledge-graph format, which is useful for enriching FOON with new recipes by relying on semantic similarity. We demonstrate the power of task tree generation by creating task trees for never-before-seen ingredient and state combinations drawn from recipes in the Recipe1M+ dataset, and we evaluate the quality of the trees based on how accurately they depict the newly added ingredients. Our experimental results show that our system is able to provide task sequences with 76% correctness.
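The task-tree notion can be illustrated with a tiny recursive retrieval over a recipe-like knowledge base: each goal object expands into the action and input objects that produce it, bottoming out at raw ingredients. The dictionary format and helper below are illustrative only, not the paper's actual FOON representation.

```python
# Minimal sketch of task-tree retrieval from a FOON-like knowledge base.
# Each entry maps an output object to the functional unit (action + inputs)
# that produces it; the tiny 'recipes' dictionary is hypothetical.
recipes = {
    "salad": {"action": "mix", "needs": ["chopped lettuce", "dressing"]},
    "chopped lettuce": {"action": "chop", "needs": ["lettuce"]},
    "dressing": {"action": "whisk", "needs": ["oil", "vinegar"]},
}

def task_tree(goal, kb):
    """Recursively expand a goal into (action, inputs) steps, stopping at
    raw ingredients (objects with no known functional unit)."""
    if goal not in kb:                   # raw ingredient: nothing left to plan
        return goal
    unit = kb[goal]
    return {"make": goal,
            "action": unit["action"],
            "from": [task_tree(n, kb) for n in unit["needs"]]}

tree = task_tree("salad", recipes)
print(tree["action"], [s if isinstance(s, str) else s["make"] for s in tree["from"]])
```

Adapting to an unseen recipe would then amount to substituting semantically similar objects into such a retrieved tree, rather than planning from scratch.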
A major component of developing intelligent and autonomous robots is a suitable knowledge representation from which a robot can acquire knowledge about its actions or the world. However, unlike humans, robots cannot creatively adapt to novel scenarios, as their knowledge and environments are rigidly defined. To address the problem of producing novel and flexible task plans, called task trees, we explore how plans can be derived with concepts that were not originally in the robot's knowledge base. Existing knowledge, in the form of a knowledge graph, is used as a base of reference to create task trees that are modified with new object or state combinations. To demonstrate the flexibility of our approach, we randomly selected recipes from the Recipe1M+ dataset and generated their task trees. The task trees were then thoroughly examined with a visualization tool that depicts how each ingredient is changed by each action to produce the desired meal. Our results show that the proposed method can produce task plans with high accuracy, even for ingredient combinations that never appeared before.
The latent space of autoencoders has been improved for clustering image data by jointly learning a t-distributed embedding with a clustering algorithm inspired by the neighborhood-embedding concept proposed for data visualization. However, multivariate tabular data pose different challenges in representation learning than image data, and traditional machine learning is often superior to deep learning on tabular data. In this paper, we address the challenges of learning from tabular data, in contrast to image data, and present a novel Gaussian Cluster Embedding in Autoencoder Latent Space (G-CEALS) algorithm that replaces t-distributions with multivariate Gaussian clusters. Unlike current methods, the proposed approach defines the Gaussian embedding and the target cluster distribution independently, so that any clustering algorithm can be accommodated in representation learning. A trained G-CEALS model extracts a quality embedding for unseen test data. In terms of embedding clustering accuracy, the proposed G-CEALS method achieves an average rank of 1.4 (0.7), superior to all eight baseline clustering and cluster embedding methods on seven tabular data sets. This paper presents one of the first algorithms to jointly learn an embedding and clustering to improve the representation of multivariate tabular data for downstream clustering.
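The Gaussian-cluster idea can be made concrete with soft cluster assignments of latent points under multivariate Gaussian densities, the Gaussian counterpart of the t-distributed kernel used by earlier embedding-clustering methods. This is a toy sketch of the assignment step only, not the G-CEALS training objective; all shapes and parameters are illustrative.

```python
import numpy as np

def gaussian_responsibilities(z, means, covs, weights):
    """Soft cluster assignments of latent points z under multivariate
    Gaussian clusters (toy sketch; not the exact G-CEALS objective).
    z: (n, d) latents; means: (k, d); covs: (k, d, d); weights: (k,)."""
    n, k = z.shape[0], means.shape[0]
    resp = np.empty((n, k))
    for j in range(k):
        diff = z - means[j]
        inv = np.linalg.inv(covs[j])
        maha = np.einsum("ni,ij,nj->n", diff, inv, diff)   # squared Mahalanobis
        norm = np.sqrt(np.linalg.det(2 * np.pi * covs[j])) # Gaussian normalizer
        resp[:, j] = weights[j] * np.exp(-0.5 * maha) / norm
    return resp / resp.sum(axis=1, keepdims=True)          # normalize per point

rng = np.random.default_rng(2)
z = np.vstack([rng.normal(0, 0.3, (5, 2)), rng.normal(3, 0.3, (5, 2))])
means = np.array([[0.0, 0.0], [3.0, 3.0]])
covs = np.stack([np.eye(2), np.eye(2)])
r = gaussian_responsibilities(z, means, covs, np.array([0.5, 0.5]))
print(np.round(r.sum(axis=1), 6))   # each row sums to 1
```

In a joint-training setup, responsibilities like these would define the target cluster distribution that the autoencoder's latent space is pulled toward.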
Deep learning methods in the literature are invariably benchmarked on image data sets and then assumed to work on all data problems. Unfortunately, architectures designed for image learning are often not ready or optimal for non-image data without considering data-specific learning requirements. In this paper, we take a data-centric view to argue that deep image embedding clustering methods are not equally effective on heterogeneous tabular data sets. This paper performs one of the first studies on deep embedding clustering of seven tabular data sets using six state-of-the-art baseline methods proposed for image data sets. Our results reveal that the traditional clustering of tabular data ranks second out of eight methods and is superior to most deep embedding clustering baselines. Our observation is in line with the recent literature that traditional machine learning of tabular data is still a competitive approach against deep learning. Although surprising to many deep learning researchers, traditional clustering methods can be competitive baselines for tabular data, and outperforming these baselines remains a challenge for deep embedding clustering. Therefore, deep learning methods for image learning may not be fair or suitable baselines for tabular data without considering data-specific contrasts and learning requirements.
User-specific prediction of future activities in the healthcare domain, based on previous activities, can drastically improve the services provided by nurses. It is challenging because, unlike in other domains, activities in healthcare involve both nurses and patients, and they also vary from hour to hour. In this paper, we employ various data processing techniques to organize and modify the data structure, together with an LSTM-based multi-label classifier, in a novel 2-stage training approach (user-agnostic pre-training followed by user-specific fine-tuning). Our experiment achieves a validation accuracy of 31.58%, precision of 57.94%, recall of 68.31%, and F1 score of 60.38%. We conclude that proper data pre-processing and the 2-stage training process result in better performance. This experiment is part of the "Fourth Nurse Care Activity Recognition Challenge" by our team "Not A Fan of Local Minima".
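The 2-stage scheme (pool everyone's data to pre-train, then fine-tune on one user) can be sketched with a toy logistic-regression model standing in for the LSTM classifier. Everything here, the model, data, and sizes, is an illustrative assumption; only the two-stage structure mirrors the abstract.

```python
import numpy as np

def train(w, X, y, lr=0.1, steps=200):
    """Gradient-descent logistic regression -- a toy stand-in for the
    LSTM classifier in the abstract."""
    for _ in range(steps):
        p = 1 / (1 + np.exp(-(X @ w)))       # sigmoid predictions
        w -= lr * X.T @ (p - y) / len(y)     # gradient of BCE loss
    return w

rng = np.random.default_rng(3)
# Stage 1: user-agnostic pre-training on data pooled across all users.
X_all = rng.normal(size=(200, 5)); y_all = (X_all[:, 0] > 0).astype(float)
w = train(np.zeros(5), X_all, y_all)
# Stage 2: user-specific fine-tuning starts from the pre-trained weights
# instead of a random initialization.
X_user = rng.normal(size=(30, 5)); y_user = (X_user[:, 0] > 0).astype(float)
w = train(w.copy(), X_user, y_user, steps=50)
acc = np.mean(((X_user @ w) > 0) == y_user)
print(acc)
```

The point of starting stage 2 from the stage-1 weights is that the small per-user dataset only needs to adjust an already useful model, not learn from scratch.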
Skeleton-based Motion Capture (MoCap) systems have long been used in the game and film industries to mimic complex human actions. MoCap data has also proved effective in human activity recognition tasks. However, activity recognition remains quite challenging on smaller datasets, and the lack of such data for industrial activities adds to the difficulty. In this work, we propose an ensemble-based machine learning methodology targeted at working well on MoCap datasets. The experiments were performed on the MoCap data provided in the Bento Packaging Activity Recognition Challenge 2021 (bento is the Japanese word for a lunch box). After processing the raw MoCap data, the proposed ensemble model achieves an accuracy of 98% on 10-fold cross-validation and 82% on leave-one-out cross-validation.
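The simplest fusion rule an ensemble like this could use is majority voting over per-model class predictions. The abstract does not specify the base models or fusion rule, so the sketch below is purely illustrative.

```python
import numpy as np

def majority_vote(predictions):
    """Fuse per-model class predictions by majority vote -- the simplest
    form of ensembling (illustrative; the challenge entry's actual base
    models and fusion rule are not specified in the abstract)."""
    predictions = np.asarray(predictions)          # (n_models, n_samples)
    n_classes = predictions.max() + 1
    votes = np.array([np.bincount(col, minlength=n_classes)
                      for col in predictions.T])   # (n_samples, n_classes)
    return votes.argmax(axis=1)                    # winning class per sample

# Three toy classifiers disagreeing on a few samples:
preds = [[0, 1, 2, 1],
         [0, 1, 1, 1],
         [0, 2, 2, 1]]
print(majority_vote(preds))   # -> [0 1 2 1]
```

Ensembling tends to help on small MoCap datasets precisely because individual models overfit in different ways, and voting cancels some of those errors out.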
Uncertainty quantification (UQ) is increasingly important for building robust, high-performance, and generalizable materials property prediction models. It can also be used in active learning to train better models by focusing on acquiring new training data from uncertain regions. There are several categories of UQ methods, each considering different types of uncertainty sources. Here we conduct a comprehensive evaluation of UQ methods for graph neural network based materials property prediction and assess how well they reflect the uncertainty we want for error-bound estimation or active learning. Our experimental results over four crystal materials datasets (covering formation energy, adsorption energy, total energy, and band gap properties) show that the popular ensemble methods for uncertainty estimation are NOT the best choice for UQ in materials property prediction. For the convenience of the community, all the source code and data sets can be accessed freely at \url{https://github.com/usccolumbia/materialsUQ}.
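The ensemble UQ baseline the abstract evaluates boils down to: train several models, report the mean prediction as the estimate and the spread across members as the uncertainty. The sketch below shows that computation with toy linear "models"; a real setting would use trained graph neural networks, and all names and sizes here are illustrative.

```python
import numpy as np

def ensemble_uncertainty(models, X):
    """Mean prediction and per-sample standard deviation across an
    ensemble -- the common ensemble UQ baseline (toy linear 'models'
    stand in for trained graph neural networks)."""
    preds = np.stack([X @ w for w in models])     # (n_models, n_samples)
    return preds.mean(axis=0), preds.std(axis=0)  # estimate, uncertainty

rng = np.random.default_rng(4)
models = [rng.normal(1.0, 0.05, size=3) for _ in range(10)]  # similar members
X = rng.normal(size=(5, 3))
mean, std = ensemble_uncertainty(models, X)
print(mean.shape, std.shape)
```

In active learning, samples with the largest `std` would be prioritized for labeling; the paper's finding is that this spread does not always track the true error well.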