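Robotic arms are widely used in the automation industry. However, as deep learning is applied ever more widely to robotic arms, new challenges arise, such as the allocation of computing power for grasping and a growing demand for security. In this work, we propose a robotic arm grasping method based on deep learning and edge-cloud collaboration. The method realizes arbitrary grasp planning for the robotic arm while taking both grasp efficiency and information security into account. In addition, an encoder and decoder trained with a GAN allow images to be encrypted while they are compressed, which protects privacy. The model achieves 92% accuracy on the OCID dataset, with an image compression ratio of 0.03% and a structural difference value above 0.91.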
translated by 谷歌翻译
This paper proposes a novel kernel-based optimization scheme for tasks in the analysis of 1D non-stationary oscillatory data, e.g., signal spectral estimation and single-channel source separation. The key insight of our scheme for reconstructing the time-frequency information is that, when a nonparametric regression is applied to some input values, the regressed output points lie near the oscillatory pattern of the 1D signal only if these input values are a good approximation of the ground-truth phase function. In this work, a Gaussian Process (GP) is chosen to perform this nonparametric regression: the oscillatory pattern is encoded as Pattern-inducing Points (PiPs), which act as the training data points in the GP regression, while the targeted phase function is fed in as the testing input to compute the correlation kernels. A better-approximated phase function generates more precise kernels and therefore yields a smaller optimization loss when the kernel-based regression output is compared with the original signal. To the best of our knowledge, this is the first algorithm that can satisfactorily handle fully non-stationary oscillatory data, close and crossover frequencies, and general oscillatory patterns. Even for a signal produced by slow variation in the parameters of a trigonometric expansion, we show that PiPs achieves competitive or better accuracy and robustness than existing state-of-the-art algorithms.
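The key insight above can be illustrated with a small numerical sketch. It is not the paper's algorithm: the periodic kernel, the two-harmonic pattern, the chirp-like phase, and all function names are assumptions chosen only for illustration, and no phase optimization is performed. The sketch only checks that a GP regression trained on pattern points reproduces the signal when evaluated at the true phase and gives a larger loss at a wrong phase.

```python
import numpy as np

def periodic_kernel(a, b, length_scale=1.0):
    """2*pi-periodic kernel between two 1D arrays of phases (assumed choice)."""
    d = a[:, None] - b[None, :]
    return np.exp(-2.0 * np.sin(d / 2.0) ** 2 / length_scale ** 2)

def gp_mean(theta_train, y_train, theta_test, noise=1e-6):
    """GP posterior mean at theta_test given training pairs (theta_train, y_train)."""
    K = periodic_kernel(theta_train, theta_train) + noise * np.eye(len(theta_train))
    alpha = np.linalg.solve(K, y_train)
    return periodic_kernel(theta_test, theta_train) @ alpha

def pattern(theta):
    """Hypothetical oscillatory pattern over one period."""
    return np.cos(theta) + 0.5 * np.cos(2.0 * theta)

# Training data: the pattern sampled at 16 points on [0, 2*pi),
# playing the role of the pattern-inducing points.
theta_pip = np.linspace(0.0, 2.0 * np.pi, 16, endpoint=False)
y_pip = pattern(theta_pip)

# A non-stationary signal: the pattern evaluated along a chirp-like phase.
t = np.linspace(0.0, 1.0, 400)
phase_true = 2.0 * np.pi * (3.0 * t + t ** 2)
signal = pattern(phase_true)

# A candidate phase that ignores the frequency drift.
phase_wrong = 2.0 * np.pi * 3.0 * t

def loss(phase):
    """Mean squared error between the regression output and the signal."""
    return np.mean((gp_mean(theta_pip, y_pip, phase) - signal) ** 2)

loss_true = loss(phase_true)
loss_wrong = loss(phase_wrong)
print(loss_true, loss_wrong)
```

In an actual optimization scheme, the candidate phase would be updated to decrease this loss; here the two losses are only compared.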
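Federated learning (FL) is a distributed machine learning framework that alleviates data silos: decentralized clients collaboratively learn a global model without sharing their private data. However, the clients' non-independent and identically distributed (Non-IID) data degrade the trained model, and clients that perform different numbers of local updates may cause large gaps between local gradients in each communication round. In this paper, we propose a Federated Vectorized Averaging (FedVeca) method to address this Non-IID data problem. Specifically, we set a novel objective for the global model that is related to the local gradients. A local gradient is defined as a bi-directional vector with a step size and a direction, where the step size is the number of local updates and the direction is classified as positive or negative according to our definition. In FedVeca, the direction is influenced by the step size, so we average the bi-directional vectors to reduce the effect of different step sizes. We then theoretically analyze the relationship between the step sizes and the global objective, and obtain upper bounds on the step sizes in each communication round. Based on these upper bounds, we design an algorithm for the server and the clients that adaptively adjusts the step sizes so that the objective approaches the optimum. Finally, we build a prototype system and conduct experiments on different datasets, models, and scenarios; the experimental results demonstrate the effectiveness and efficiency of the FedVeca method.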
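Class incremental learning (CIL) has attracted much attention, but most existing work focuses on fine-tuning the entire representation model, which inevitably causes substantial catastrophic forgetting. In contrast, with a semantically rich pre-trained representation model, parameter-additional-tuning (PAT) changes only a few parameters to learn new visual concepts. Recent studies have shown that PAT-based CIL can naturally avoid fighting forgetting through replay or distillation, as most existing methods do. However, we find that PAT-based CIL still suffers from serious semantic drift, a problem caused by classifier learning bias across different learning phases, which significantly reduces its performance. To address this problem, we propose Incremental Prototype Tuning (IPT), a simple but effective method that tunes category prototypes for classification and learns example prototypes to compensate for semantic drift. Extensive experiments show that our method effectively compensates for semantic drift. Combined with well-pre-trained ViT backbones and other PAT methods, IPT surpasses state-of-the-art baselines on mainstream incremental learning benchmarks.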
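Energy harvesting (EH) IoT devices that operate intermittently, combined with advances in deep neural networks (DNNs), open up new opportunities for sustainable smart applications. However, implementing these computation- and memory-intensive intelligent algorithms on EH devices is extremely difficult, because resources are limited and the intermittent power supply causes frequent failures. To address these challenges, this paper proposes a methodology that enables super-fast deep learning with low-energy accelerators on tiny energy harvesting devices. We first propose RAD, a resource-aware structured DNN training framework, which employs block circulant matrices with ADMM to achieve high compression and model quantization, so as to exploit the advantages of various vector operation accelerators. We then propose ACE, a DNN implementation method that uses low-energy accelerators to obtain maximum performance with little energy consumption. Finally, we design FLEX, the system support for intermittent computation in energy harvesting situations. Experimental results on three different DNN models show that RAD, ACE, and FLEX enable super-fast and correct inference on energy harvesting devices, with up to 4.26x runtime reduction and up to 7.7x energy reduction, along with higher accuracy than the state of the art.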
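Why block circulant matrices give both compression and fast vector-style arithmetic can be seen in a short sketch. This is a generic illustration of the standard circulant property, not the paper's training framework or accelerator code; the function names and the block layout are assumptions. Each b-by-b circulant block is stored as its first column only (b numbers instead of b^2), and a block-vector product reduces to element-wise multiplication in the Fourier domain.

```python
import numpy as np

def circulant_matvec(c, x):
    """Multiply the circulant matrix whose first column is c by x via FFTs."""
    return np.real(np.fft.ifft(np.fft.fft(c) * np.fft.fft(x)))

def block_circulant_matvec(blocks, x):
    """Multiply a block-circulant matrix by x.

    blocks[i][j] holds the first column (length b) of the circulant block
    in block-row i and block-column j; x has length b * (number of block-columns).
    """
    b = len(blocks[0][0])
    out = []
    for row in blocks:
        acc = np.zeros(b)
        for j, c in enumerate(row):
            acc += circulant_matvec(np.asarray(c, dtype=float), x[j * b:(j + 1) * b])
        out.append(acc)
    return np.concatenate(out)

def dense_from_blocks(blocks):
    """Expand the compact representation into a dense matrix (for checking only)."""
    b = len(blocks[0][0])
    rows = []
    for row in blocks:
        # Column k of a circulant block is its first column rolled down by k.
        rows.append(np.hstack(
            [np.column_stack([np.roll(c, k) for k in range(b)]) for c in row]
        ))
    return np.vstack(rows)
```

For a weight matrix split into 2 by 3 blocks of size 4, the compact form stores 24 values where the dense matrix stores 96, and the two products agree up to floating-point error.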
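Since it was proposed, federated learning (FL) has been applied in many fields, such as credit assessment and medical care. Because clients differ in network and computing resources, they may not update their gradients at the same time, and much time may be spent waiting or idling. This is why asynchronous federated learning (AFL) methods are needed. The main bottleneck in AFL is communication, and finding a balance between model performance and communication cost is a key challenge. This paper proposes VAFL, a new AFL framework. We verify the performance of the algorithm through extensive experiments, which show that VAFL can reduce communication times by about 51.02% with an average communication compression rate of 48.23%, and allows the model to converge faster. The code is available at https://github.com/robai-lab/vafl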
Few Shot Instance Segmentation (FSIS) requires models to detect and segment novel classes given only a few support examples. In this work, we explore a simple yet unified solution for FSIS as well as its incremental variants, and introduce a new framework named Reference Twice (RefT) to fully explore the relationship between support and query features based on a Transformer-like framework. Our key insights are twofold. First, with the aid of support masks, we can generate dynamic class centers more appropriately to re-weight query features. Second, we find that support object queries have already encoded key factors after base training. In this way, the query features can be enhanced twice, at the feature level and at the instance level. In particular, we first design a mask-based dynamic weighting module to enhance support features and then propose to link object queries for better calibration via cross-attention. After these steps, performance on the novel classes improves significantly over our strong baseline. Additionally, our framework can be easily extended to incremental FSIS with minor modification. When benchmarked on the COCO dataset in the FSIS, gFSIS, and iFSIS settings, our method achieves competitive performance compared to existing approaches across different shots, e.g., we boost nAP by +8.2/+9.4 over the current state-of-the-art FSIS method for 10/30-shot. We further demonstrate the superiority of our approach on Few Shot Object Detection. Code and model will be available.
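The idea of using support masks to form class centers that re-weight query features can be sketched with masked average pooling followed by channel-wise scaling. This is a common generic construction and an assumption here, not the RefT module itself; the function names, tensor shapes, and the multiplicative re-weighting are all illustrative.

```python
import numpy as np

def masked_average_pool(support_feat, support_mask, eps=1e-6):
    """Average support features over the masked region.

    support_feat: array of shape (C, H, W).
    support_mask: array of shape (H, W) with values in [0, 1].
    Returns a class center of shape (C,).
    """
    weighted = support_feat * support_mask[None, :, :]
    return weighted.sum(axis=(1, 2)) / (support_mask.sum() + eps)

def reweight_query(query_feat, class_center):
    """Scale each channel of the query features (C, H, W) by the class center (C,)."""
    return query_feat * class_center[:, None, None]

# Tiny example: 2 channels, a 2x2 support map, a mask covering the diagonal.
support_feat = np.arange(8, dtype=float).reshape(2, 2, 2)
support_mask = np.array([[1.0, 0.0], [0.0, 1.0]])
center = masked_average_pool(support_feat, support_mask)
query_out = reweight_query(np.ones((2, 3, 3)), center)
```

Because the mask restricts pooling to object pixels, background features in the support image do not contaminate the class center.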
In this chapter, we review and discuss the transformation of AI technology in HCI/UX work and assess how AI technology will change how we do the work. We first discuss how AI can be used to enhance the result of user research and design evaluation. We then discuss how AI technology can be used to enhance HCI/UX design. Finally, we discuss how AI-enabled capabilities can improve UX when users interact with computing systems, applications, and services.
As one of the most important psychic stress reactions, micro-expressions (MEs) are spontaneous and transient facial expressions that can reveal the genuine emotions of human beings. Automatic ME recognition (MER) is therefore becoming increasingly crucial in the field of affective computing, and provides essential technical support for lie detection, psychological analysis, and other areas. However, the lack of abundant ME data seriously restricts the development of cutting-edge data-driven MER models. Although several spontaneous ME datasets have recently been built to alleviate this problem, the amount of available data is still small. To address this ME data hunger, we construct a dynamic spontaneous ME dataset with the largest current ME data scale, called DFME (Dynamic Facial Micro-expressions), which includes 7,526 well-labeled ME videos elicited from 671 participants and annotated by more than 20 annotators over three years. We then apply four classical spatiotemporal feature learning models to DFME in MER experiments to objectively verify the validity of the dataset. In addition, we explore different solutions to the class imbalance and key-frame sequence sampling problems in dynamic MER on DFME, so as to provide a valuable reference for future research. The comprehensive experimental results show that DFME can facilitate research on automatic MER and provides a new benchmark for it. DFME will be published via https://mea-lab-421.github.io.
Face Anti-spoofing (FAS) is essential to secure face recognition systems against various physical attacks. However, recent research generally focuses on short-distance applications (e.g., phone unlocking) and gives little consideration to long-distance scenes (e.g., surveillance security checks). To promote relevant research and fill this gap in the community, we collect a large-scale Surveillance High-Fidelity Mask (SuHiFiMask) dataset captured under 40 surveillance scenes, which has 101 subjects from different age groups with 232 3D attacks (high-fidelity masks), 200 2D attacks (posters, portraits, and screens), and 2 adversarial attacks. In these scenes, low image resolution and noise interference are new challenges for surveillance FAS. Together with the SuHiFiMask dataset, we propose a Contrastive Quality-Invariance Learning (CQIL) network to alleviate the performance degradation caused by image quality in three ways: (1) an Image Quality Variable (IQV) module is introduced to recover discriminative image information by incorporating a super-resolution network; (2) generated sample pairs are used to simulate quality variance distributions, helping contrastive learning strategies obtain robust feature representations under quality variation; (3) a Separate Quality Network (SQN) is designed to learn discriminative features independent of image quality. Finally, extensive experiments verify the quality of the SuHiFiMask dataset and the superiority of the proposed CQIL.