Robots are expected to grasp a wide range of objects that vary in shape, weight, and material type. Providing robots with human-like tactile capabilities is therefore essential for applications involving human-robot or robot-robot interaction, particularly where robots are expected to grasp and manipulate complex objects not previously encountered. A key aspect of successful object grasping and manipulation is the use of high-quality fingertips equipped with multiple high-performance sensors, appropriately distributed over a specific contact surface. In this paper, we present a detailed analysis of the use of two types of commercially available robotic fingertips (BioTac and WTS-FT), each equipped with multiple sensors distributed over the fingertip's contact surface. We further demonstrate that, owing to the high performance of these fingertips, a sophisticated adaptive grasping algorithm is not required for grasping everyday objects. We conclude that a simple algorithm based on a proportional controller suffices, provided the fingertips involved exhibit high sensitivity. In a quantified evaluation, we also show that, due in part to its sensor distribution, the BioTac-based fingertip outperforms the WTS-FT device, enabling loads of up to 850 g to be lifted, and that the simple proportional controller can maintain the grasp even when the object is subjected to significant external vibration.
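The grasping strategy described above reduces to a proportional control law on the sensed contact pressure. A minimal sketch, assuming a hypothetical fingertip interface (the function names, gain, and setpoint are illustrative, not from the paper):

```python
def proportional_grip_step(target_pressure, measured_pressure, kp=0.5):
    """Return a grip-force correction proportional to the pressure error.

    kp is the proportional gain; in practice it would be tuned per
    fingertip and the correction applied at the controller's sampling rate.
    """
    error = target_pressure - measured_pressure
    return kp * error

# Example: the sensor reads below the setpoint, so the controller
# commands additional force; a slipping object shows up as a pressure
# drop and is countered the same way.
correction = proportional_grip_step(target_pressure=1.0, measured_pressure=0.6)
```

The appeal of this scheme, as the abstract argues, is that with sufficiently sensitive fingertips no adaptive or model-based layer is needed on top of this single feedback loop.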
Modern life is driven by electronic devices connected to the Internet. The emerging research field of the Internet of Things (IoT) has become popular as the number of connected devices has steadily increased, now exceeding 50 billion. Since many of these devices are used to perform computer vision (CV) tasks, it is essential to understand their power consumption relative to their performance. We report a power-consumption profile and analysis of the NVIDIA Jetson Nano board while performing object classification, with an extensive analysis of per-frame power consumption and frames-per-second (FPS) output using YOLOv5 models. The results show that YOLOv5n outperforms the other YOLOv5 variants in both throughput (12.34 FPS) and low power consumption (0.154 mWh/frame).
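The two reported figures are linked by a simple relation: energy per frame is average power divided by throughput. A small consistency check, assuming exactly that relation (the 12.34 FPS and 0.154 mWh/frame values come from the abstract; the implied average power is our back-calculation, not a reported number):

```python
def energy_per_frame_mwh(avg_power_mw, fps):
    """Energy per frame (mWh) from average power (mW) and throughput (FPS)."""
    frames_per_hour = fps * 3600.0
    return avg_power_mw / frames_per_hour

def implied_power_mw(mwh_per_frame, fps):
    """Invert the relation: average power implied by an efficiency figure."""
    return mwh_per_frame * fps * 3600.0

# 0.154 mWh/frame at 12.34 FPS implies roughly 6.8e3 mW (~6.8 W) average
# draw, a plausible operating point for a Jetson Nano under inference load.
power = implied_power_mw(0.154, 12.34)
```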
Because universal nonverbal methods of natural communication enable effective communication between humans, hand-gesture recognition technology has advanced steadily over the past few decades. Many different strategies have been proposed in the gesture-recognition literature in attempts to create effective systems that convey nonverbal natural communication information to computers using physical sensors and computer vision. Highly accurate real-time systems, on the other hand, have only recently begun to dominate the field, each adopting a range of methods to overcome past limitations such as availability, cost, speed, and accuracy. We propose a computer-vision-based human-computer interaction tool for hand-gesture recognition that acts as a natural user interface. Virtual glove markers are created on the user's hands and used as input to a deep learning model that recognizes gestures in real time. The results obtained show that the proposed system would be effective in real-time applications, including social interaction through telepresence and rehabilitation.
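The "virtual glove marker" input can be pictured as a set of tracked landmark positions on the hand, normalized before being fed to the classifier. A sketch under stated assumptions: the 21-landmark layout follows common hand-tracking conventions, and the wrist-centered unit-scale normalization is illustrative, not the paper's exact pipeline:

```python
import numpy as np

def landmark_features(landmarks):
    """Normalize (x, y) hand landmarks: center on the wrist, scale to unit size."""
    pts = np.asarray(landmarks, dtype=float)      # (21, 2) marker positions
    pts = pts - pts[0]                            # put the wrist at the origin
    scale = np.linalg.norm(pts, axis=1).max()
    if scale > 0:
        pts = pts / scale                         # farthest marker at distance 1
    return pts.ravel()                            # 42-dim feature vector

# Stand-in for one tracked frame of markers.
fake_hand = np.random.default_rng(0).uniform(size=(21, 2))
feats = landmark_features(fake_hand)
```

Normalizing this way makes the downstream model invariant to where the hand sits in the frame and how close it is to the camera.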
The use of artificial intelligence in the agricultural sector is growing rapidly to automate farming activities. Emerging agricultural technologies focus on mapping and classifying plants, fruits, diseases, and soil types. Although deep-learning-based assisted harvesting and pruning applications are in early development, solutions for such processes are still needed. This paper proposes using deep learning to classify the trusses and runners of strawberry plants via semantic segmentation and dataset augmentation. The proposed approach is based on artificially augmenting the dataset with noise (Gaussian, speckle, Poisson, and salt-and-pepper) to compensate for the small number of data samples and increase overall classification performance. The results are evaluated using mean precision, recall, and F1 score. The proposed method achieved 91%, 95%, and 92% precision, recall, and F1 score, respectively, for truss detection using ResNet101 with salt-and-pepper dataset augmentation, and 83%, 53%, and 65% precision, recall, and F1 score, respectively, for runner detection using ResNet50 with Poisson noise.
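Of the four noise types used for augmentation, salt-and-pepper is the simplest to illustrate: a random fraction of pixels is driven to the extremes of the intensity range. A minimal sketch (the noise ratio and seeding are illustrative, not the paper's settings):

```python
import numpy as np

def salt_and_pepper(image, amount=0.05, seed=None):
    """Corrupt a random fraction of pixels: half set to 0 (pepper), half to 255 (salt)."""
    rng = np.random.default_rng(seed)
    noisy = image.copy()
    mask = rng.random(image.shape[:2])
    noisy[mask < amount / 2] = 0          # pepper
    noisy[mask > 1 - amount / 2] = 255    # salt
    return noisy

# A flat gray image makes the corrupted pixels easy to spot.
img = np.full((64, 64), 128, dtype=np.uint8)
noisy = salt_and_pepper(img, amount=0.1, seed=0)
```

Each corrupted copy keeps the original segmentation labels, so a small labeled set can be multiplied several-fold at no annotation cost, which is the point of the augmentation.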
The vertebrate retina is highly efficient at visual tasks that are trivial for it, such as detecting moving objects, yet remain complex for modern computers. Object-motion detection is performed by specialized retinal ganglion cells called object-motion-sensitive ganglion cells (OMS-GC). OMS-GCs process continuous signals and generate spike patterns that are post-processed by the visual cortex. The Neuromorphic Hybrid Spiking Motion Detector (NeuroHSMD) proposed in this work accelerates the HSMD algorithm using a field-programmable gate array (FPGA). The Hybrid Spiking Motion Detector (HSMD) algorithm enhances a dynamic background subtraction (DBS) algorithm with a customized three-layer spiking neural network (SNN) that generates OMS-GC-like spiking responses. The NeuroHSMD was compared against the HSMD algorithm using the same 2012 Change Detection (CDnet2012) and 2014 Change Detection (CDnet2014) benchmark datasets. The results show that the NeuroHSMD produces the same results as the HSMD algorithm in real time, with no loss of quality. Moreover, because the NeuroHSMD proposed in this paper is fully implemented in Open Computing Language (OpenCL), it is easily replicated on other devices, such as graphics processing units (GPUs) and clusters of central processing units (CPUs).
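The DBS stage can be pictured with the simplest kind of dynamic background model: a running average that adapts to the scene, with motion flagged wherever the current frame deviates from it. This is only a sketch of the idea; the learning rate and threshold are illustrative, and the enhanced DBS and SNN stages of the actual algorithm are omitted:

```python
import numpy as np

def update_background(bg, frame, alpha=0.05):
    """Exponential moving-average background model: bg adapts slowly to the scene."""
    return (1.0 - alpha) * bg + alpha * frame

def motion_mask(bg, frame, thresh=20.0):
    """Binary foreground mask: pixels that deviate strongly from the background."""
    return (np.abs(frame - bg) > thresh).astype(np.uint8)

# Toy example: one bright "moving object" pixel against a dark, static scene.
bg = np.zeros((4, 4))
frame = np.zeros((4, 4))
frame[1, 1] = 255.0
mask = motion_mask(bg, frame)
bg = update_background(bg, frame)
```

In the full pipeline, a mask like this would feed the spiking layers that mimic the OMS-GC response; the FPGA port parallelizes these per-pixel operations.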
In recent years, the number of deployed IoT devices has exploded, reaching the scale of billions. However, this growth has been accompanied by new cybersecurity issues, such as the deployment of unauthorized devices, malicious code modification, malware deployment, and vulnerability exploitation. This fact has motivated the need for new device identification mechanisms based on behavior monitoring. Moreover, these solutions have recently leveraged Machine and Deep Learning (ML/DL) techniques, owing to advances in the field and increased processing capabilities. Meanwhile, attackers have not stood still: they have developed adversarial attacks against IoT device identification solutions, focused on context modification and ML/DL evasion. This work explores the performance of hardware behavior-based individual device identification, how it is affected by possible context- and ML/DL-focused attacks, and how its resilience can be improved using defense techniques. Specifically, it proposes an LSTM-CNN architecture based on hardware performance behavior for individual device identification. The proposed architecture is then compared with previous techniques using a hardware performance dataset collected from 45 Raspberry Pi devices running identical software. The LSTM-CNN improves on previous solutions, achieving an average F1-score above 0.96 and a minimum TPR of 0.8 across all devices. Afterward, context- and ML/DL-focused adversarial attacks were applied against the model to test its robustness. A temperature-based context attack was unable to disrupt identification; however, some state-of-the-art ML/DL evasion attacks were successful. Finally, adversarial training and model distillation defense techniques were selected to improve the model's resilience to evasion attacks without degrading its performance.
Cybercriminals are moving towards zero-day attacks affecting resource-constrained devices such as single-board computers (SBCs). Assuming that perfect security is unrealistic, Moving Target Defense (MTD) is a promising approach that mitigates attacks by dynamically altering target attack surfaces. Still, selecting suitable MTD techniques for zero-day attacks is an open challenge. Reinforcement Learning (RL) could be an effective approach to optimize MTD selection through trial and error, but the literature falls short in i) evaluating the performance of RL and MTD solutions in real-world scenarios, ii) studying whether behavioral fingerprinting is suitable for representing SBCs' states, and iii) calculating resource consumption on SBCs. To address these limitations, this work proposes an online RL-based framework that learns the correct MTD mechanisms to mitigate heterogeneous zero-day attacks on SBCs. The framework uses behavioral fingerprinting to represent SBCs' states and RL to learn the MTD techniques that mitigate each malicious state. It has been deployed in a real IoT crowdsensing scenario with a Raspberry Pi acting as a spectrum sensor. In more detail, the Raspberry Pi was infected with different samples of command-and-control malware, rootkits, and ransomware, with the framework then selecting between four existing MTD techniques. A set of experiments demonstrated the suitability of the framework for learning proper MTD techniques that mitigate all attacks (except one rootkit sample) while consuming <1 MB of storage and utilizing <55% CPU and <80% RAM.
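The trial-and-error core of such a framework can be illustrated with a toy tabular bandit-style Q-learning loop over four MTD actions. Everything here is invented for illustration: the state and action names, the reward signal, and the `EFFECTIVE` mapping that stands in for the environment; the real framework derives states from behavioral fingerprints and rewards from whether the device returns to a benign state:

```python
import random

STATES = ["c2", "rootkit", "ransomware"]
ACTIONS = ["mtd_ip_shuffle", "mtd_lib_rotate", "mtd_file_perm", "mtd_kill_restart"]
# Hypothetical "correct" technique per malicious state, used only to
# synthesize a reward signal for this demo.
EFFECTIVE = {"c2": "mtd_ip_shuffle", "rootkit": "mtd_lib_rotate",
             "ransomware": "mtd_file_perm"}

def train(episodes=2000, alpha=0.5, epsilon=0.2, seed=0):
    """Epsilon-greedy Q-learning over (state, MTD-action) pairs."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    for _ in range(episodes):
        state = rng.choice(STATES)                       # observed malicious state
        if rng.random() < epsilon:                       # explore
            action = rng.choice(ACTIONS)
        else:                                            # exploit current estimate
            action = max(ACTIONS, key=lambda a: q[(state, a)])
        reward = 1.0 if EFFECTIVE[state] == action else -1.0
        q[(state, action)] += alpha * (reward - q[(state, action)])
    return q

q = train()
policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in STATES}
```

After training, `policy` maps each malicious state to the technique that earned positive reward, mirroring how the online framework converges on the mitigating MTD per attack.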
Stress has an effect on people's lives that cannot be overstated. While it can be beneficial, since it helps humans adapt to new and different situations, it can also be harmful when not dealt with properly, leading to chronic stress. The objective of this paper is to develop a stress-monitoring solution that can be used in real life while tackling this challenge in a positive way. The SMILE dataset was provided to team Anxolotl, and all that was needed was to develop a robust model. We developed a supervised learning classification model in Python, achieving a final accuracy of 64.1% and an F1-score of 54.96%. The resulting solution withstood the robustness test, presenting low variation between runs, which was a major point in favor of its possible future integration into the Anxolotl app.
The task of reconstructing 3D human motion has wide-ranging applications. The gold-standard motion capture (MoCap) systems are accurate but inaccessible to the general public due to their cost, hardware, and space constraints. In contrast, monocular human mesh recovery (HMR) methods are much more accessible than MoCap, as they take single-view videos as inputs. Replacing multi-view MoCap systems with a monocular HMR method would break the current barriers to collecting accurate 3D motion, making exciting applications like motion analysis and motion-driven animation accessible to the general public. However, the performance of existing HMR methods degrades when the video contains challenging and dynamic motion that is not present in the MoCap datasets used for training. This reduces their appeal, as dynamic motion is frequently the target of 3D motion recovery in the aforementioned applications. Our study aims to bridge the gap between monocular HMR and multi-view MoCap systems by leveraging information shared across multiple video instances of the same action. We introduce the Neural Motion (NeMo) field, which is optimized to represent the underlying 3D motion across a set of videos of the same action. Empirically, we show that NeMo can recover 3D motion in sports using videos from the Penn Action dataset, where NeMo outperforms existing HMR methods in terms of 2D keypoint detection. To further validate NeMo using 3D metrics, we collected a small MoCap dataset mimicking actions in Penn Action, and show that NeMo achieves better 3D reconstruction compared to various baselines.
The field of Automatic Music Generation has seen significant progress thanks to the advent of Deep Learning. However, most of these results have been produced by unconditional models, which lack the ability to interact with their users and do not allow them to guide the generative process in meaningful and practical ways. Moreover, synthesizing music that remains coherent across longer timescales, while still capturing the local aspects that make it sound ``realistic'' or ``human-like'', is still challenging. This is due to the large computational requirements of working with long sequences of data, and also to limitations imposed by the training schemes that are often employed. In this paper, we propose a generative model of symbolic music conditioned on human sentiment data. The model is a Transformer-GAN trained with labels that correspond to different configurations of the valence and arousal dimensions, which quantitatively represent human affective states. We tackle both of the problems above by employing an efficient linear version of Attention and by using a Discriminator both as a tool to improve the overall quality of the generated music and to improve its ability to follow the conditioning signals.
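The "efficient linear version of Attention" refers to the kernelized reformulation in which the softmax is replaced by a positive feature map, so attention can be computed as phi(Q) (phi(K)^T V) in O(N) rather than O(N^2) in sequence length. The sketch below uses the phi(x) = elu(x) + 1 feature map of the linear-Transformer formulation; whether the paper uses exactly this variant is an assumption:

```python
import numpy as np

def elu_plus_one(x):
    """Positive feature map phi(x) = elu(x) + 1."""
    return np.where(x > 0, x + 1.0, np.exp(x))

def linear_attention(Q, K, V):
    """Non-causal linear attention: phi(Q) @ (phi(K)^T V), row-normalized.

    The (d x d_v) summary k.T @ V is built once, so cost is linear in the
    sequence length N instead of quadratic.
    """
    q, k = elu_plus_one(Q), elu_plus_one(K)
    kv = k.T @ V                       # summarize all keys/values at once
    z = q @ k.sum(axis=0)              # per-query normalizer
    return (q @ kv) / z[:, None]

rng = np.random.default_rng(0)
N, d = 8, 4
Q, K, V = (rng.normal(size=(N, d)) for _ in range(3))
out = linear_attention(Q, K, V)
```

By associativity, this produces exactly the same output as the quadratic form `(phi(Q) @ phi(K).T, row-normalized) @ V`; the saving comes purely from reordering the matrix products, which is what makes long symbolic-music sequences tractable.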