Smart Paper Notes

On Developing Facial Stress Analysis and Expression Recognition Platform

Fabio Cacciatori , Sergei Nikolaev , Dmitrii Grigorev

Category: Computer Vision

2022-09-16

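This work presents the experimental and development process of facial expression recognition and facial stress analysis algorithms for an immersive digital learning platform. The system retrieves images from the user's webcam and evaluates them using artificial neural network (ANN) algorithms. The ANN output signals can be used to score and improve the learning process. Adapting an ANN to a new system can require significant implementation effort or retraining of the ANN. There are also limitations related to the minimum hardware required to run an ANN. To overcome these constraints, several possible implementations of facial expression recognition and facial stress analysis algorithms are presented. The new solution makes it possible to improve the accuracy of facial expression recognition and to increase its response speed. Experimental results show that the developed algorithms detect heart rate at a higher rate than social equipment.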

translated by Google Translate

ReViSe: Remote Vital Signs Measurement Using Smartphone Camera

Donghao Qiao , Amtul Haq Ayesha , Farhana Zulkernine , Raihan Masroor , Nauman Jaffar

Category: Computer Vision | Machine Learning

2022-06-13

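Remote photoplethysmography (rPPG) is a fast, effective, inexpensive, and convenient method for collecting biometric data, as it enables vital signs to be estimated from face videos. Remote contactless medical service provisioning proved to be a dire necessity during the COVID-19 pandemic. We propose an end-to-end framework to measure people's vital signs, including heart rate (HR), heart rate variability (HRV), oxygen saturation (SpO2), and blood pressure (BP), based on the rPPG methodology from a video of the user's face captured with a smartphone camera. We extract face landmarks in real time with a deep-learning-based neural network model. Multiple face patches, called regions of interest (ROIs), are extracted using the predicted face landmarks. Several filters are applied to reduce noise from the ROIs in the extracted cardiac signal, called the blood volume pulse (BVP) signal. We trained and validated machine learning models using two public rPPG datasets, the TokyoTech rPPG and Pulse Rate Detection (PURE) datasets, on which our models achieved the following mean absolute errors (MAE): a) for HR, 1.73 and 3.95 beats per minute (bpm), respectively; b) for HRV, 18.55 and 25.03 ms, respectively; and c) for SpO2, an MAE of 1.64 on the PURE dataset. We validated our end-to-end rPPG framework, ReViSe, in a real-life environment, and thereby created the Video-HR dataset. Our HR estimation model achieved an MAE of 2.49 bpm on this dataset. Since no publicly available rPPG datasets existed for BP measurement from face videos, we used a dataset with signals from a fingertip sensor to train our model and also created our own video dataset, Video-BP. On our Video-BP dataset, our BP estimation model achieved an MAE of 6.7 mmHg for systolic blood pressure (SBP) and an MAE of 9.6 mmHg for diastolic blood pressure (DBP).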
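
The mean absolute error (MAE) figures quoted above are averages of the absolute difference between each estimate and its reference measurement. The following is a minimal sketch of that metric only; the function name and the sample readings are hypothetical illustrations and are not taken from the paper or its datasets.

```python
from typing import Sequence


def mean_absolute_error(predicted: Sequence[float], reference: Sequence[float]) -> float:
    """Average of |prediction - reference| over paired measurements."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)


# Hypothetical heart-rate readings in bpm (illustrative values only).
estimated_hr = [72.0, 80.5, 65.0, 90.0]
reference_hr = [70.0, 82.0, 66.0, 87.5]

mae_bpm = mean_absolute_error(estimated_hr, reference_hr)
print(f"MAE: {mae_bpm:.2f} bpm")
```

The same function applies unchanged to HRV (in ms) or blood pressure (in mmHg), since MAE keeps the unit of the quantity being measured.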

EVM-CNN: Real-Time Contactless Heart Rate Estimation from Facial Video

Ying Qiu , Yang Liu , Juan Arteaga-Falconi , Haiwei Dong , Abdulmotaleb El Saddik

Category: Computer Vision

2022-12-25

With the increase in health consciousness, noninvasive body monitoring has aroused interest among researchers. The heart rate (HR) is one of the most important pieces of physiological information, and in recent years researchers have estimated it remotely from facial videos. Although progress has been made over the past few years, there are still some limitations, such as processing time that increases with accuracy and the lack of comprehensive and challenging datasets for use and comparison. Recently, it was shown that HR information can be extracted from facial videos by spatial decomposition and temporal filtering. Inspired by this, a new framework is introduced in this paper to remotely estimate the HR under realistic conditions by combining spatial and temporal filtering and a convolutional neural network. Our proposed approach shows better performance compared with the benchmark on the MMSE-HR dataset in terms of both the average HR estimation and short-time HR estimation. High consistency in short-time HR estimation is observed between our method and the ground truth.

An Approach for Improving Automatic Mouth Emotion Recognition

Giulio Biondi , Valentina Franzoni , Osvaldo Gervasi , Damiano Perri

Category: Computer Vision

2022-12-12

The study proposes and tests a technique for automated emotion recognition through mouth detection via Convolutional Neural Networks (CNN), intended to support people with health disorders that impair communication skills (e.g. muscle wasting, stroke, autism, or, more simply, pain) by recognizing emotions and generating real-time feedback or data for supporting systems. The software system begins by determining whether a face is present in the acquired image, then looks for the mouth location and extracts the corresponding features. Both tasks are carried out using Haar Feature-based Classifiers, which guarantee fast execution and promising performance. Whereas our previous work focused on visual micro-expressions for personalized training on a single user, this strategy aims to train the system on generalized face datasets as well.

Camera Measurement of Physiological Vital Signs

Daniel McDuff

Category: Computer Vision | Machine Learning

2021-11-22

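The need for remote tools for healthcare monitoring has never been more apparent. Camera measurement of vital signs leverages imaging devices to compute physiological changes by analyzing images of the human body. Building on advances in optics, machine learning, computer vision, and medicine, these techniques have progressed significantly since the invention of digital cameras. This paper presents a comprehensive survey of camera measurement of physiological vital signs, describing the vital signs that can be measured and the computational techniques for doing so. I cover both clinical and non-clinical applications and the challenges that need to be overcome for these applications to advance from proofs of concept. Finally, I describe the current resources (datasets and code) available to the research community and provide a comprehensive webpage (https://cameravitals.github.io/) with links to these resources and a categorized list of all the papers referenced in this article.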

Full Body Video-Based Self-Avatars for Mixed Reality: from E2E System to User Study

Diego Gonzalez Morin , Ester Gonzalez-Sosa , Pablo Perez , Alvaro Villegas

Category: Computer Vision

2022-08-24

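In this work we explore the creation of self-avatars through video pass-through in mixed reality (MR) applications. We present our end-to-end system, including a custom MR video pass-through implementation on a commercial head-mounted display (HMD), our deep-learning-based real-time egocentric body segmentation algorithm, and our optimized offloading architecture for communication between the segmentation server and the HMD. To validate this technology, we designed an immersive VR experience in which the user has to walk along a narrow tile path over an active volcano crater. The study was performed under three body representation conditions: virtual hands, video pass-through with color-based full-body segmentation, and video pass-through with deep learning full-body segmentation. This immersive experience was carried out by 30 women and 28 men. To the best of our knowledge, this is the first user study focused on evaluating video-based self-avatars to represent the user in an MR scene. Results showed no significant differences between the body representations in terms of presence, with moderate improvements in some embodiment components between the virtual hands and full-body representations. Visual quality results showed that the deep learning algorithm performed better in terms of whole-body perception and overall segmentation quality. We provide some discussion regarding the use of video-based self-avatars and some reflections on the evaluation methodology. The proposed E2E solution is at the boundary of the state of the art, so there is still room for improvement before it reaches maturity. However, this solution serves as a crucial starting point for novel MR distributed solutions.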

A Web Application for Experimenting and Validating Remote Measurement of Vital Signs

Amtul Haq Ayesha , Donghao Qiao , Farhana Zulkernine

Category: Artificial Intelligence | Computer Vision

2022-08-21

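With the surge in online medical care, remote monitoring of patient vitals is required. This can be facilitated by remote photoplethysmography (rPPG) techniques, which compute vital signs from facial videos. The process involves processing video frames to obtain skin pixels, extracting cardiac data from them, and applying signal processing filters to extract the blood volume pulse (BVP) signal. Different algorithms are applied to the BVP signal to estimate the various vital signs. We implemented a web application framework to measure a person's heart rate (HR), heart rate variability (HRV), oxygen saturation (SpO2), respiration rate (RR), blood pressure (BP), and stress from a face video. The rPPG technique is highly sensitive to illumination and motion variation. The web application guides users to reduce the noise caused by these variations, thereby yielding a cleaner BVP signal. The accuracy and robustness of the framework were validated with the help of volunteers.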
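
One common way to turn a BVP signal into a heart-rate estimate is to look for the strongest frequency within a plausible heart-rate band. The sketch below illustrates that general idea only; it is not the authors' algorithm. The function name, the 42 to 240 bpm search band, the 30 frames/s rate, and the synthetic sinusoid standing in for a BVP trace are all assumptions made for illustration.

```python
import math
from typing import Sequence


def estimate_heart_rate_bpm(bvp: Sequence[float], fps: float,
                            min_bpm: int = 42, max_bpm: int = 240) -> int:
    """Return the candidate rate (in bpm) with the largest spectral power."""
    n = len(bvp)
    mean = sum(bvp) / n
    centered = [v - mean for v in bvp]  # remove the DC offset
    best_bpm, best_power = min_bpm, -1.0
    for bpm in range(min_bpm, max_bpm + 1):
        freq_hz = bpm / 60.0
        # Single-frequency discrete Fourier transform evaluated at freq_hz.
        re = sum(v * math.cos(2 * math.pi * freq_hz * i / fps)
                 for i, v in enumerate(centered))
        im = sum(v * math.sin(2 * math.pi * freq_hz * i / fps)
                 for i, v in enumerate(centered))
        power = re * re + im * im
        if power > best_power:
            best_bpm, best_power = bpm, power
    return best_bpm


# Synthetic 10 s trace: a 1.2 Hz sinusoid (72 bpm) sampled at an assumed 30 frames/s.
fps = 30.0
signal = [math.sin(2 * math.pi * 1.2 * i / fps) for i in range(300)]
estimated = estimate_heart_rate_bpm(signal, fps)
print(f"Estimated heart rate: {estimated} bpm")
```

A real pipeline would first band-pass filter the skin-pixel signal and handle motion and lighting artifacts, which this sketch deliberately leaves out.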

Efficiency Comparison of AI classification algorithms for Image Detection and Recognition in Real-time

Musarrat Saberin Nipun , Rejwan Bin Sulaiman , Amer Kareem

Category: Computer Vision | Artificial Intelligence

2022-06-12

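Face detection and recognition are among the most difficult and most frequently used tasks in artificial intelligence systems. The goal of this study is to present and compare the results of several face detection and recognition algorithms used in the system. The system begins with a training image of a human, then continues to a test image, identifies the face, compares it with the trained face, and finally classifies it using OpenCV classifiers. This research discusses the most effective and successful strategies used in the system, which were implemented using Python, OpenCV, and Matplotlib. The system may also be used in locations with CCTV, such as public spaces, shopping malls, and ATM booths.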

Deep Learning-Driven Edge Video Analytics: A Survey

Renjie Xu , Saiedeh Razavi , Rong Zheng

Category: Computer Vision | Machine Learning

2022-11-28

Video, as a key driver in the global explosion of digital information, can create tremendous benefits for human society. Governments and enterprises are deploying innumerable cameras for a variety of applications, e.g., law enforcement, emergency management, traffic control, and security surveillance, all facilitated by video analytics (VA). This trend is spurred by the rapid advancement of deep learning (DL), which enables more precise models for object classification, detection, and tracking. Meanwhile, with the proliferation of Internet-connected devices, massive amounts of data are generated daily, overwhelming the cloud. Edge computing, an emerging paradigm that moves workloads and services from the network core to the network edge, has been widely recognized as a promising solution. The resulting new intersection, edge video analytics (EVA), begins to attract widespread attention. Nevertheless, only a few loosely-related surveys exist on this topic. A dedicated venue for collecting and summarizing the latest advances of EVA is highly desired by the community. Besides, the basic concepts of EVA (e.g., definition, architectures, etc.) are ambiguous and neglected by these surveys due to the rapid development of this domain. A thorough clarification is needed to facilitate a consensus on these concepts. To fill in these gaps, we conduct a comprehensive survey of the recent efforts on EVA. In this paper, we first review the fundamentals of edge computing, followed by an overview of VA. The EVA system and its enabling techniques are discussed next. In addition, we introduce prevalent frameworks and datasets to aid future researchers in the development of EVA systems. Finally, we discuss existing challenges and foresee future research directions. We believe this survey will help readers comprehend the relationship between VA and edge computing, and spark new ideas on EVA.

Embedded System Performance Analysis for Implementing a Portable Drowsiness Detection System for Drivers

Minjeong Kim , Jimin Koo

Category: Computer Vision

2022-09-30

Drowsiness on the road is a widespread problem with fatal consequences; thus, a multitude of systems and techniques have been proposed. Among existing methods, Ghoddoosian et al. utilized temporal blinking patterns to detect early signs of drowsiness, but their algorithm was tested only on a powerful desktop computer, which is not practical to apply in a moving vehicle setting. In this paper, we propose an efficient platform to run the algorithm of Ghoddoosian et al., detail the performance tests we ran to determine this platform, and explain our threshold optimization logic. After considering the Jetson Nano and Beelink (Mini PC), we concluded that the Mini PC is the most efficient and practical to run our embedded system in a vehicle. To determine this, we ran communication speed tests and evaluated total processing times for inference operations. Based on our experiments, the average total processing time to run the drowsiness detection model was 94.27 ms for Jetson Nano and 22.73 ms for the Beelink (Mini PC). Considering the portability and power efficiency of each device, along with the processing time results, the Beelink (Mini PC) was determined to be most suitable. Also, we propose a threshold optimization algorithm, which determines whether the driver is drowsy or alert based on the trade-off between the sensitivity and specificity of the drowsiness detection model. Our study will serve as a crucial next step for drowsiness detection research and its application in vehicles. Through our experiment, we have determined a favorable platform that can run drowsiness detection algorithms in real time and can be used as a foundation to further advance drowsiness detection research. In doing so, we have bridged the gap between an existing embedded system and its actual implementation in vehicles to bring drowsiness detection technology a step closer to prevalent real-life implementation.
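
The abstract does not spell out the threshold optimization algorithm, so the following is only a generic sketch of choosing a decision threshold from the sensitivity/specificity trade-off. It assumes the model emits a drowsiness score per sample and uses Youden's J statistic (sensitivity + specificity - 1) as the selection criterion, which is a swapped-in standard technique and not necessarily the authors' method. All names and sample values are hypothetical.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(scores: Sequence[float], labels: Sequence[int],
                            threshold: float) -> Tuple[float, float]:
    """Treat score >= threshold as 'drowsy' (label 1); 'alert' is label 0.

    Assumes both classes are present in `labels`.
    """
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if y == 1 and s >= threshold)
    fn = sum(1 for s, y in pairs if y == 1 and s < threshold)
    tn = sum(1 for s, y in pairs if y == 0 and s < threshold)
    fp = sum(1 for s, y in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def best_threshold(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Pick the observed score that maximizes Youden's J when used as threshold."""
    best_t, best_j = min(scores), -2.0
    for t in sorted(set(scores)):
        sens, spec = sensitivity_specificity(scores, labels, t)
        j = sens + spec - 1.0
        if j > best_j:
            best_t, best_j = t, j
    return best_t


# Hypothetical model scores and ground-truth labels (1 = drowsy, 0 = alert).
scores = [0.10, 0.20, 0.35, 0.40, 0.60, 0.70, 0.80, 0.90]
labels = [0, 0, 0, 1, 1, 0, 1, 1]

chosen = best_threshold(scores, labels)
sens, spec = sensitivity_specificity(scores, labels, chosen)
print(f"threshold={chosen}, sensitivity={sens}, specificity={spec}")
```

A deployment that cares more about missed drowsiness than false alarms could weight sensitivity more heavily than this equal-weight criterion does.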

A wearable sensor vest for social humanoid robots with GPGPU, IoT, and modular software architecture

Mohsen Jafarzadeh , Stephen Brooks , Shimeng Yu , Balakrishnan Prabhakaran , Yonas Tadesse

Category: Robotics | Artificial Intelligence

2022-01-06

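Currently, most social robots interact with their surroundings and with humans through sensors that are integral parts of the robots, which limits the usability of the sensors, human-robot interaction, and interchangeability. A wearable sensor garment that fits many robots is needed in many applications. This article presents an affordable wearable sensor vest and an open-source software architecture with the Internet of Things (IoT) for social humanoid robots. The vest consists of touch, temperature, gesture, distance, and vision sensors, and a wireless communication module. The IoT feature allows the robot to interact with humans both locally and over the Internet. The designed architecture works for any social robot that has a general-purpose graphics processing unit (GPGPU), I2C/SPI buses, an Internet connection, and the Robot Operating System (ROS). The modular design of this architecture enables developers to easily add, remove, or update complex behaviors. The proposed software architecture provides IoT technology, GPGPU nodes, I2C and SPI bus managers, audio-visual interaction nodes (speech-to-text, text-to-speech, and image understanding), and isolation between behavior nodes and other nodes. The proposed IoT solution consists of related nodes in the robot, a RESTful web service, and user interfaces. We used the HTTP protocol as a means of two-way communication with the social robot over the Internet. Developers can easily edit or add nodes in the C, C++, and Python programming languages. Our architecture can be used to design more sophisticated behaviors for social humanoid robots.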

3D Labeling Tool

John Rachwan , Charbel Zalaket

Category: Computer Vision | Artificial Intelligence

2022-07-23

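Training and testing supervised object detection models require a large collection of images with ground truth labels. Labels define the object classes in an image, as well as their locations, shapes, and possibly other information such as pose. The labeling process is extremely time consuming, even when manpower is available. We introduce a novel labeling tool for 2D images as well as 3D triangular meshes: 3D Labeling Tool (3DLT). This is a standalone, feature-rich, cross-platform software that does not require installation and can run on Windows, macOS, and Linux-based distributions. Instead of labeling the same object on every image separately, as current tools do, we use depth information to reconstruct a triangular mesh from those images and label the object only once on that mesh. We use registration to simplify 3D labeling, outlier detection to improve 2D bounding box calculation, and surface reconstruction to extend labeling to large point clouds. Our tool is tested against state-of-the-art methods, and it greatly surpasses them while preserving accuracy and ease of use.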

Smart Application for Fall Detection Using Wearable ECG & Accelerometer Sensors

Harry Wixley

Category: Artificial Intelligence | Machine Learning

2022-06-28

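Timely and reliable detection of falls is a large and rapidly growing field of research, driven by the medical and financial demands of caring for a constantly growing elderly population. Over the past 20 years, the availability of high-quality hardware (high-quality sensors and AI microchips) and software (machine learning algorithms) technologies has served as a catalyst for this research by giving developers the capabilities to develop such systems. This study developed multiple application components in order to investigate the development challenges and choices for fall detection systems and to provide materials for future research. The smart application developed using this methodology was validated by the results of fall detection modelling experiments and model mobile deployment. The best performing model overall was the standardised ResNet152 with a dataset tuned to a 2 s window size, which achieved 92.8% AUC, 7.28% sensitivity, and 98.33% specificity. Given these results, it is evident that accelerometer and ECG sensors are beneficial for fall detection and allow for discrimination between falls and other activities. This study leaves significant room for improvement due to weaknesses identified in the resulting dataset. These improvements include using a labelling protocol for the critical phase of a fall, increasing the number of dataset samples, improving test subject representation, and experimenting with frequency-domain preprocessing.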
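
Window size is a preprocessing choice: a continuous sensor stream is cut into fixed-length segments before being passed to a classifier. The following is a minimal sketch of such segmentation, assuming non-overlapping windows and a 50 Hz sampling rate; both are illustrative assumptions, as are the function name and the synthetic trace, and none of them come from the study.

```python
from typing import List, Sequence


def split_into_windows(samples: Sequence[float], sample_rate_hz: int,
                       window_seconds: float = 2.0) -> List[List[float]]:
    """Cut a 1-D signal into consecutive, non-overlapping windows.

    Any trailing samples that do not fill a whole window are dropped.
    """
    window_len = int(sample_rate_hz * window_seconds)
    if window_len <= 0:
        raise ValueError("window must contain at least one sample")
    last_start = len(samples) - window_len
    return [list(samples[start:start + window_len])
            for start in range(0, last_start + 1, window_len)]


# Hypothetical accelerometer trace: 5 s of samples at an assumed 50 Hz.
trace = [float(i) for i in range(250)]
windows = split_into_windows(trace, sample_rate_hz=50)
print(len(windows), "windows of", len(windows[0]), "samples each")
```

With these numbers, each 2 s window holds 100 samples, and the final 50 samples are discarded because they do not fill a window.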

A unified software/hardware scalable architecture for brain-inspired computing based on self-organizing neural models

Artem R. Muliukov , Laurent Rodriguez , Benoit Miramond , Lyes Khacef , Joachim Schmidt , Quentin Berthet , Andres Upegui

Category: Neural and Evolutionary Computing

2022-01-06

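The field of artificial intelligence has advanced significantly over the past decades, inspired by discoveries in biology and neuroscience. The idea of this work is inspired by the self-organization of cortical areas in the human brain through both afferent and lateral/internal connections. In this work, we develop an original brain-inspired neural model that combines Self-Organizing Maps (SOM) and Hebbian learning in the Reentrant SOM (ReSOM) model. The framework is applied to multimodal classification problems. Compared with existing methods based on unsupervised learning, the model improves on state-of-the-art results. This work also demonstrates the distributed and scalable nature of the model through both simulation results and hardware execution on a dedicated FPGA-based platform named SCALP (Self-configurable 3D Cellular Adaptive Platform). SCALP boards can be interconnected in a modular way to support the structure of the neural model. Such a unified software and hardware approach enables the processing to be scaled and allows information from several modalities to be merged dynamically. The deployment on hardware boards provides performance results for parallel execution on several devices, with communication between boards through dedicated serial links. Thanks to multimodal association, the proposed unified architecture, composed of the ReSOM model and the SCALP hardware platform, demonstrates a significant increase in accuracy and a good trade-off between latency and power consumption compared with a centralized GPU implementation.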

SFF-DA: Sptialtemporal Feature Fusion for Detecting Anxiety Nonintrusively

Haimiao Mo , Yuchen Li , Shanlin Yang , Wei Zhang , Shuai Ding

Category: Computer Vision

2022-08-12

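Early detection of anxiety disorders is essential to reduce the suffering of people with mental disorders and to improve treatment outcomes. Anxiety screening based on an mHealth platform is of particular practical value in improving screening efficiency and reducing screening costs. In practice, differences among the mobile devices used in subjects' physical and mental evaluations, uneven data quality, and the small sample sizes of real-world data make existing methods ineffective. Therefore, we propose a framework based on spatiotemporal feature fusion for detecting anxiety nonintrusively. To reduce the impact of uneven data quality, we constructed a feature extraction network based on "3DCNN+LSTM" and fused spatiotemporal features of facial behavior and noncontact physiology. Moreover, we designed a similarity assessment strategy to address the decline in model accuracy caused by the small sample size. Our framework was validated with our real-world crew dataset and two public datasets, UBFC-Phys and SWELL-KW. The experimental results show that the overall performance of our framework is better than that of the state-of-the-art comparison methods.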

Hybrid Facial Expression Recognition (FER2013) Model for Real-Time Emotion Classification and Prediction

Ozioma Collins Oguine , Kaleab Alamayehu Kinfu , Kanyifeechukwu Jane Oguine , Hashim Ibrahim Bisallah , Daniel Ofuani

Category: Computer Vision | Artificial Intelligence | Robotics

2022-06-19

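Facial expression recognition is an important research topic in most fields, ranging from artificial intelligence and gaming to human-computer interaction (HCI) and psychology. This paper proposes a hybrid model for facial expression recognition, which comprises a deep convolutional neural network (DCNN) and a Haar Cascade deep learning architecture. The objective is to classify real-time and digital facial images into one of the seven facial emotion categories considered. The DCNN employed in this research has more convolutional layers, ReLU activation functions, and multiple kernels to enhance filtering depth and facial feature extraction. In addition, a Haar Cascade model was used alongside it to detect facial features in real-time images and video frames. Grayscale images from the Kaggle repository (FER-2013) were used, and graphics processing unit (GPU) computation was exploited to speed up the training and validation process. Pre-processing and data augmentation techniques were applied to improve training efficiency and classification performance. The experimental results show significantly improved classification performance compared with state-of-the-art experiments and research. Also, compared with other conventional models, this paper validates that the proposed architecture is superior in classification performance, with an improvement of up to 6%, reaching up to 70% accuracy, and a shorter execution time of 2098.8 s.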

Proceedings of the 2nd International Workshop on Reading Music Systems

Jorge Calvo-Zaragoza , Alexander Pacha

Category: Computer Vision | Machine Learning

2022-12-01

The International Workshop on Reading Music Systems (WoRMS) is a workshop that tries to connect researchers who develop systems for reading music, such as in the field of Optical Music Recognition, with other researchers and practitioners that could benefit from such systems, like librarians or musicologists. The relevant topics of interest for the workshop include, but are not limited to: Music reading systems; Optical music recognition; Datasets and performance evaluation; Image processing on music scores; Writer identification; Authoring, editing, storing and presentation systems for music scores; Multi-modal systems; Novel input-methods for music to produce written music; Web-based Music Information Retrieval services; Applications and projects; Use-cases related to written music. These are the proceedings of the 2nd International Workshop on Reading Music Systems, held in Delft on the 2nd of November 2019.

A Review of Indoor Millimeter Wave Device-based Localization and Device-free Sensing Technologies

Anish Shastri , Neharika Valecha , Enver Bashirov , Harsh Tataria , Michael Lentmaier , Fredrik Tufvesson , Michele Rossi , Paolo Casari

Category: Machine Learning

2021-12-10

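The commercial availability of low-cost millimeter wave (mmWave) communication and radar devices is starting to improve the penetration of such technologies in consumer markets, paving the way for large-scale and dense deployments in fifth-generation (5G) and beyond, as well as 6G, networks. At the same time, pervasive mmWave access will enable device localization and device-free sensing with unprecedented accuracy, especially compared with sub-6 GHz commercial-grade devices. This paper surveys the state of the art in device-based localization and device-free sensing using mmWave communication and radar devices, with a focus on indoor deployments. We first overview key concepts about mmWave signal propagation and system design. Then, we provide a detailed account of approaches and algorithms for localization and sensing enabled by mmWaves. We consider several dimensions in our analysis, including the main objectives, techniques, and performance of each work, whether each work reached some degree of implementation, and which hardware platforms were used for this purpose. We conclude by discussing why better algorithms for consumer-grade devices, data fusion methods for dense deployments, and an educated application of machine learning methods are promising, relevant, and timely research directions.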

MobilePhys: Personalized Mobile Camera-Based Contactless Physiological Sensing

Xin Liu , Yuntao Wang , Sinan Xie , Xiaoyu Zhang , Zixian Ma , Daniel McDuff , Shwetak Patel

Category: Computer Vision

2022-01-11

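Camera-based contactless photoplethysmography refers to a set of popular techniques for contactless physiological measurement. The current state-of-the-art neural models are typically trained in a supervised manner using videos accompanied by gold-standard physiological measurements. However, they often generalize poorly to out-of-domain examples (i.e., videos that are unlike those in the training set). Personalizing models can help improve model generalizability, but many personalization techniques still require some gold-standard data. To help alleviate this dependency, in this paper we present a novel mobile sensing system called MobilePhys, the first mobile personalized remote physiological sensing system, which leverages both the front and rear cameras on a smartphone to generate high-quality self-supervised labels for training personalized contactless camera-based PPG models. To evaluate the robustness of MobilePhys, we conducted a user study with 39 participants who completed a set of tasks under different mobile devices, lighting conditions/intensities, motion tasks, and skin types. Our results show that MobilePhys significantly outperforms the state-of-the-art on-device supervised training and few-shot adaptation methods. Through extensive user studies, we further examine how MobilePhys performs in complex real-world settings. We envision that calibrated, camera-based contactless PPG models generated from our proposed dual-camera mobile sensing system will open the door for numerous future applications such as smart mirrors, fitness, and mobile health applications.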

Roadmap on Signal Processing for Next Generation Measurement Systems

D. K. Iakovidis , M. Ooi , Y. C. Kuang , S. Damidenko , A. Shestakov , V. Sinistin , M. Henry , A. Sciacchitano , A. Discetti , S. Donati

Category: Artificial Intelligence | Computer Vision

2021-11-03

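Signal processing is a fundamental component of almost any sensor-enabled system, with a wide range of applications across different scientific disciplines. Time series data, images, and video sequences comprise representative forms of signals that can be enhanced and analysed for information extraction and quantification. Recent advances in artificial intelligence and machine learning are shifting research attention towards intelligent, data-driven signal processing. This roadmap presents a critical overview of state-of-the-art methods and applications, aiming to highlight future challenges and research opportunities for next generation measurement systems. It covers a broad spectrum of topics ranging from basic to industrial research, organized in concise thematic sections that reflect the trends and impacts of current and future developments in each research field. Furthermore, it offers guidance to researchers and funding agencies in identifying new prospects.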
