Smart Paper Notes

Going Deeper than Tracking: a Survey of Computer-Vision Based Recognition of Animal Pain and Affective States

Sofia Broomé , Marcelo Feighelstein , Anna Zamansky , Gabriel Carreira Lencioni , Pia Haubro Andersen , Francisca Pessanha , Marwa Mahmoud , Hedvig Kjellström , Albert Ali Salah

Category: Computer Vision

2022-06-16

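Advances in animal motion tracking and pose recognition have been a game changer for the study of animal behavior. Recently, a growing number of works go "deeper" than tracking and address automated recognition of animals' internal states, such as emotions and pain, with the aim of improving animal welfare, making this a timely moment for a systematization of the field. This paper provides a comprehensive survey of computer vision-based research on recognition of affective states and pain in animals, addressing both facial and bodily behavior analysis. We summarize the efforts that have been presented so far within this topic, classifying them across different dimensions, highlight challenges and research gaps, and provide best practice recommendations for advancing the field, as well as some future directions for research.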

Translated by Google Translate

Sharing Pain: Using Pain Domain Transfer for Video Recognition of Low Grade Orthopedic Pain in Horses

Sofia Broomé , Katrina Ask , Maheen Rashid , Pia Haubro Andersen , Hedvig Kjellström

Category: Computer Vision

2021-05-21

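Orthopedic disorders are common among horses and often lead to euthanasia, which could frequently have been avoided with earlier detection. These conditions often create varying degrees of subtle long-term pain. Training a visual pain recognition method on video data depicting such pain is challenging, because the resulting pain behavior is also subtle, sparsely appearing, and varying, making it challenging even for an expert human labeller to provide accurate ground truth for the data. We show that a model trained solely on a dataset of horses with acute experimental pain (where labeling is less ambiguous) can aid recognition of the more subtle displays of orthopedic pain. Moreover, we present a human expert baseline for the problem, as well as an extensive empirical study of various domain transfer methods and of what is detected by the pain recognition method trained on clean experimental pain in the orthopedic dataset. Finally, this is accompanied by a discussion around the challenges posed by real-world animal behavior datasets and how best practices can be established for similar fine-grained action recognition tasks. Our code is available at https://github.com/sofiabroome/painface-recognition.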


More is Better: A Database for Spontaneous Micro-Expression with High Frame Rates

Sirui Zhao , Huaying Tang , Xinglong Mao , Shifeng Liu , Hanqing Tao , Hao Wang , Tong Xu , Enhong Chen

Category: Computer Vision

2023-01-03

As one of the most important psychic stress reactions, micro-expressions (MEs) are spontaneous and transient facial expressions that can reveal the genuine emotions of human beings. Thus, recognizing MEs (MER) automatically is becoming increasingly crucial in the field of affective computing, and provides essential technical support in lie detection, psychological analysis and other areas. However, the lack of abundant ME data seriously restricts the development of cutting-edge data-driven MER models. Despite the recent efforts of several spontaneous ME datasets to alleviate this problem, it is still a tiny amount of work. To solve the problem of ME data hunger, we construct a dynamic spontaneous ME dataset with the largest current ME data scale, called DFME (Dynamic Facial Micro-expressions), which includes 7,526 well-labeled ME videos induced by 671 participants and annotated by more than 20 annotators over three years. Afterwards, we adopt four classical spatiotemporal feature learning models on DFME to perform MER experiments to objectively verify the validity of the DFME dataset. In addition, we explore different solutions to the class imbalance and key-frame sequence sampling problems in dynamic MER on DFME, so as to provide a valuable reference for future research. The comprehensive experimental results show that our DFME dataset can facilitate the research of automatic MER, and provide a new benchmark for MER. DFME will be published via https://mea-lab-421.github.io.


Deep Learning Models for Automated Classification of Dog Emotional States from Facial Expressions

Tali Boneh-Shitrit , Shir Amir , Annika Bremhorst , Daniel S. Mills , Stefanie Riemer , Dror Fried , Anna Zamansky

Category: Computer Vision

2022-06-11

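Similarly to humans, facial expressions in animals are closely linked with emotional states. However, in contrast to the human domain, automated recognition of emotional states from animal facial expressions is underexplored, mainly due to difficulties in data collection and in establishing ground truth concerning the emotional states of non-verbal users. We apply recent deep learning techniques to classify (positive) anticipation and (negative) frustration of dogs on a dataset collected in a controlled experimental setting. We explore the suitability of different backbones (e.g., ResNet, ViT) under different supervisions for this task, and find that features of a self-supervised pretrained ViT (DINO-ViT) are superior to the other alternatives. To the best of our knowledge, this work is the first to address the task of automatic classification of canine emotions on data acquired in a controlled experiment.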


DAiSEE: Towards User Engagement Recognition in the Wild

Abhay Gupta , Arjun D'Cunha , Kamal Awasthi , Vineeth Balasubramanian

Category: Computer Vision | Machine Learning

2016-09-07

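We introduce DAiSEE, the first multi-label video classification dataset, comprising 9068 video snippets captured from 112 users, for recognizing the user affective states of boredom, confusion, engagement, and frustration in the wild. The dataset has four levels of labels for each affective state (very low, low, high, and very high), which are crowd annotated and correlated with a gold standard annotation created using a team of expert psychologists. We have also established benchmark results on this dataset using state-of-the-art video classification methods available today. We believe that DAiSEE will provide the research community with challenges in feature extraction, context-based inference, and development of suitable machine learning methods for related tasks, thus providing a springboard for further research. The dataset is available for download at https://people.iith.ac.in/vineethnb/resources/daisee/daisee/index.html.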


Deep Learning for Micro-expression Recognition: A Survey

Yante Li , Jinsheng Wei , Yang Liu , Janne Kauttonen , Guoying Zhao

Category: Computer Vision

2021-07-06

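Micro-expressions (MEs) are involuntary facial movements revealing people's hidden feelings in high-stakes situations, and have practical importance in medical treatment, national security, interrogations, and many human-computer interaction systems. Early methods for ME recognition (MER) were mainly based on traditional appearance and geometry features. Recently, with the success of deep learning (DL) in various fields, neural networks have received increasing interest in MER. Different from macro-expressions, MEs are spontaneous, subtle, and rapid facial movements, leading to difficult data collection and thus small-scale datasets. DL-based MER becomes challenging due to the above ME characteristics. To date, various DL approaches have been proposed to solve the ME issues and improve MER performance. In this survey, we provide a comprehensive review of deep MER, including datasets, the deep MER pipeline, and benchmarking of the most influential methods. This survey defines a new taxonomy for the field, encompassing all aspects of MER based on DL. For each aspect, the basic approaches and advanced developments are summarized and discussed. In addition, we conclude with the remaining challenges and potential directions for the design of robust deep MER systems. To the best of our knowledge, this is the first survey of deep MER methods, and it can serve as a reference point for future MER research.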


Open-Source Tools for Behavioral Video Analysis: Setup, Methods, and Development

Kevin Luxem , Jennifer J. Sun , Sean P. Bradley , Keerthi Krishnan , Eric A. Yttri , Jan Zimmermann , Talmo D. Pereira , Mark Laubach

Category: Computer Vision

2022-04-06

Recently developed methods for video analysis, especially models for pose estimation and behavior classification, are transforming behavioral quantification to be more precise, scalable, and reproducible in fields such as neuroscience and ethology. These tools overcome long-standing limitations of manual scoring of video frames and traditional "center of mass" tracking algorithms to enable video analysis at scale. The expansion of open-source tools for video acquisition and analysis has led to new experimental approaches to understand behavior. Here, we review currently available open-source tools for video analysis and discuss how to set up these methods for labs new to video recording. We also discuss best practices for developing and using video analysis methods, including community-wide standards and critical needs for the open sharing of datasets and code, more widespread comparisons of video analysis methods, and better documentation for these methods especially for new users. We encourage broader adoption and continued development of these tools, which have tremendous potential for accelerating scientific progress in understanding the brain and behavior.


Face-to-Face Co-Located Human-Human Social Interaction Analysis using Nonverbal Cues: A Survey

Cigdem Beyan , Alessandro Vinciarelli , Alessio Del Bue

Category: Artificial Intelligence | Computer Vision | Machine Learning

2022-07-20

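This work presents a systematic review of recent efforts (since 2010) aimed at automatic analysis of nonverbal cues displayed in face-to-face co-located human-human social interactions. The main reason for focusing on nonverbal cues is that these are the physical, detectable traces of social and psychological phenomena. Therefore, detecting and understanding nonverbal cues means, at least to a certain extent, detecting and understanding social and psychological phenomena. The covered topics are categorized into three: a) modeling social traits, such as leadership, dominance, and personality traits; b) social role recognition and social relations detection; and c) interaction dynamics analysis in terms of group cohesion, empathy, rapport, and so forth. We target co-located interactions, in which the interactants are always humans. The survey covers a wide spectrum of settings and scenarios, including free-standing interactions, meetings, indoor and outdoor social exchanges, dyadic conversations, and crowd dynamics. For each of them, the survey considers the three main elements of nonverbal cue analysis, namely data, sensing approaches, and computational methodologies. The goal is to highlight the main advances of the last decade, to point out existing limitations, and to outline future directions.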


Applications of Deep Learning in Fish Habitat Monitoring: A Tutorial and Survey

Alzayat Saleh , Marcus Sheaves , Dean Jerry , Mostafa Rahimi Azghadi

Category: Computer Vision

2022-06-11

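Marine ecosystems and their fish habitats are becoming increasingly important due to their integral role in providing a valuable food source and conservation outcomes. Due to their remote and difficult-to-access nature, marine environments and fish habitats are often monitored using underwater cameras. These cameras generate a massive volume of digital data, which cannot be efficiently analyzed by current manual processing methods that involve a human observer. Deep learning (DL) is a cutting-edge AI technology that has demonstrated unprecedented performance in analyzing visual data. Despite its application to a myriad of domains, its use in underwater fish habitat monitoring is still under-explored. In this paper, we provide a tutorial that covers the key concepts of DL, which helps the reader grasp a high-level understanding of how DL works. The tutorial also explains a step-by-step procedure for how DL algorithms should be developed for challenging applications such as underwater fish monitoring. In addition, we provide a comprehensive survey of key deep learning techniques for fish habitat monitoring, including classification, counting, localization, and segmentation. Furthermore, we survey publicly available underwater fish datasets and compare various DL techniques in the underwater fish monitoring domain. We also discuss some challenges and opportunities in the emerging field of deep learning for fish habitat processing. This paper is written to serve as a tutorial for marine scientists who would like to grasp a high-level understanding of DL, develop it for their applications by following our step-by-step tutorial, and see how it is evolving to facilitate their research efforts. At the same time, it is suitable for computer scientists who would like to survey state-of-the-art DL-based methodologies for fish habitat monitoring.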


Py-Feat: Python Facial Expression Analysis Toolbox

Eshin Jolly , Jin Hyun Cheong , Tiankang Xie , Sophie Byrne , Matthew Kenny , Luke J. Chang

Category: Computer Vision | Machine Learning

2021-04-08

Studying facial expressions is a notoriously difficult endeavor. Recent advances in the field of affective computing have yielded impressive progress in automatically detecting facial expressions from pictures and videos. However, much of this work has yet to be widely disseminated in social science domains such as psychology. Current state of the art models require considerable domain expertise that is not traditionally incorporated into social science training programs. Furthermore, there is a notable absence of user-friendly and open-source software that provides a comprehensive set of tools and functions that support facial expression research. In this paper, we introduce Py-Feat, an open-source Python toolbox that provides support for detecting, preprocessing, analyzing, and visualizing facial expression data. Py-Feat makes it easy for domain experts to disseminate and benchmark computer vision models and also for end users to quickly process, analyze, and visualize face expression data. We hope this platform will facilitate increased use of facial expression data in human behavior research.


Camera Measurement of Physiological Vital Signs

Daniel McDuff

Category: Computer Vision | Machine Learning

2021-11-22

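The need for remote tools for healthcare monitoring has never been more apparent. Camera measurement of vital signs leverages imaging devices to compute physiological changes by analyzing images of the human body. Building on advances in optics, machine learning, computer vision, and medicine, these techniques have progressed significantly since the invention of digital cameras. This paper presents a comprehensive survey of camera measurement of physiological vital signs, describing the vital signs that can be measured and the computational techniques for doing so. I cover both clinical and non-clinical applications and the challenges that need to be overcome for these applications to advance from proofs of concept. Finally, I describe the current resources (datasets and code) available to the research community and provide a comprehensive webpage (https://cameravitals.github.io/) with links to these resources and a categorized list of all the papers referenced in this article.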


Deep Learning -- A first Meta-Survey of selected Reviews across Scientific Disciplines, their Commonalities, Challenges and Research Impact

Jan Egger , Antonio Pepe , Christina Gsaxner , Yuan Jin , Jianning Li , Roman Kern

Category: Computer Vision | Machine Learning | Neural and Evolutionary Computing

2020-11-16

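Deep learning belongs to the field of artificial intelligence, where machines perform tasks that typically require some kind of human intelligence. Similar to the basic structure of a brain, a deep learning algorithm consists of an artificial neural network, which resembles the biological brain structure. Mimicking the learning process of humans with their senses, deep learning networks are fed with (sensory) data, like texts, images, videos, or sounds. These networks outperform the state-of-the-art methods in different tasks and, because of this, the whole field has seen exponential growth in recent years. This growth has resulted in well over 10,000 publications per year in the last few years. For example, in Q3 2020 a search engine covering only a subset of all publications in the medical field returned results for the search term "deep learning" of which around 90% were from the last three years. Consequently, a complete overview of the field of deep learning is already impossible to obtain and, in the near future, it will potentially become difficult to obtain an overview of even a subfield. However, there are several review articles about deep learning that focus on specific scientific fields or applications, for example deep learning advances in computer vision or in specific tasks like object detection. With these surveys as a foundation, the aim of this contribution is to provide a first high-level, categorized meta-survey of deep learning across different scientific disciplines. The categories (computer vision, language processing, medical informatics, and additional works) have been chosen according to the underlying data sources (image, language, medical, mixed). In addition, we review the common architectures, methods, pros, cons, evaluations, challenges, and future directions for every sub-category.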


Can AI detect pain and express pain empathy? A review from emotion recognition and a human-centered AI perspective

Siqi Cao , Di Fu , Xu Yang , Stefan Wermter , Xun Liu , Haiyan Wu

Category: Artificial Intelligence

2021-10-08

Sensory and emotional experiences such as pain and empathy are essential for mental and physical health. Cognitive neuroscience has been working on revealing mechanisms underlying pain and empathy. Furthermore, as trending research areas, computational pain recognition and empathic artificial intelligence (AI) show progress and promise for healthcare or human-computer interaction. Although AI research has recently made it increasingly possible to create artificial systems with affective processing, most cognitive neuroscience and AI research do not jointly address the issues of empathy in AI and cognitive neuroscience. The main aim of this paper is to introduce key advances, cognitive challenges and technical barriers in computational pain recognition and the implementation of artificial empathy. Our discussion covers the following topics: How can AI recognize pain from unimodal and multimodal information? Is it crucial for AI to be empathic? What are the benefits and challenges of empathic AI? Despite some consensus on the importance of AI, including empathic recognition and responses, we also highlight future challenges for artificial empathy and possible paths from interdisciplinary perspectives. Furthermore, we discuss challenges for responsible evaluation of cognitive methods and computational techniques and show approaches to future work to contribute to affective assistants capable of empathy.


Emotion Recognition in Horses with Convolutional Neural Networks

Luis A. Corujo , Peter A. Gloor , Emily Kieson , Timo Schloesser

Category: Computer Vision | Machine Learning

2021-05-25

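Creating intelligent systems capable of recognizing emotions is a difficult task, especially when looking at emotions in animals. This paper describes the process of designing a "proof of concept" system to recognize emotions in horses. This system is formed by two elements, a detector and a model. The detector is a faster region-based convolutional neural network that detects horses in an image. The model is a convolutional neural network that predicts the emotions of those horses. These two elements were trained with multiple images of horses until they achieved high accuracy in their tasks. In total, 400 images of horses were collected and labeled to train both the detector and the model, while 40 were used to test the system. Once the two components were validated, they were combined into a testable system that would detect equine emotions based on established behavioral ethograms indicating emotional affect through head, neck, ear, muzzle, and eye position. The system showed an accuracy of 80% on the validation set and 65% on the test set, demonstrating that it is possible to predict emotions in animals using autonomous intelligent systems. Such a system has multiple applications, including further studies in the growing field of animal emotions as well as in the veterinary field to determine the physical welfare of horses or other livestock.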


Video-based estimation of pain indicators in dogs

Hongyi Zhu , Yasemin Salgırlı , Pınar Can , Durmuş Atılgan , Albert Ali Salah

Category: Computer Vision

2022-09-27

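Dog owners are typically capable of recognizing behavioral cues that reveal subjective states of their dogs, such as pain. But automatic recognition of the pain state is very challenging. This paper proposes a novel video-based, two-stream deep neural network approach for this problem. We extract and preprocess body keypoints, and compute features from both keypoints and the RGB representation over the video. We propose an approach to deal with self-occlusions and missing keypoints. We also present a unique video-based dog behavior dataset, collected by veterinary professionals and annotated for the presence of pain, and report good classification results with the proposed approach. This study is one of the first works on machine learning based estimation of dog pain state.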


A Multimodal Approach for Automatic Mania Assessment in Bipolar Disorder

Pınar Baki

Category: Natural Language Processing | Machine Learning

2021-12-17

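Bipolar disorder is a mental health disorder that causes mood swings ranging from depression to mania. Diagnosis of bipolar disorder is usually based on patient interviews and reports obtained from the patients' caregivers. Consequently, the diagnosis depends on the experience of the expert, and the disorder can be confused with other mental disorders. Automated processes in the diagnosis of bipolar disorder can help provide quantitative indicators and allow easier observation of patients over longer periods. Furthermore, the need for remote treatment and diagnosis became especially important during the COVID-19 pandemic. In this thesis, we create a multimodal decision system based on recordings of the patient in acoustic, linguistic, and visual modalities. The system is trained on the Bipolar Disorder corpus. Comprehensive analysis of unimodal and multimodal systems, as well as various fusion techniques, is performed. Besides processing entire patient sessions using unimodal features, a task-level investigation of the clips is studied. Using acoustic, linguistic, and visual features in a multimodal fusion system, we achieved a 64.8% unweighted average recall score, which improves the state-of-the-art performance achieved on this dataset.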
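
The result above is reported as unweighted average recall (UAR), which is commonly defined as the mean of the per-class recalls, so that small classes count as much as large ones. The following is a minimal sketch of that computation; the function name and the integer labels are hypothetical illustrations and are not taken from the thesis or its corpus.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Return the mean of per-class recalls (each class weighted equally)."""
    correct = defaultdict(int)  # correctly predicted samples per true class
    total = defaultdict(int)    # number of samples per true class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Hypothetical, imbalanced toy labels for three classes (0, 1, 2).
y_true = [0, 0, 0, 0, 1, 2]
y_pred = [0, 0, 0, 1, 1, 0]

# Per-class recalls are 3/4, 1/1, and 0/1, so UAR is their plain average.
uar = unweighted_average_recall(y_true, y_pred)
print(f"UAR = {uar:.3f}")
```

On imbalanced data this differs from plain accuracy: in the toy labels, four of six predictions are correct, yet the class that is never recognized pulls the UAR down.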


Dimensional Modeling of Emotions in Text with Appraisal Theories: Corpus Creation, Annotation Reliability, and Prediction

Enrica Troiano , Laura Oberländer , Roman Klinger

Category: Natural Language Processing

2022-06-10

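The most prominent tasks in emotion analysis are to assign emotions to texts and to understand how emotions manifest in language. An important observation for natural language processing is that emotions can be communicated implicitly by referring to events alone, even without explicitly mentioning an emotion name. In psychology, the class of emotion theories known as appraisal theories aims at explaining the link between events and emotions. Appraisals can be formalized as variables that measure a cognitive evaluation by people living through an event that they consider relevant. They include the assessment of whether an event is novel, whether the person considers themselves to be responsible, whether it is in line with their own goals, and many others. Such appraisals explain which emotions are developed based on an event, e.g., that a novel situation can induce surprise, or one with uncertain consequences could evoke fear. We analyze the suitability of appraisal theories for emotion analysis in text, with the goal of understanding whether appraisal concepts can reliably be reconstructed by annotators, whether they can be predicted by text classifiers, and whether appraisal concepts help to identify emotion categories. To achieve that, we compile a corpus by asking people to textually describe events that triggered particular emotions and to disclose their appraisals. Then, we ask readers to reconstruct emotions and appraisals from the text. This setup allows us to measure whether emotions and appraisals can be recovered purely from text and provides a human baseline to judge a model's performance measures. Our comparison of text classification methods to human annotators shows that both can reliably detect emotions and appraisals with similar performance. We further show that appraisal concepts improve the categorization of emotions in text.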


A Survey on Computer Vision based Human Analysis in the COVID-19 Era

Fevziye Irem Eyiokur , Alperen Kantarcı , Mustafa Ekrem Erakın , Naser Damer , Ferda Ofli , Muhammad Imran , Janez Križaj , Albert Ali Salah , Alexander Waibel , Vitomir Štruc

Category: Computer Vision

2022-11-07

The emergence of COVID-19 has had a global and profound impact, not only on society as a whole, but also on the lives of individuals. Various prevention measures were introduced around the world to limit the transmission of the disease, including face masks, mandates for social distancing and regular disinfection in public spaces, and the use of screening applications. These developments also triggered the need for novel and improved computer vision techniques capable of (i) providing support to the prevention measures through an automated analysis of visual data, on the one hand, and (ii) facilitating normal operation of existing vision-based services, such as biometric authentication schemes, on the other. Especially important here are computer vision techniques that focus on the analysis of people and faces in visual data and have been affected the most by the partial occlusions introduced by the mandates for facial masks. Such computer vision based human analysis techniques include face and face-mask detection approaches, face recognition techniques, crowd counting solutions, age and expression estimation procedures, models for detecting face-hand interactions and many others, and have seen considerable attention over recent years. The goal of this survey is to provide an introduction to the problems induced by COVID-19 into such research and to present a comprehensive review of the work done in the computer vision based human analysis field. Particular attention is paid to the impact of facial masks on the performance of various methods and recent solutions to mitigate this problem. Additionally, a detailed review of existing datasets useful for the development and evaluation of methods for COVID-19 related applications is also provided. Finally, to help advance the field further, a discussion on the main open challenges and future research directions is given.


Synthetic Data in Human Analysis: A Survey

Indu Joshi , Marcel Grimmer , Christian Rathgeb , Christoph Busch , Francois Bremond , Antitza Dantcheva

Category: Computer Vision

2022-08-19

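Deep neural networks have become prevalent in human analysis, boosting the performance of applications such as biometric recognition, action recognition, and person re-identification. However, the performance of such networks scales with the available training data. In human analysis, the demand for large-scale datasets poses a severe challenge, as data collection is tedious, time-consuming, costly, and must comply with data protection laws. Current research investigates the generation of synthetic data as an efficient and privacy-ensuring alternative to collecting real data in the field. This survey introduces the basic definitions and methodologies essential when generating and employing synthetic data for human analysis. We conduct a survey that summarizes current state-of-the-art methods and the main benefits of using synthetic data. We also provide an overview of publicly available synthetic datasets and generation models. Finally, we discuss limitations, as well as open research problems in this field. This survey is intended for researchers and practitioners in the field of human analysis.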


Affect-driven Ordinal Engagement Measurement from Video

Ali Abedi , Shehroz Khan

Category: Computer Vision

2021-06-21

In education and intervention programs, user engagement has been identified as a major factor in successful program completion. Automatic measurement of user engagement provides helpful information for instructors to meet program objectives and individualize program delivery. In this paper, we present a novel approach for video-based engagement measurement in virtual learning programs. We propose to use affect states, continuous values of valence and arousal extracted from consecutive video frames, along with a new latent affective feature vector and behavioral features for engagement measurement. Deep-learning sequential models are trained and validated on the extracted frame-level features. In addition, due to the fact that engagement is an ordinal variable, we develop the ordinal versions of the above models in order to address the problem of engagement measurement as an ordinal classification problem. We evaluated the performance of the proposed method on the only two publicly available video engagement measurement datasets, DAiSEE and EmotiW-EW, containing videos of students in online learning programs. Our experiments show a state-of-the-art engagement level classification accuracy of 67.4% on the DAiSEE dataset, and a regression mean squared error of 0.0508 on the EmotiW-EW dataset. Our ablation study shows the effectiveness of incorporating affect states and ordinality of engagement in engagement measurement.
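
The abstract above treats engagement level as an ordinal variable. One common way to handle ordinal labels, shown here only as an illustrative sketch and not as the authors' implementation, is to encode level k out of K as K-1 binary "greater than" targets and to decode a prediction by counting how many of those thresholds are exceeded. The function names and the 0.5 threshold below are assumptions made for this example.

```python
def ordinal_encode(level, num_levels):
    """Encode an ordinal level as num_levels - 1 binary targets.

    Target t answers the question "is level > t?".
    """
    return [1 if level > t else 0 for t in range(num_levels - 1)]


def ordinal_decode(probs, threshold=0.5):
    """Decode by counting the 'greater than' probabilities above threshold."""
    return sum(1 for p in probs if p > threshold)


# Hypothetical setup with four ordered levels, indexed 0 to 3.
NUM_LEVELS = 4
for level in range(NUM_LEVELS):
    print(level, ordinal_encode(level, NUM_LEVELS))

# Hypothetical model outputs for the three binary questions.
print(ordinal_decode([0.9, 0.6, 0.2]))
```

Compared with a plain one-hot encoding, this representation makes a prediction that is off by one level differ from the target in fewer positions than one that is off by three levels, which matches the ordered nature of the labels.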
