Solute transport in porous media is relevant to a wide range of applications in hydrogeology, geothermal energy, underground CO2 storage, and a variety of chemical engineering systems. Due to the complexity of solute transport in heterogeneous porous media, traditional solvers require high-resolution meshing and are therefore computationally expensive. This study explores the application of a mesh-free method based on deep learning to accelerate the simulation of solute transport. We employ Physics-informed Neural Networks (PiNNs) to solve solute transport problems in homogeneous and heterogeneous porous media governed by the advection-dispersion equation. Unlike traditional neural networks that learn from large training datasets, PiNNs leverage only the strong-form mathematical model to simultaneously solve for multiple dependent or independent field variables (e.g., pressure and solute concentration fields). In this study, we construct PiNNs using a periodic activation function to better represent the complex physical signals (i.e., pressure) and their derivatives (i.e., velocity). Several case studies are designed to investigate the proposed PiNN's capability to handle different degrees of complexity. A manual hyperparameter tuning method is used to find the best PiNN architecture for each test case. Point-wise error and mean squared error (MSE) measures are employed to assess the performance of the PiNNs' predictions against the ground-truth solutions obtained analytically or numerically using the finite element method. Our findings show that the predictions of the PiNNs are in good agreement with the ground-truth solutions while reducing computational complexity and cost by at least three orders of magnitude.
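To make the idea concrete, below is a minimal PyTorch sketch of a PiNN with periodic (sine) activations whose loss is the residual of the 1D advection-dispersion equation c_t + v c_x = D c_xx. The network width, frequency scaling, velocity v, and dispersion coefficient D are illustrative assumptions, not the paper's configuration.

```python
# Minimal sketch (not the authors' code) of a sine-activated PiNN for the
# 1D advection-dispersion equation: c_t + v*c_x - D*c_xx = 0.
import torch

class SirenPiNN(torch.nn.Module):
    def __init__(self, width=64, omega=30.0):
        super().__init__()
        self.l1 = torch.nn.Linear(2, width)    # inputs: (x, t)
        self.l2 = torch.nn.Linear(width, width)
        self.l3 = torch.nn.Linear(width, 1)    # output: concentration c
        self.omega = omega                     # SIREN-style frequency scaling

    def forward(self, x, t):
        h = torch.cat([x, t], dim=1)
        h = torch.sin(self.omega * self.l1(h))
        h = torch.sin(self.omega * self.l2(h))
        return self.l3(h)

def ade_residual(model, x, t, v=1.0, D=0.1):
    """PDE residual c_t + v*c_x - D*c_xx at collocation points (x, t)."""
    x.requires_grad_(True); t.requires_grad_(True)
    c = model(x, t)
    grad = lambda y, z: torch.autograd.grad(y, z, torch.ones_like(y),
                                            create_graph=True)[0]
    c_t, c_x = grad(c, t), grad(c, x)
    c_xx = grad(c_x, x)
    return c_t + v * c_x - D * c_xx

model = SirenPiNN()
x, t = torch.rand(256, 1), torch.rand(256, 1)   # random collocation points
loss = ade_residual(model, x, t).pow(2).mean()  # MSE of the PDE residual
```

In a full training loop this residual loss would be combined with boundary- and initial-condition terms and minimized with a standard optimizer.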
The emergence of COVID-19 has had a global and profound impact, not only on society as a whole but also on the lives of individuals. Various prevention measures were introduced around the world to limit the transmission of the disease, including face masks, mandates for social distancing and regular disinfection in public spaces, and the use of screening applications. These developments also triggered the need for novel and improved computer vision techniques capable of (i) providing support to the prevention measures through automated analysis of visual data, on the one hand, and (ii) facilitating the normal operation of existing vision-based services, such as biometric authentication schemes, on the other. Especially important here are computer vision techniques that focus on the analysis of people and faces in visual data, which have been affected the most by the partial occlusions introduced by the mandates for face masks. Such computer vision-based human analysis techniques include face and face-mask detection approaches, face recognition techniques, crowd counting solutions, age and expression estimation procedures, models for detecting face-hand interactions, and many others, and have seen considerable attention in recent years. The goal of this survey is to provide an introduction to the problems induced by COVID-19 into such research and to present a comprehensive review of the work done in the computer vision-based human analysis field. Particular attention is paid to the impact of facial masks on the performance of various methods and to recent solutions that mitigate this problem. Additionally, a detailed review of existing datasets useful for the development and evaluation of methods for COVID-19-related applications is provided. Finally, to help advance the field further, a discussion of the main open challenges and future research directions is given.
This study is the second phase in a series of investigations on Optical Character Recognition (OCR) for Arabic historical documents and examines how different modeling procedures interact with the problem. The first study investigated the effect of Transformers on our custom-built Arabic dataset. One of the drawbacks of the first study was the size of the training data, only 15,000 images out of our 30 million, due to a lack of resources. In addition, we add an image enhancement layer, time and space optimizations, and a post-correction layer to help the model predict the correct context. Notably, we propose an end-to-end text recognition approach that uses a Vision Transformer as an encoder, namely BEiT, and a vanilla Transformer as a decoder, eliminating CNNs for feature extraction and reducing the model's complexity. The experiments show that our end-to-end model outperforms the convolutional backbones. The model attains a CER of 4.46%.
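One plausible way to instantiate a BEiT-encoder / Transformer-decoder recognizer is with the Hugging Face transformers library, sketched below. The checkpoint names, processor, and decoder choice (a BERT-style stand-in for the paper's vanilla Transformer) are assumptions for illustration, not the authors' exact setup.

```python
# Hedged sketch: Vision Transformer (BEiT) encoder + Transformer decoder for
# line-level text recognition, with no CNN feature extractor.
from transformers import VisionEncoderDecoderModel, AutoImageProcessor, AutoTokenizer

model = VisionEncoderDecoderModel.from_encoder_decoder_pretrained(
    "microsoft/beit-base-patch16-224",   # BEiT image encoder (assumed checkpoint)
    "bert-base-multilingual-cased",      # stand-in vanilla Transformer decoder
)
processor = AutoImageProcessor.from_pretrained("microsoft/beit-base-patch16-224")
tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")

# Standard encoder-decoder bookkeeping for generation and loss masking.
model.config.decoder_start_token_id = tokenizer.cls_token_id
model.config.pad_token_id = tokenizer.pad_token_id

# Training step (schematic): pixel_values come from the line image via the
# processor, labels are the token ids of the transcription.
#   outputs = model(pixel_values=pixel_values, labels=labels)
#   outputs.loss.backward()
```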
Ophthalmic images may contain identical-looking pathologies that can cause automated techniques to fail to distinguish between different retinal degenerative diseases. Additionally, reliance on large annotated datasets and a lack of knowledge distillation can restrict the deployment of ML-based clinical support systems in real-world environments. To improve the robustness and transferability of knowledge, an enhanced feature-learning module is required to extract meaningful spatial representations from the retinal subspace. Such a module, if used effectively, can detect unique disease traits and differentiate the severity of such retinal degenerative pathologies. In this work, we propose a robust disease detection architecture with three learning heads: i) a supervised encoder for retinal disease classification, ii) an unsupervised decoder for the reconstruction of disease-specific spatial information, and iii) a novel representation learning module for learning the similarity between encoder and decoder features and enhancing the accuracy of the model. Our experimental results on two publicly available OCT datasets illustrate that the proposed model outperforms existing state-of-the-art models in terms of accuracy, interpretability, and robustness for out-of-distribution retinal disease detection.
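A minimal PyTorch sketch of the three-head idea follows; it is an assumption-laden illustration, not the authors' release. The layer sizes, the 1x32x32 input resolution, and the cosine-similarity objective for the representation head are all hypothetical choices.

```python
# Sketch: shared encoder with (i) a supervised classification head, (ii) an
# unsupervised reconstruction decoder, (iii) a representation/similarity head.
import torch, torch.nn as nn, torch.nn.functional as F

class ThreeHeadNet(nn.Module):
    def __init__(self, n_classes=4, z_dim=128):
        super().__init__()
        self.encoder = nn.Sequential(                  # shared feature extractor
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(), nn.Linear(64 * 16, z_dim))
        self.classifier = nn.Linear(z_dim, n_classes)  # head i: supervised
        self.decoder = nn.Sequential(                  # head ii: reconstruction
            nn.Linear(z_dim, 64 * 16), nn.Unflatten(1, (64, 4, 4)),
            nn.Upsample(scale_factor=8), nn.Conv2d(64, 1, 3, padding=1))
        self.projector = nn.Sequential(                # head iii: representation
            nn.Linear(z_dim, z_dim), nn.ReLU(), nn.Linear(z_dim, z_dim))

    def forward(self, x):                              # x: (B, 1, 32, 32) assumed
        z = self.encoder(x)
        return self.classifier(z), self.decoder(z), self.projector(z), z

def total_loss(model, x, y):
    logits, recon, proj, z = model(x)
    return (F.cross_entropy(logits, y)                 # classification term
            + F.mse_loss(recon, x)                     # reconstruction term
            + 1.0 - F.cosine_similarity(proj, z.detach(), dim=1).mean())
```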
Space exploration has witnessed the Perseverance rover landing on the Martian surface and the Ingenuity helicopter demonstrating the first flight beyond Earth. During their mission on Mars, Perseverance and Ingenuity collaborate to explore the Martian surface, with Ingenuity scouting terrain information for the rover's safe traversal. Hence, determining the relative poses between the two platforms is of paramount importance for the success of this mission. Driven by this necessity, this work proposes a robust relative localization system based on the fusion of neuromorphic vision-based measurements (NVBMs) and inertial measurements. The emergence of neuromorphic vision has triggered a paradigm shift in the computer vision community, due to its unique working principle delineated by asynchronous events triggered by variations of light intensity occurring in the scene. This implies that observations cannot be acquired in static scenes due to illumination invariance. To circumvent this limitation, high-frequency active landmarks are inserted in the scene to guarantee consistent event firing. These landmarks are adopted as salient features to facilitate relative localization. A novel event-based landmark identification algorithm using Gaussian Mixture Models (GMMs) is developed for matching the landmark correspondences in our NVBMs. The NVBMs are fused with inertial measurements in the proposed state estimators, a landmark tracking Kalman filter (LTKF) and a translation decoupled Kalman filter (TDKF), for landmark tracking and relative localization, respectively. The system was tested in a variety of experiments and outperforms state-of-the-art approaches in terms of accuracy and range.
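The GMM step can be illustrated with a short sketch: cluster the pixel coordinates of recent events so that each Gaussian component tracks one blinking landmark, with the component means serving as landmark observations for the downstream Kalman filters. This is a hedged illustration, not the flight code; the synthetic event data and component count are assumptions.

```python
# Sketch: recover active-landmark positions from event coordinates via a GMM.
import numpy as np
from sklearn.mixture import GaussianMixture

def landmarks_from_events(event_xy: np.ndarray, n_landmarks: int = 4):
    """event_xy: (N, 2) pixel coordinates of events in a short time window."""
    gmm = GaussianMixture(n_components=n_landmarks, covariance_type="full")
    gmm.fit(event_xy)
    return gmm.means_, gmm.covariances_   # landmark centers + uncertainty

# Example with synthetic events scattered around four landmark positions.
rng = np.random.default_rng(0)
truth = np.array([[40, 40], [40, 200], [200, 40], [200, 200]], dtype=float)
events = np.vstack([c + rng.normal(0, 3, size=(500, 2)) for c in truth])
centers, covs = landmarks_from_events(events)
```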
Advances in animal motion tracking and pose recognition have been a game changer in the study of animal behavior. Recently, an increasing number of works go 'deeper' than tracking and address the automated recognition of animals' internal states, such as emotions and pain, with the aim of improving animal welfare, making this a timely moment for a systematization of the field. This paper provides a comprehensive survey of computer vision-based research on the recognition of affective states and pain in animals, addressing both facial and bodily behavior analysis. We summarize the efforts made so far in this topic, classifying them across different dimensions, highlight challenges and research gaps, and provide best-practice recommendations for advancing the field, as well as some future directions for research.
The generation of feasible adversarial examples is necessary for the proper assessment of models used in constrained feature spaces. However, it remains a challenging task to enforce constraints in attacks designed for computer vision. We propose a unified framework to generate feasible adversarial examples that satisfy given domain constraints. Our framework supports the use cases reported in the literature and can handle both linear and non-linear constraints. We instantiate our framework into two algorithms: a gradient-based attack that introduces the constraints into the loss function to maximize, and a multi-objective search algorithm that aims for misclassification, perturbation minimization, and constraint satisfaction. We show that our approach is effective on two datasets from different domains, with a success rate of up to 100%, where state-of-the-art attacks fail to generate a single feasible example. In addition to adversarial retraining, we propose introducing engineered non-convex constraints to improve a model's adversarial robustness. We demonstrate that this new defense is as effective as adversarial retraining. Our framework constitutes a starting point for research on constrained adversarial attacks, and provides relevant baselines and datasets that future research can leverage.
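A minimal sketch of the gradient-based variant is given below: a penalty for violated domain constraints is added to the misclassification loss, so gradient steps move toward inputs that are both adversarial and feasible. The model, the constraint function g(x) <= 0, the step size, and the penalty weight are illustrative assumptions, and the perturbation-minimization objective is omitted for brevity.

```python
# Sketch: gradient attack with a hinge penalty on domain constraints g(x) <= 0.
import torch

def constrained_attack(model, x, y, g, steps=100, lr=0.01, lam=10.0):
    """g(x) returns a tensor of constraint values; x is feasible iff g(x) <= 0."""
    x_adv = x.clone().requires_grad_(True)
    opt = torch.optim.Adam([x_adv], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        misclf = -torch.nn.functional.cross_entropy(model(x_adv), y)  # maximize CE
        penalty = torch.relu(g(x_adv)).sum()       # zero when constraints hold
        (misclf + lam * penalty).backward()
        opt.step()
    return x_adv.detach()
```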
Current authentication and trusted systems depend on classical and biometric methods, such as audio speech recognition and eye and finger signatures, to recognize or authorize users. Recent tools utilize deep learning and transformers to achieve better results. In this paper, we develop a deep learning model for Arabic speaker identification using the Wav2Vec2.0 and HuBERT audio representation learning tools. The end-to-end Wav2Vec2.0 paradigm acquires contextualized speech representations by randomly masking a set of feature vectors and then applying a Transformer neural network. We employ an MLP classifier that is able to differentiate between the labeled speaker classes. We present several experimental results that confirm the high accuracy of the proposed models. The experiments show that an arbitrary wave signal from a given speaker can be identified with 98% and 97.1% accuracy in the case of Wav2Vec2.0 and HuBERT, respectively.
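A hedged sketch of such a pipeline, using the Hugging Face transformers library, is shown below: frozen Wav2Vec2.0 representations are mean-pooled over time and fed to an MLP speaker classifier. The checkpoint name, pooling strategy, MLP sizes, and number of speakers are assumptions, not the paper's exact choices.

```python
# Sketch: Wav2Vec2.0 features + MLP head for speaker identification.
import torch
from transformers import Wav2Vec2Model, Wav2Vec2FeatureExtractor

extractor = Wav2Vec2FeatureExtractor.from_pretrained("facebook/wav2vec2-base")
backbone = Wav2Vec2Model.from_pretrained("facebook/wav2vec2-base").eval()

mlp = torch.nn.Sequential(                      # speaker classification head
    torch.nn.Linear(768, 256), torch.nn.ReLU(),
    torch.nn.Linear(256, 10))                   # 10 speakers, for illustration

waveform = torch.randn(16000)                   # 1 s of 16 kHz audio (dummy)
inputs = extractor(waveform.numpy(), sampling_rate=16000, return_tensors="pt")
with torch.no_grad():
    frames = backbone(**inputs).last_hidden_state   # (1, T, 768) frame features
logits = mlp(frames.mean(dim=1))                # mean-pool over time, classify
```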
We introduce the largest dataset for Arabic speech mispronunciation detection in the Egyptian dialect. The dataset consists of annotated audio files representing the 100 words most frequently used in the Arabic language, pronounced by 100 Egyptian children aged between 2 and 8 years old. The dataset was collected and annotated for segment-level mispronunciation detection by expert listeners.
Variational inference uses optimization, rather than integration, to approximate the marginal likelihood, and thereby the posterior, in a Bayesian model. Thanks to advances in computational scalability made in the last decade, variational inference is now the preferred choice for many high-dimensional models and large datasets. This tutorial introduces variational inference from the parametric perspective that dominates these recent developments, in contrast to the mean-field perspective commonly found in other introductory texts.
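The parametric perspective can be illustrated in a few lines of code: fit a Gaussian q(z) = N(mu, sigma^2) to an unnormalized log density by maximizing the ELBO with the reparameterization trick. The sketch below uses PyTorch with an arbitrary illustrative target density; it is a toy demonstration of the technique, not an excerpt from the tutorial.

```python
# Sketch: parametric VI via stochastic ELBO maximization (reparameterization).
import torch

log_p = lambda z: -0.5 * ((z - 2.0) / 0.5) ** 2           # unnormalized target
mu = torch.zeros(1, requires_grad=True)                    # variational params
log_sigma = torch.zeros(1, requires_grad=True)
opt = torch.optim.Adam([mu, log_sigma], lr=0.05)

for _ in range(2000):
    opt.zero_grad()
    eps = torch.randn(128, 1)                              # reparameterization:
    z = mu + eps * log_sigma.exp()                         # z = mu + sigma * eps
    # ELBO = E_q[log p(z)] + entropy of q (Gaussian entropy = log sigma + const)
    elbo = log_p(z).mean() + log_sigma.sum()
    (-elbo).backward()                                     # maximize the ELBO
    opt.step()
# mu approaches 2.0 and sigma approaches 0.5 as the ELBO is maximized.
```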