Smart Paper Notes

DIY Graphics Tab: A Cost-Effective Alternative to Graphics Tablet for Educators

Mohammad Imrul Jubair , Arafat Ibne Yousuf , Tashfiq Ahmed , Hasanath Jamy , Foisal Reza , Mohsena Ashraf

Category: Computer Vision

2021-12-05

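Every day, more people turn to online learning, which has changed the traditional classroom. Recording lectures has always been a routine task for online educators, and it has recently become even more important during the pandemic because in-person classes are still postponed in several countries. When recording lectures, a graphics tablet is a good substitute for a whiteboard because it is portable and can interface with a computer. However, such graphics tablets are too expensive for most instructors. In this paper, we propose a computer vision-based alternative to the graphics tablet for instructors and educators. It works largely in the same way as a graphics tablet but requires only a pen, paper, and a laptop's webcam. We call it the "Do-It-Yourself Graphics Tab" or "DIY Graphics Tab". Our system takes as input a sequence of camera images of a person writing on paper and outputs a screen containing the content written on the paper. The task is not straightforward because of many obstacles, such as occlusion by the person's hand, random movement of the paper, poor lighting conditions, and perspective distortion caused by the viewing angle. A pipeline routes the input recording through our system, which performs instance segmentation and preprocessing before generating the appropriate output. We also conducted user experience evaluations with teachers and students, and their responses are examined in this paper.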


Altering Facial Expression Based on Textual Emotion

Mohammad Imrul Jubair , Md. Masud Rana , Md. Amir Hamza , Mohsena Ashraf , Fahim Ahsan Khan , Ahnaf Tahseen Prince

Category: Computer Vision

2021-12-02

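Faces and their expressions are among the most compelling subjects for digital images. Detecting emotion from an image is a long-standing task in computer vision; the reverse task, synthesizing facial expressions in images, is quite new. Regenerating an image with a different facial expression, or altering an existing expression in an image, calls for a generative adversarial network (GAN). In this paper, we aim to change the facial expression in an image using a GAN: an input image with an initial expression (e.g., happy) is altered to a different expression (e.g., disgusted) of the same person. We used the StarGAN technique on a modified version of the MUG dataset to accomplish this. We further extend our work by reshaping the facial expression in an image according to the emotion indicated by a given text. To this end, we applied a long short-term memory (LSTM) method to extract the emotion from the text and forward it to our expression-altering module. As a demonstration of our working pipeline, we also created an application prototype of a blog that regenerates the profile picture with different expressions based on the emotion of the user's text.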


Biomedical image analysis competitions: The state of current participation practice

Matthias Eisenmann , Annika Reinke , Vivienn Weru , Minu Dietlinde Tizabi , Fabian Isensee , Tim J. Adler , Patrick Godau , Veronika Cheplygina , Michal Kozubek , Sharib Ali

Categories: Computer Vision | Machine Learning

2022-12-16

The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% of challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants (32%) stated that they did not have enough time for method development. 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants, and only 50% of the participants performed ensembling, based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
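
For readers unfamiliar with the K-fold cross-validation mentioned in the abstract above, the following is a minimal sketch of how a training set can be split into K train/validation folds. The function name `k_fold_indices` and its parameters are illustrative assumptions, not something taken from the survey.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train, val) index lists for K-fold cross-validation.

    Hypothetical helper for illustration only; the survey does not
    prescribe any particular splitting procedure.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)        # reproducible shuffle
    folds = [indices[i::k] for i in range(k)]   # k near-equal folds
    for i in range(k):
        val = folds[i]
        # Every fold except the i-th one goes into the training split.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, val


splits = list(k_fold_indices(n_samples=10, k=5))
```

Each sample appears in exactly one validation fold, so averaging the K validation scores gives an estimate that uses the whole training set.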


A Dependable Hybrid Machine Learning Model for Network Intrusion Detection

Md. Alamin Talukder , Khondokar Fida Hasan , Md. Manowarul Islam , Md Ashraf Uddin , Arnisha Akhter , Mohammand Abu Yousuf , Fares Alharbi , Mohammad Ali Moni

Category: Machine Learning

2022-12-08

Network intrusion detection systems (NIDSs) play an important role in computer network security. There are several detection mechanisms where anomaly-based automated detection outperforms others significantly. Amid the sophistication and growing number of attacks, dealing with large amounts of data is a recognized issue in the development of anomaly-based NIDS. However, do current models meet the needs of today's networks in terms of required accuracy and dependability? In this research, we propose a new hybrid model that combines machine learning and deep learning to increase detection rates while securing dependability. Our proposed method ensures efficient pre-processing by combining SMOTE for data balancing and XGBoost for feature selection. We compared our developed method to various machine learning and deep learning algorithms to find a more efficient algorithm to implement in the pipeline. Furthermore, we chose the most effective model for network intrusion based on a set of benchmarked performance analysis criteria. Our method produces excellent results when tested on two datasets, KDDCUP'99 and CIC-MalMem-2022, with an accuracy of 99.99% and 100% for KDDCUP'99 and CIC-MalMem-2022, respectively, and no overfitting or Type-1 and Type-2 issues.
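
The abstract above names SMOTE as its data-balancing step. As a rough illustration of the idea behind SMOTE (interpolating between a minority-class sample and one of its nearest minority-class neighbours), here is a minimal pure-Python sketch. The function `smote_sketch`, its parameters, and the toy data are assumptions made for this example; they are not the authors' pipeline, which would more plausibly rely on an existing library implementation.

```python
import math
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points from a list of minority-class points.

    Illustrative only: each synthetic point lies on the line segment
    between a randomly chosen minority sample and one of its k nearest
    minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base_idx = rng.randrange(len(minority))
        base = minority[base_idx]
        # Rank the other minority samples by Euclidean distance to base.
        others = [p for i, p in enumerate(minority) if i != base_idx]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        u = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(b + u * (q - b) for b, q in zip(base, neighbour)))
    return synthetic


minority_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote_sketch(minority_points, n_new=5)
```

Because every synthetic point is a convex combination of two real minority samples, it stays inside the region those samples span.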


Consistency-Based Semi-supervised Evidential Active Learning for Diagnostic Radiograph Classification

Shafa Balaram , Cuong M. Nguyen , Ashraf Kassim , Pavitra Krishnaswamy

Categories: Computer Vision | Artificial Intelligence | Machine Learning

2022-09-05

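Deep learning approaches achieve state-of-the-art performance in classifying radiology images, but they rely on large labelled datasets that require resource-intensive annotation by specialists. Both semi-supervised learning and active learning can be used to reduce this annotation burden. However, there is limited work on combining the advantages of semi-supervised and active learning approaches for multi-label medical image classification. Here, we introduce a novel Consistency-based Semi-supervised Evidential Active Learning framework (CSEAL). Specifically, we leverage predictive uncertainty based on theories of evidence and subjective logic to develop an end-to-end integrated approach that combines consistency-based semi-supervised learning with uncertainty-based active learning. We apply our approach to enhance four consistency-based semi-supervised learning methods: Pseudo-labelling, Virtual Adversarial Training, Mean Teacher, and NoTeacher. Extensive evaluations on multi-label chest X-ray classification tasks show that CSEAL achieves substantial improvements over two leading semi-supervised active learning baselines. In addition, a class-wise breakdown of the results shows that our approach can substantially improve accuracy on rarer abnormalities with fewer labelled samples.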


An End-to-End OCR Framework for Robust Arabic-Handwriting Recognition using a Novel Transformers-based Model and an Innovative 270 Million-Words Multi-Font Corpus of Classical Arabic with Diacritics

Aly Mostafa , Omar Mohamed , Ali Ashraf , Ahmed Elbehery , Salma Jamal , Anas Salah , Amr S. Ghoneim

Categories: Computer Vision | Natural Language Processing | Machine Learning

2022-08-20

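This research is the second phase in a series of investigations on optical character recognition (OCR) of Arabic historical documents, and it examines how different modeling procedures interact with the problem. The first study examined the effect of Transformers on our custom-built Arabic dataset. One drawback of the first study was the size of the training data: owing to a lack of resources, only 15,000 of our 30 million images were used. In addition, we add an image enhancement layer, time and space optimization, and a post-correction layer to help the model predict the correct word for the context. Notably, we propose an end-to-end text recognition approach that uses a vision Transformer, namely BEIT, as the encoder and a vanilla Transformer as the decoder, eliminating CNNs for feature extraction and reducing the model's complexity. Experiments show that our end-to-end model outperforms convolutional backbones. The model attains a CER of 4.46%.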
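
The abstract above reports its result as a character error rate (CER). A minimal sketch of the usual definition, edit distance divided by reference length, is given below. The helper names are illustrative, and the choice to count Unicode code points (so that each Arabic diacritic counts as its own character) is an assumption of this sketch, not a detail stated in the abstract.

```python
def levenshtein(ref, hyp):
    """Minimum number of insertions, deletions, and substitutions
    needed to turn hyp into ref (classic dynamic programme)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate: edit distance over reference length."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return levenshtein(reference, hypothesis) / len(reference)


example_cer = cer("kitten", "sitting")  # 3 edits over 6 reference characters
```

Under this definition a CER of 4.46% corresponds to roughly 4 to 5 character edits per 100 reference characters.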


On the Elements of Datasets for Cyber Physical Systems Security

Ashraf Tantawy

Category: Artificial Intelligence

2022-08-17

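Datasets are essential for applying AI algorithms to cyber-physical system (CPS) security. Because real CPS datasets are scarce, researchers have chosen to generate their own datasets using real or virtualized testbeds. However, unlike other AI domains, a CPS is a complex system with many interfaces that determine its behavior. A dataset that comprises merely a collection of sensor measurements and network traffic may not be sufficient to develop resilient AI defensive or offensive agents. In this paper, we study the elements of CPS security datasets required to capture the system's behavior and interactions, and we propose a dataset architecture that has the potential to enhance the performance of AI algorithms in securing cyber-physical systems. The framework includes dataset elements, attack representation, and required dataset features. We compare existing datasets with the proposed architecture to identify current limitations, and we discuss the future of CPS dataset generation using testbeds.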


A Vision Transformer-Based Approach to Bearing Fault Classification via Vibration Signals

Abid Hasan Zim , Aeyan Ashraf , Aquib Iqbal , Asad Malik , Minoru Kuribayashi

Category: Computer Vision

2022-08-15

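Rolling bearings are the most critical components of rotating machinery. Identifying defective bearings in time may prevent the failure of an entire mechanical system. With the rapid advancement of machine parts, the field of mechanical condition monitoring has entered the big data phase. When working with large amounts of data, manual feature extraction is inefficient and inaccurate. In recent years, data-driven methods such as deep learning have been used successfully for intelligent mechanical fault detection. Earlier studies mostly used convolutional neural networks (CNNs) to detect and identify bearing faults. However, CNN models have difficulty handling the temporal information of faults, which leads to poor classification results. In this study, bearing defects are classified using a state-of-the-art Vision Transformer (ViT), based on laboratory experimental data from Case Western Reserve University (CWRU). The study considers 13 distinct types of defects under the 0-load condition in addition to the normal bearing condition. Using the short-time Fourier transform (STFT), the vibration signals are converted into 2D time-frequency images, which are then used as the input to the ViT. The model achieves an overall accuracy of 98.8%.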
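
The abstract above describes converting 1D vibration signals into 2D time-frequency images with the STFT. A minimal pure-Python sketch of that transformation follows. The frame length, hop size, Hann window, and function name are assumptions chosen for illustration (the abstract gives no parameters), and a real pipeline would normally use an FFT-based library routine instead of this direct DFT.

```python
import cmath
import math


def stft_magnitude(signal, frame_len=64, hop=32):
    """Return a frames x frequency-bins magnitude matrix (a spectrogram).

    Illustrative sketch: Hann-windowed frames, direct DFT per frame,
    keeping only the non-negative frequency bins.
    """
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    n_bins = frame_len // 2 + 1
    spectrogram = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        row = []
        for k in range(n_bins):
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            row.append(abs(acc))
        spectrogram.append(row)
    return spectrogram


# Toy "vibration" signal: a pure tone that falls exactly on frequency bin 8.
tone = [math.sin(2 * math.pi * 8 * n / 64) for n in range(256)]
spec = stft_magnitude(tone)
```

Each row of `spec` is one time step and each column one frequency bin, so the matrix can be rendered as a grayscale image and passed to an image classifier.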


Solving the Traveling Salesperson Problem with Precedence Constraints by Deep Reinforcement Learning

Christian Löwens , Muhammad Inaam Ashraf , Alexander Gembus , Genesis Cuizon , Jonas K. Falkner , Lars Schmidt-Thieme

Category: Machine Learning

2022-07-04

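This work presents solutions to the traveling salesperson problem with precedence constraints (TSPPC) using deep reinforcement learning (DRL), by adapting recent approaches that work well for the regular TSP. What these approaches have in common is the use of graph models based on multi-head attention (MHA) layers. One idea for solving the pickup and delivery problem (PDP) is to use heterogeneous attention to embed the different roles each node can take. In this work, we generalize this concept of heterogeneous attention to the TSPPC. Furthermore, we adapt recent ideas for sparsifying attention to achieve better scalability. Overall, we contribute to the research community through the application and evaluation of recent DRL methods for solving the TSPPC.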
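
To make the precedence constraints in the abstract above concrete, the sketch below builds a tour step by step while masking out nodes whose predecessors have not yet been visited. The greedy nearest-neighbour choice is only a stand-in for a learned policy and is not the authors' attention-based model; all names here are hypothetical.

```python
import math


def feasible_nodes(visited, predecessors, n):
    """Unvisited nodes whose required predecessors are all visited."""
    return [v for v in range(n)
            if v not in visited and predecessors.get(v, set()) <= visited]


def greedy_tour(coords, predecessors, start=0):
    """Nearest-neighbour tour that respects precedence constraints.

    Assumes the start node has no predecessors. A DRL decoder would
    replace the min() below with a sample from its policy over the
    same masked candidate set.
    """
    n = len(coords)
    tour, visited = [start], {start}
    while len(tour) < n:
        candidates = feasible_nodes(visited, predecessors, n)
        if not candidates:
            raise ValueError("precedence constraints cannot be satisfied")
        current = coords[tour[-1]]
        nxt = min(candidates, key=lambda v: math.dist(current, coords[v]))
        tour.append(nxt)
        visited.add(nxt)
    return tour


coords = [(0, 0), (1, 0), (2, 0), (3, 0)]
precedence = {1: {3}}          # node 3 must be visited before node 1
tour = greedy_tour(coords, precedence)
```

In this toy instance node 1 is the nearest to the start but is masked until node 3 has been visited, so the tour is forced to detour.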


Novel Hybrid DNN Approaches for Speaker Verification in Emotional and Stressful Talking Environments

Ismail Shahin , Ali Bou Nassif , Nawel Nemmour , Ashraf Elnagar , Adi Alhudhaif , Kemal Polat

Category: Machine Learning

2021-12-26

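In this work, we conducted an empirical comparative study of text-independent speaker verification performance in emotional and stressful environments. This work combines deep models with shallow architectures, resulting in novel hybrid classifiers. Four distinct hybrid models were used: deep neural network-hidden Markov model (DNN-HMM), deep neural network-Gaussian mixture model (DNN-GMM), Gaussian mixture model-deep neural network (GMM-DNN), and hidden Markov model-deep neural network (HMM-DNN). All models are based on novel implemented architectures. The comparative study used three distinct speech datasets: a private Arabic dataset and two public English databases, namely Speech Under Simulated and Actual Stress (SUSAS) and the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS). The test results of these hybrid models show that the proposed HMM-DNN improves verification performance in emotional and stressful environments. The results also show that HMM-DNN outperforms all other hybrid models in terms of the equal error rate (EER) and area under the curve (AUC) evaluation metrics. Averaged over the three datasets, the verification systems based on HMM-DNN, DNN-HMM, DNN-GMM, and GMM-DNN yield EERs of 7.19%, 16.85%, 11.51%, and 11.90%, respectively. Furthermore, we found that the DNN-GMM model has the lowest computational complexity of all the hybrid models in both talking environments. Conversely, the HMM-DNN model requires the most training time. The findings also show that EER and AUC values depend on the database when comparing average emotional and stressful performance.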
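
The abstract above compares models by equal error rate (EER). As a minimal sketch of how an EER can be estimated from verification scores, the code below sweeps a decision threshold and picks the point where the false acceptance rate and false rejection rate are closest. The function name, the convention that higher scores mean "same speaker", and the toy scores are assumptions for illustration; they are not data from the paper.

```python
def equal_error_rate(genuine, impostor):
    """Estimate the EER from genuine and impostor trial scores.

    Returns (eer, threshold). A trial is accepted when its score is
    greater than or equal to the threshold.
    """
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        far = sum(s >= t for s in impostor) / len(impostor)  # false accepts
        frr = sum(s < t for s in genuine) / len(genuine)     # false rejects
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Toy scores, made up for this example.
genuine_scores = [0.9, 0.8, 0.7, 0.4]
impostor_scores = [0.1, 0.2, 0.3, 0.6]
eer, threshold = equal_error_rate(genuine_scores, impostor_scores)
```

With only a handful of trials the two error rates may never cross exactly, which is why this sketch averages them at the closest threshold.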
