Smart Paper Notes

Towards an Efficient Voice Identification Using Wav2Vec2.0 and HuBERT Based on the Quran Reciters Dataset

Aly Moustafa, Salah A. Aly

Categories: Artificial Intelligence | Natural Language Processing | Machine Learning

2021-11-11

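Current authentication and trusted systems rely on classical and biometric methods to recognize or authorize users. These methods include audio speech recognition and eye and finger signatures. Recent tools use deep learning and transformers to achieve better results. In this paper, we develop a deep learning model for Arabic speaker identification using the Wav2Vec2.0 and HuBERT audio representation learning tools. The end-to-end Wav2Vec2.0 paradigm learns contextualized speech representations by randomly masking a set of feature vectors and then applying a transformer neural network. We use an MLP classifier that can distinguish between the invariant labeled classes. We present several experimental results that support the high accuracy of the proposed model. The experiments show that an arbitrary wave signal from a given speaker can be identified with 98% and 97.1% accuracy with Wav2Vec2.0 and HuBERT, respectively.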

Best-Answer Prediction in Q&A Sites Using User Information

Rafik Hadfi, Ahmed Moustafa, Kai Yoshino, Takayuki Ito

Categories: Natural Language Processing | Artificial Intelligence | Machine Learning

2022-12-15

Community Question Answering (CQA) sites have spread and multiplied significantly in recent years. Sites like Reddit, Quora, and Stack Exchange are becoming popular among people interested in finding answers to diverse questions. One practical way of finding such answers is to automatically predict the best candidate given existing answers and comments. Many studies have addressed answer prediction in CQA, but with limited focus on using the background information of the questioners. We address this limitation with a novel method for predicting the best answers using the questioner's background information and other features, such as the textual content or the relationships with other participants. Our answer classification model was trained on the Stack Exchange dataset and validated using the Area Under the Curve (AUC) metric. The experimental results show that the proposed method complements previous methods by pointing out the importance of the relationships between users, particularly through their level of involvement in different communities on Stack Exchange. Furthermore, we point out that there is little overlap between user-relation information and the information represented by the shallow text features and the meta-features, such as time differences.
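
As a rough illustration of the AUC metric mentioned in the abstract, the sketch below computes AUC as the fraction of (positive, negative) pairs in which the positive example receives the higher score, counting ties as half. The function name `auc_score` and the toy labels are assumptions made for illustration; the abstract does not describe how the authors computed the metric.

```python
def auc_score(labels, scores):
    """Return AUC as the probability that a random positive outranks a random negative.

    labels: iterable of 0/1 ground-truth values (1 = best answer, hypothetical encoding)
    scores: iterable of classifier scores, same length as labels
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(positives) * len(negatives))
```

This pairwise form is quadratic in the number of examples, so it suits small checks only; a rank-based computation would be the usual choice for a full dataset.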

Natural Logic-guided Autoregressive Multi-hop Document Retrieval for Fact Verification

Rami Aly, Andreas Vlachos

Categories: Natural Language Processing

2022-12-10

A key component of fact verification is the evidence retrieval, often from multiple documents. Recent approaches use dense representations and condition the retrieval of each document on the previously retrieved ones. The latter step is performed over all the documents in the collection, requiring storing their dense representations in an index, thus incurring a high memory footprint. An alternative paradigm is retrieve-and-rerank, where documents are retrieved using methods such as BM25, their sentences are reranked, and further documents are retrieved conditioned on these sentences, reducing the memory requirements. However, such approaches can be brittle as they rely on heuristics and assume hyperlinks between documents. We propose a novel retrieve-and-rerank method for multi-hop retrieval. It consists of a retriever that jointly scores documents in the knowledge source and sentences from previously retrieved documents using an autoregressive formulation, and it is guided by a proof system based on natural logic that dynamically terminates the retrieval process if the evidence is deemed sufficient. This method is competitive with current state-of-the-art methods on FEVER, HoVer and FEVEROUS-S, while using 5 to 10 times less memory than competing systems. Evaluation on an adversarial dataset indicates improved stability of our approach compared to commonly deployed threshold-based methods. Finally, the proof system helps humans predict model decisions correctly more often than using the evidence alone.

CONDA: Continual Unsupervised Domain Adaptation Learning in Visual Perception for Self-Driving Cars

Thanh-Dat Truong, Pierce Helton, Ahmed Moustafa, Jackson David Cothren, Khoa Luu

Categories: Computer Vision

2022-12-01

Although unsupervised domain adaptation methods have achieved remarkable performance in semantic scene segmentation in visual perception for self-driving cars, these approaches remain impractical in real-world use cases. In practice, the segmentation models may encounter new data that have not been seen before. Also, the data previously used to train the segmentation models may be inaccessible due to privacy concerns. To address these problems, we propose a Continual Unsupervised Domain Adaptation (CONDA) approach that allows the model to continuously learn and adapt as new data arrive. Moreover, our proposed approach does not require access to previous training data. To avoid the catastrophic forgetting problem and maintain the performance of the segmentation models, we present a novel Bijective Maximum Likelihood loss to impose a constraint on shifts in the predicted segmentation distribution. Experimental results on the continual unsupervised domain adaptation benchmark show the advanced performance of the proposed CONDA method.

Semantic-Aware Environment Perception for Mobile Human-Robot Interaction

Thorsten Hempel, Marc-André Fiedler, Aly Khalifa, Ayoub Al-Hamadi, Laslo Dinges

Categories: Robotics | Artificial Intelligence | Computer Vision

2022-11-07

Current technological advances open up new opportunities for bringing human-machine interaction to a new level of human-centered cooperation. In this context, a key issue is the semantic understanding of the environment, which enables mobile robots to engage in more complex interactions and eases communication with humans. Prerequisites are the vision-based registration of semantic objects and humans, where the latter are further analyzed as potential interaction partners. Despite significant research achievements, the reliable and fast registration of semantic information remains a challenging task for mobile robots in real-world scenarios. In this paper, we present a vision-based system for mobile assistive robots that enables semantic-aware environment perception without additional a-priori knowledge. We deploy our system on a mobile humanoid robot, which enables us to test our methods in real-world applications.

An End-to-End OCR Framework for Robust Arabic-Handwriting Recognition using a Novel Transformers-based Model and an Innovative 270 Million-Words Multi-Font Corpus of Classical Arabic with Diacritics

Aly Mostafa, Omar Mohamed, Ali Ashraf, Ahmed Elbehery, Salma Jamal, Anas Salah, Amr S. Ghoneim

Categories: Computer Vision | Natural Language Processing | Machine Learning

2022-08-20

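This research is the second phase in a series of studies on Optical Character Recognition (OCR) of Arabic historical documents, and it examines how different modeling procedures interact with the problem. The first study examined the effect of Transformers on our custom-built Arabic dataset. One of the drawbacks of the first study was the size of the training data: due to lack of resources, we used only 15,000 of our 30 million images. In addition, we add an image enhancement layer, time and space optimization, and a post-correction layer to help the model predict the correct context. Notably, we propose an end-to-end text recognition approach that uses a Vision Transformer, namely BEIT, as the encoder and a vanilla Transformer as the decoder, eliminating CNNs for feature extraction and reducing the model's complexity. Experiments show that our end-to-end model outperforms convolutional backbones. The model attains a CER of 4.46%.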
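
For readers unfamiliar with the CER (character error rate) figure quoted above, the sketch below shows one common definition: the character-level Levenshtein distance between a reference and a recognized string, divided by the reference length. The function names are hypothetical, and the abstract does not state which exact CER variant or normalization the authors used.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two strings, counted in characters."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference characters
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref, hyp):
    """Character error rate: edit distance normalized by reference length."""
    if not ref:
        raise ValueError("reference string must be non-empty")
    return edit_distance(ref, hyp) / len(ref)
```

Because Python strings are sequences of Unicode code points, Arabic diacritics count as separate characters here; whether that matches the paper's evaluation is an open assumption.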

Segmentation Enhanced Lameness Detection in Dairy Cows from RGB and Depth Video

Eric Arazo, Robin Aly, Kevin McGuinness

Categories: Computer Vision

2022-06-09

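Cow lameness is a severe condition that affects the life cycle and quality of life of dairy cows and causes considerable economic losses. Early lameness detection helps farmers address the condition early and avoid the negative effects of its progression. We collected a dataset of short clips of dairy cows walking through a corridor and annotated the lameness of each cow. This paper explores the resulting dataset and provides a detailed description of the data collection process. In addition, we propose a lameness detection method that uses pre-trained neural networks to extract discriminative features from videos and assigns each cow a binary score indicating its condition: "healthy" or "lame". We improve this approach by forcing the model to focus on the structure of the cow, which we achieve by replacing the RGB videos with binary segmentation masks predicted by a trained segmentation model. This work aims to encourage research and provide insights into the applicability of computer vision models to cow lameness detection on farms.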

BALanCe: Deep Bayesian Active Learning via Equivalence Class Annealing

Renyu Zhang, Aly A. Khan, Robert L. Grossman, Yuxin Chen

Categories: Machine Learning | Artificial Intelligence

2021-12-27

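Active learning has demonstrated data efficiency in many fields. Existing active learning algorithms, especially in the context of deep Bayesian active models, rely heavily on the quality of the model's uncertainty estimates. However, such uncertainty estimates can be heavily biased, especially with limited and imbalanced training data. In this paper, we propose BALanCe, a Bayesian deep active learning framework that mitigates the effect of such biases. Specifically, BALanCe employs a novel acquisition function that leverages the structure captured by equivalence hypothesis classes and facilitates differentiation among different equivalence classes. Intuitively, each equivalence class consists of instantiations of deep models with similar predictions, and BALanCe adaptively adjusts the size of the equivalence classes as learning progresses. Beyond the fully sequential setting, we also propose Batch-BALanCe, a generalization of the sequential algorithm to the batched setting, which efficiently selects batches of training examples that are jointly effective for model improvement. We show that Batch-BALanCe achieves state-of-the-art performance on several benchmark datasets for active learning, and that both algorithms can effectively handle realistic challenges that often involve multi-class and imbalanced data.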

Aerial Base Station Positioning and Power Control for Securing Communications: A Deep Q-Network Approach

Aly Sabri Abdalla, Ali Behfarnia, Vuk Marojevic

Categories: Machine Learning

2021-12-21

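The unmanned aerial vehicle (UAV) is one of the technological breakthroughs that support a variety of services, including communications. UAVs will play a critical role in enhancing the physical layer security of wireless networks. This paper defines the problem of eavesdropping on the link between a ground user and a UAV that serves as an aerial base station (ABS). The reinforcement learning algorithms Q-learning and deep Q-network (DQN) are proposed for optimizing the position and transmit power of the ABS to enhance the data rate of the ground user. This increases the secrecy capacity without the system knowing the location of the eavesdropper. Simulation results show fast convergence and the highest secrecy capacity for the proposed DQN compared with Q-learning and baseline approaches.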
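
To make the Q-learning baseline named in the abstract more concrete, here is a minimal sketch of the standard tabular Q-learning update. The state and action labels, the reward value, and the `alpha` and `gamma` defaults are placeholders chosen for illustration; the abstract does not specify how the authors encode ABS position, transmit power, or reward.

```python
def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning step and return the updated Q-value.

    q: dict mapping (state, action) pairs to Q-values; missing entries count as 0.0
    actions: all actions available in next_state (hypothetical, e.g. move or power steps)
    """
    # Greedy estimate of the value of the next state
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    # Move the old estimate toward the bootstrapped target
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]
```

A DQN replaces the dictionary `q` with a neural network that approximates the same values, which helps when the state space (for example, continuous ABS coordinates) is too large for a table.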

A low-cost wave-solar powered Unmanned Surface Vehicle

Moustafa Elkolali, Ahmed Al-Tawil, Lennard Much, Ryan Schrader, Olivier Masset, Marina Sayols, Andrew Jenkins, Sara Alonso, Alfredo Carella, Alex Alcocer

Categories: Robotics

2021-12-07

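This paper presents a prototype of a low-cost Unmanned Surface Vehicle (USV) powered by wave and solar energy, which can be used to minimize the cost of ocean data collection. The current prototype is a compact USV, 1.2 m long, that can be deployed and recovered by two people. The design includes an electric winch that can be used to retract and lower the underwater unit. Several elements of the design make use of additive manufacturing and inexpensive materials. The vehicle can be controlled using radio frequency (RF) and satellite communication through a custom-developed web application. Both the surface and underwater units were optimized with respect to drag, lift, weight, and price, drawing on recommendations from previous research and on advanced materials. The USV can be used for water condition monitoring by measuring several parameters, such as dissolved oxygen, salinity, temperature, and pH.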
