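Native language identification (NLI) is the task of training (via supervised machine learning) a classifier that guesses the native language of the author of a text. This task has been extensively researched in the last decade, and the performance of NLI systems has steadily improved over the years. We focus on a different facet of the NLI task, i.e., that of analysing the internals of an NLI classifier trained by an \emph{explainable} machine learning algorithm, in order to obtain explanations of its classification decisions, with the ultimate goal of gaining insight into which linguistic phenomena ``give a speaker's native language away''. We use this perspective to tackle both NLI and a (much less researched) companion task, i.e., guessing whether a text has been written by a native or a non-native speaker. Using three datasets of different provenance (two datasets of English learners' essays and a dataset of social media posts), we investigate which kinds of linguistic traits (lexical, morphological, syntactic, and statistical) are most effective for solving our two tasks, namely, are most indicative of a speaker's L1. We also present two case studies, one on Spanish and one on Italian learners of English, in which we analyse the individual linguistic traits that the classifiers have singled out as most important for spotting these L1s. Overall, our study shows that the use of explainable machine learning can be a valuable tool for th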
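LeQua 2022 is a new lab for the evaluation of methods for "learning to quantify" in textual datasets, i.e., for training predictors of the relative frequencies of the classes of interest in sets of unlabelled textual documents. While these predictions could easily be obtained by first classifying all documents via a text classifier and then counting the numbers of documents assigned to the classes, a growing body of literature has shown this approach to be suboptimal, and has proposed better methods. The goal of this lab is to provide a setting for the comparative evaluation of methods for learning to quantify, both in the binary setting and in the single-label multiclass setting. For each such setting we provide data either in ready-made vector form or in raw document form.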
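The "classify, then count" baseline mentioned in this abstract can be sketched in a few lines. The sketch below is illustrative only and is not taken from the LeQua materials: the function names are hypothetical, and the binary correction shown (commonly called adjusted classify and count) is just one example of the kind of improved method the quantification literature proposes. It assumes the classifier's true-positive and false-positive rates were estimated on held-out labelled data.

```python
from collections import Counter


def classify_and_count(predicted_labels, classes):
    """Estimate class prevalences as the fraction of documents assigned to each class."""
    if not predicted_labels:
        raise ValueError("need at least one predicted label")
    counts = Counter(predicted_labels)
    n = len(predicted_labels)
    return {c: counts[c] / n for c in classes}


def adjusted_count_binary(cc_positive, tpr, fpr):
    """Correct a binary classify-and-count estimate for classifier bias.

    cc_positive: fraction of documents the classifier labelled positive.
    tpr, fpr: true/false positive rates estimated on held-out data (assumption).
    """
    if tpr == fpr:
        # The classifier carries no signal; the correction is undefined.
        return cc_positive
    corrected = (cc_positive - fpr) / (tpr - fpr)
    # Clip to the valid prevalence range [0, 1].
    return min(1.0, max(0.0, corrected))


if __name__ == "__main__":
    predictions = ["pos", "neg", "neg", "neg"]
    print(classify_and_count(predictions, ["pos", "neg"]))
    print(adjusted_count_binary(0.4, tpr=0.8, fpr=0.2))
```

The plain count inherits the classifier's errors, which is why it degrades when class prevalence in the unlabelled set differs from the training set; the correction uses the error rates to partially undo that bias.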
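Algorithms and models are increasingly deployed to inform decisions about people, inevitably affecting their lives. As a consequence, those in charge of developing these models must carefully evaluate their impact on different groups of people and favour group fairness, that is, ensure that groups determined by sensitive demographic attributes, such as race or sex, are not treated unjustly. To achieve this goal, the availability (awareness) of these demographic attributes to those evaluating the impact of these models is fundamental. Unfortunately, collecting and storing these attributes is often in conflict with industry practices and with legislation on data minimisation and privacy. For this reason, it can be hard to measure the group fairness of trained models, even from within the companies developing them. In this work, we tackle the problem of measuring group fairness under unawareness of sensitive attributes by using techniques from quantification, a supervised learning task concerned with directly providing group-level prevalence estimates (rather than individual-level class labels). We show that quantification approaches are particularly suited to tackling the fairness-under-unawareness problem, as they are robust to inevitable distribution shifts while at the same time decoupling the (desirable) objective of measuring group fairness from the (undesirable) side effect of allowing the inference of sensitive attributes of individuals. In more detail, we show that fairness under unawareness can be cast as a quantification problem and solved with proven methods from the quantification literature. We show that these methods outperform previous approaches to measuring demographic parity in five experimental protocols, corresponding to important challenges that complicate the estimation of classifier fairness under unawareness.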
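To make the link between group-level prevalence estimates and demographic parity concrete, here is a minimal sketch. It is not the authors' procedure: it only applies Bayes' rule, assuming a quantifier has already estimated the prevalence of the sensitive group among accepted and among rejected individuals. The function name and parameter names are hypothetical.

```python
def demographic_parity_gap(prev_in_accepted, prev_in_rejected, acceptance_rate):
    """Return P(accepted | s=1) - P(accepted | s=0) from group-level estimates.

    prev_in_accepted: estimated P(s=1 | accepted), e.g. from a quantifier (assumption).
    prev_in_rejected: estimated P(s=1 | rejected), e.g. from a quantifier (assumption).
    acceptance_rate:  observed P(accepted) of the classifier being audited.
    """
    p_acc = acceptance_rate
    p_rej = 1.0 - acceptance_rate

    # Joint probabilities of (group, decision).
    s1_acc = prev_in_accepted * p_acc
    s1_rej = prev_in_rejected * p_rej
    s0_acc = (1.0 - prev_in_accepted) * p_acc
    s0_rej = (1.0 - prev_in_rejected) * p_rej

    # Marginal group sizes.
    p_s1 = s1_acc + s1_rej
    p_s0 = s0_acc + s0_rej
    if p_s1 == 0.0 or p_s0 == 0.0:
        raise ValueError("one of the groups has zero estimated prevalence")

    # Bayes' rule: P(accepted | group) = P(group, accepted) / P(group).
    return s1_acc / p_s1 - s0_acc / p_s0


if __name__ == "__main__":
    # Group s=1 makes up 80% of accepted and 20% of rejected individuals.
    print(demographic_parity_gap(0.8, 0.2, acceptance_rate=0.5))
```

Note that no individual-level sensitive attribute is ever predicted here: only two prevalence values and the acceptance rate enter the computation.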
In this paper we present TruFor, a forensic framework that can be applied to a large variety of image manipulation methods, from classic cheapfakes to more recent manipulations based on deep learning. We rely on the extraction of both high-level and low-level traces through a transformer-based fusion architecture that combines the RGB image and a learned noise-sensitive fingerprint. The latter learns to embed the artifacts related to the camera's internal and external processing by training only on real data in a self-supervised manner. Forgeries are detected as deviations from the expected regular pattern that characterizes each pristine image. Looking for anomalies makes the approach able to robustly detect a variety of local manipulations, ensuring generalization. In addition to a pixel-level localization map and a whole-image integrity score, our approach outputs a reliability map that highlights areas where localization predictions may be error-prone. This is particularly important in forensic applications in order to reduce false alarms and allow for large-scale analysis. Extensive experiments on several datasets show that our method is able to reliably detect and localize both cheapfake and deepfake manipulations, outperforming state-of-the-art works. Code will be publicly available at https://grip-unina.github.io/TruFor/
The shift of public debate to the digital sphere has been accompanied by a rise in online hate speech. While many promising approaches for hate speech classification have been proposed, studies often focus only on a single language, usually English, and do not address three key concerns: post-deployment performance, classifier maintenance and infrastructural limitations. In this paper, we introduce a new human-in-the-loop BERT-based hate speech classification pipeline and trace its development from initial data collection and annotation all the way to post-deployment. Our classifier, trained using data from our original corpus of over 422k examples, is specifically developed for the inherently multilingual setting of Switzerland. With an F1 score of 80.5, it outperforms the currently best-performing BERT-based multilingual classifier by 5.8 F1 points in German and 3.6 F1 points in French. Our systematic evaluations over a 12-month period further highlight the vital importance of continuous, human-in-the-loop classifier maintenance to ensure robust hate speech classification post-deployment.
In this paper, we introduce MINTIME, a video deepfake detection approach that captures spatial and temporal anomalies and handles instances of multiple people in the same video and variations in face sizes. Previous approaches disregard such information either by using simple a-posteriori aggregation schemes, i.e., average or max operation, or using only one identity for the inference, i.e., the largest one. On the contrary, the proposed approach builds on a Spatio-Temporal TimeSformer combined with a Convolutional Neural Network backbone to capture spatio-temporal anomalies from the face sequences of multiple identities depicted in a video. This is achieved through an Identity-aware Attention mechanism that attends to each face sequence independently based on a masking operation and facilitates video-level aggregation. In addition, two novel embeddings are employed: (i) the Temporal Coherent Positional Embedding that encodes each face sequence's temporal information and (ii) the Size Embedding that encodes the size of the faces as a ratio to the video frame size. These extensions allow our system to adapt particularly well in the wild by learning how to aggregate information of multiple identities, which is usually disregarded by other methods in the literature. It achieves state-of-the-art results on the ForgeryNet dataset with an improvement of up to 14% AUC in videos containing multiple people and demonstrates ample generalization capabilities in cross-forgery and cross-dataset settings. The code is publicly available at https://github.com/davide-coccomini/MINTIME-Multi-Identity-size-iNvariant-TIMEsformer-for-Video-Deepfake-Detection.
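The simple a-posteriori aggregation schemes that this abstract contrasts itself with (average, max, or using only the largest face) can be sketched as follows. This illustrates those baselines only, not MINTIME itself; the data layout (a list of per-identity tracks with `scores` and `face_area` keys) and the function name are assumptions made for the example.

```python
from statistics import mean


def aggregate_video_score(tracks, scheme="mean"):
    """Combine per-frame fake probabilities of several face tracks into one video score.

    tracks: list of dicts, one per identity, with
        'scores'    - per-frame fake probabilities for that identity
        'face_area' - average face area in pixels
    scheme: 'mean', 'max', or 'largest' (use only the biggest face).
    """
    if not tracks:
        raise ValueError("at least one face track is required")
    if scheme == "mean":
        return mean([s for t in tracks for s in t["scores"]])
    if scheme == "max":
        return max(s for t in tracks for s in t["scores"])
    if scheme == "largest":
        biggest = max(tracks, key=lambda t: t["face_area"])
        return mean(biggest["scores"])
    raise ValueError(f"unknown scheme: {scheme}")


if __name__ == "__main__":
    tracks = [
        {"scores": [0.1, 0.2], "face_area": 9000},  # large, likely pristine face
        {"scores": [0.9, 0.8], "face_area": 2500},  # small, likely manipulated face
    ]
    for scheme in ("mean", "max", "largest"):
        print(scheme, aggregate_video_score(tracks, scheme))
```

In this toy input the `largest` scheme ignores the small manipulated face entirely and the `mean` scheme dilutes it, which is the kind of information loss the identity-aware attention described above is meant to avoid.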
Prescriptive Process Monitoring systems recommend, during the execution of a business process, interventions that, if followed, prevent a negative outcome of the process. Such interventions have to be reliable, that is, they have to guarantee the achievement of the desired outcome or performance, and they have to be flexible, that is, they have to avoid overturning the normal process execution or forcing the execution of a given activity. Most of the existing Prescriptive Process Monitoring solutions, however, while performing well in terms of recommendation reliability, provide the users with very specific (sequences of) activities that have to be executed, without regard for the feasibility of these recommendations. To address this issue, we propose a new Outcome-Oriented Prescriptive Process Monitoring system that recommends temporal relations between activities that have to be guaranteed during the process execution in order to achieve a desired outcome. This softens the mandatory execution of an activity at a given point in time, thus leaving more freedom to the user in deciding the interventions to put in place. Our approach defines these temporal relations with Linear Temporal Logic over finite traces patterns, which are used as features to describe the historical process data recorded in an event log by the information systems supporting the execution of the process. The encoded log is used to train a Machine Learning classifier to learn a mapping between the temporal patterns and the outcome of a process execution. The classifier is then queried at runtime to return as recommendations the most salient temporal patterns to be satisfied to maximize the likelihood of a certain outcome for an input ongoing process execution. The proposed system is assessed using a pool of 22 real-life event logs that have already been used as a benchmark in the Process Mining community.
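The idea of using temporal patterns over finite traces as features can be illustrated with a small sketch. The three pattern checkers below (existence, response, precedence) are common Declare-style templates chosen for illustration; the abstract does not say which patterns the system uses, and the activity names and the `encode_trace` helper are hypothetical.

```python
def existence(trace, a):
    """F a: activity a occurs at least once in the trace."""
    return a in trace


def response(trace, a, b):
    """G(a -> F b): every occurrence of a is eventually followed by b (a != b)."""
    pending = False
    for event in trace:
        if event == a:
            pending = True
        elif event == b:
            pending = False
    return not pending


def precedence(trace, a, b):
    """b may occur only after a has occurred at least once (a != b)."""
    seen_a = False
    for event in trace:
        if event == a:
            seen_a = True
        elif event == b and not seen_a:
            return False
    return True


def encode_trace(trace, patterns):
    """Turn a trace into a binary feature vector, one entry per (checker, args) pattern."""
    return [int(check(trace, *args)) for check, args in patterns]


if __name__ == "__main__":
    patterns = [
        (existence, ("pay",)),
        (response, ("check", "ship")),
        (precedence, ("pay", "ship")),
        (response, ("ship", "pay")),
    ]
    trace = ["register", "check", "pay", "ship"]
    print(encode_trace(trace, patterns))
```

Feature vectors of this kind, one per historical trace and paired with the recorded outcome, are the sort of input an off-the-shelf classifier could be trained on; a recommendation then names a pattern to satisfy rather than a specific next activity.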
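Graph Neural Networks (GNNs) have proven to be successful in several predictive modeling tasks for graph-structured data. Among those tasks, link prediction is one of the fundamental problems for many real-world applications, such as recommender systems. However, GNNs are not immune to adversarial attacks, i.e., carefully crafted malicious examples that are designed to fool the predictive model. In this work, we focus on a specific white-box attack on GNN-based link prediction models, where a malicious node aims to appear in the list of recommended nodes for a given target victim. To achieve this goal, the attacker node may also count on the cooperation of other existing peers that it directly controls, namely on the ability to inject a number of ``vicious'' nodes into the network. Specifically, all these malicious nodes can add new edges or remove existing ones, thereby perturbing the original graph. We therefore propose SAVAGE, a novel framework and method to mount this type of link prediction attack. SAVAGE formulates the adversary's goal as an optimization task, striking a balance between the effectiveness of the attack and the sparsity of the malicious resources required. Extensive experiments conducted on real-world and synthetic datasets demonstrate that adversarial attacks implemented through SAVAGE indeed achieve a high attack success rate while using only a small number of vicious nodes. Finally, although these attacks require full knowledge of the target model, we show that they transfer successfully to other black-box methods for link prediction.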
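Many robotic tasks involving some form of 3D visual perception greatly benefit from complete knowledge of the working environment. However, robots often have to cope with unstructured environments, and their onboard visual sensors can only provide incomplete information due to limited workspaces, clutter, or object self-occlusions. In recent years, deep learning architectures for shape completion have begun to gain traction as an effective means of inferring a complete 3D object representation from partial visual data. Nevertheless, most of the existing state-of-the-art approaches provide a fixed output resolution in the form of voxel grids, which is strictly tied to the size of the neural network output stage. While this is enough for some tasks, e.g. obstacle avoidance in navigation, grasping and manipulation require finer resolutions, and simply scaling up the neural network outputs is computationally expensive. In this paper, we address this limitation with an object shape completion method based on an implicit 3D representation that provides a confidence value for each reconstructed point. As a second contribution, we propose a gradient-based method for efficiently sampling this implicit function at an arbitrary resolution, tunable at inference time. We experimentally validate our approach by comparing the reconstructed shapes with the ground truth, and by deploying our shape completion algorithm in a robotic grasping pipeline. In both cases, we compare our results with a state-of-the-art shape completion approach.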
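Due to the ubiquity of surveillance camera networks, automatic people counting from images has recently attracted attention for urban monitoring in modern smart cities. Current computer vision techniques rely on deep learning-based algorithms that estimate pedestrian densities in still images. Only a handful of works take advantage of temporal consistency in video sequences. In this work, we propose a spatio-temporal attentive neural network to estimate the number of pedestrians in surveillance videos. By exploiting the temporal correlation between consecutive frames, we lower the state-of-the-art count error by 5% and the localization error by 7.5% on the widely used FDST benchmark.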