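Automated driving systems (ADS) open up a new domain for the automotive industry and offer new possibilities for future transportation, with higher efficiency and a more comfortable experience. However, autonomous driving under adverse weather conditions has long been the problem that keeps autonomous vehicles (AVs) from reaching higher levels of autonomy. This paper assesses the influences and challenges that weather brings to ADS sensors in an analytic and statistical way, and surveys the solutions against adverse weather conditions. State-of-the-art techniques for perception enhancement with regard to each kind of weather are thoroughly reported. External auxiliary solutions such as V2X technology, the weather-condition coverage of currently available datasets, simulators, and experimental facilities with weather chambers are clearly sorted out. By pointing out the major weather problems that the autonomous driving field currently faces, and by reviewing both hardware and computer science solutions from recent years, this survey gives an overview of the obstacles and directions of ADS development with respect to driving in adverse weather.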
Translated by Google Translate
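Compared with the image-based static facial expression recognition (SFER) task, the dynamic facial expression recognition (DFER) task based on video sequences is closer to the natural expression recognition scenario. However, DFER is often more challenging. One of the main reasons is that video sequences often contain frames with different expression intensities, especially for facial expressions in the real world, whereas the images in SFER frequently present uniform and high expression intensities. If expressions with different intensities are treated equally, the features learned by the network will have large intra-class and small inter-class differences, which is harmful to DFER. To tackle this problem, we propose the global convolution-attention block (GCA) to rescale the channels of the feature maps. In addition, we introduce the intensity-aware loss (IAL) during training to help the network distinguish samples with relatively low expression intensities. Experiments on two in-the-wild dynamic facial expression datasets (i.e., DFEW and FERV39K) show that our method outperforms state-of-the-art DFER approaches. The source code will be made publicly available.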
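Dynamic facial expression recognition (DFER) in the wild is an extremely challenging task because of the large number of noisy frames in video sequences. Previous works focus on extracting more discriminative features but neglect to distinguish key frames from noisy frames. To tackle this problem, we propose a noise-robust dynamic facial expression recognition network (NR-DFERNet), which can effectively reduce the interference of noisy frames with the DFER task. Specifically, at the spatial stage, we devise a dynamic-static fusion module (DSF) that introduces dynamic features into static features to learn more discriminative spatial features. To suppress the impact of target-irrelevant frames, we introduce a novel dynamic class token (DCT) for the transformer at the temporal stage. Moreover, we design a snippet-based filter (SF) at the decision stage to reduce the effect of too many neutral frames on the classification of non-neutral sequences. Extensive experimental results demonstrate that our NR-DFERNet outperforms state-of-the-art methods on both the DFEW and AFEW benchmarks.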
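Facial micro-expressions (MEs) are involuntary facial motions that reveal people's real feelings, and they play an important role in the early intervention of mental illness, in national security, and in many human-computer interaction systems. However, existing micro-expression datasets are limited, which usually poses challenges for training good classifiers. To model subtle facial muscle motions, we propose a robust micro-expression recognition (MER) framework, namely the muscle motion-guided network (MMNet). Specifically, a continuous attention (CA) block is introduced to focus on modeling local, subtle muscle motion patterns with little identity information. This differs from most previous methods, which extract features directly from complete video frames that carry much identity information. In addition, we design a position calibration (PC) module based on the vision transformer. By adding the position embeddings of the face generated by the PC module at the end of the two branches, the PC module helps add position information to the facial muscle motion pattern features for MER. Extensive experiments on three public micro-expression datasets demonstrate that our approach outperforms state-of-the-art methods by a large margin.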
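We study the problem of learning from positive and unlabeled (PU) data in the federated setting, where each client labels only a small part of its dataset because of limited resources and time. Unlike the setting in traditional PU learning, where the negative class consists of a single class, the negative samples that a client cannot identify in the federated setting may come from multiple classes that are unknown to that client. Existing PU learning methods can therefore hardly be applied in this situation. To address this problem, we propose a novel framework, namely Federated learning with Positive and Unlabeled data (FedPU), which minimizes the expected risk of multiple negative classes by leveraging the labeled data of other clients. We theoretically analyze the generalization bound of the proposed FedPU. Empirical experiments show that FedPU achieves better performance than conventional supervised and semi-supervised federated learning methods.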
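As the computing power of modern hardware increases strongly, pre-trained deep learning models (e.g., BERT, GPT-3) learned on large-scale datasets have shown their effectiveness over conventional methods. This progress is mainly attributed to the representation ability of the transformer and its variant architectures. In this paper, we study low-level computer vision tasks (e.g., denoising, super-resolution, and deraining) and develop a new pre-trained model, namely the image processing transformer (IPT). To make the most of the transformer's capability, we propose using the well-known ImageNet benchmark to generate a large number of corrupted image pairs. The IPT model is trained on these images with multiple heads and multiple tails. In addition, contrastive learning is introduced so that the model adapts well to different image processing tasks. The pre-trained model can therefore be employed efficiently on the desired task after fine-tuning. With only one pre-trained model, IPT outperforms the current state-of-the-art methods on various low-level benchmarks. Code is available at https://github.com/huawei-noah/pretrate -ipt and https://gitee.com/mindspore/mindspore/tree/master/model_zoo/research/cv/ipt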
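The idea of deriving corrupted image pairs from clean images can be illustrated with a minimal sketch. It is not the IPT pipeline: the helper names (`add_gaussian_noise`, `downsample`, `make_pairs`), the noise level, and the use of block averaging as the downsampling operator are all assumptions made for illustration, and images are plain nested lists of grayscale values in [0, 1].

```python
import random
from typing import Dict, List, Tuple

Image = List[List[float]]  # grayscale image, pixel values in [0, 1]


def add_gaussian_noise(image: Image, sigma: float, rng: random.Random) -> Image:
    """Return a noisy copy of `image`, clipped back into [0, 1]."""
    return [
        [min(1.0, max(0.0, px + rng.gauss(0.0, sigma))) for px in row]
        for row in image
    ]


def downsample(image: Image, factor: int) -> Image:
    """Shrink `image` by averaging non-overlapping factor x factor blocks.

    Block averaging is an assumed stand-in for the actual downsampling kernel.
    """
    h, w = len(image), len(image[0])
    if h % factor or w % factor:
        raise ValueError("image size must be divisible by factor")
    out = []
    for r in range(0, h, factor):
        row = []
        for c in range(0, w, factor):
            block = [image[r + i][c + j] for i in range(factor) for j in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def make_pairs(clean: Image, sigma: float = 0.1, factor: int = 2,
               seed: int = 0) -> Dict[str, Tuple[Image, Image]]:
    """Build (corrupted, clean) training pairs for two hypothetical tasks."""
    rng = random.Random(seed)
    return {
        "denoising": (add_gaussian_noise(clean, sigma, rng), clean),
        "super_resolution": (downsample(clean, factor), clean),
    }
```

Each task gets its own corruption of the same clean image, which is what allows a single shared model body to be trained with task-specific heads and tails.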
Dataset distillation has emerged as a prominent technique to improve data efficiency when training machine learning models. It encapsulates the knowledge from a large dataset into a smaller synthetic dataset. A model trained on this smaller distilled dataset can attain comparable performance to a model trained on the original training dataset. However, the existing dataset distillation techniques mainly aim at achieving the best trade-off between resource usage efficiency and model utility. The security risks stemming from them have not been explored. This study performs the first backdoor attack against the models trained on the data distilled by dataset distillation models in the image domain. Concretely, we inject triggers into the synthetic data during the distillation procedure rather than during the model training stage, where all previous attacks are performed. We propose two types of backdoor attacks, namely NAIVEATTACK and DOORPING. NAIVEATTACK simply adds triggers to the raw data at the initial distillation phase, while DOORPING iteratively updates the triggers during the entire distillation procedure. We conduct extensive evaluations on multiple datasets, architectures, and dataset distillation techniques. Empirical evaluation shows that NAIVEATTACK achieves decent attack success rate (ASR) scores in some cases, while DOORPING reaches higher ASR scores (close to 1.0) in all cases. Furthermore, we conduct a comprehensive ablation study to analyze the factors that may affect the attack performance. Finally, we evaluate multiple defense mechanisms against our backdoor attacks and show that our attacks can practically circumvent these defense mechanisms.
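To make the two quantities in this abstract concrete, the sketch below stamps a fixed trigger patch onto an image (the kind of step a naive trigger injection on raw data would involve) and computes an attack success rate. It is an illustration under assumptions, not the authors' implementation: the function names, the bottom-right patch location, and the patch size are hypothetical, and ASR is taken in its common sense as the fraction of triggered inputs classified as the attacker's target label.

```python
from typing import List

Image = List[List[float]]  # grayscale image, pixel values in [0, 1]


def stamp_trigger(image: Image, size: int = 2, value: float = 1.0) -> Image:
    """Return a copy of `image` with a size x size patch set to `value`
    in the bottom-right corner (patch position is an assumption)."""
    h, w = len(image), len(image[0])
    if size > h or size > w:
        raise ValueError("trigger patch is larger than the image")
    out = [row[:] for row in image]  # copy so the input stays untouched
    for r in range(h - size, h):
        for c in range(w - size, w):
            out[r][c] = value
    return out


def attack_success_rate(predictions: List[int], target_label: int) -> float:
    """Fraction of predictions on triggered inputs that equal `target_label`."""
    if not predictions:
        raise ValueError("predictions must be non-empty")
    hits = sum(1 for p in predictions if p == target_label)
    return hits / len(predictions)
```

An ASR close to 1.0 means that nearly every triggered input is classified as the target label.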
Blind image quality assessment (BIQA) remains challenging because of the diversity of distortions and the variation in image content, which complicate distortion patterns across different scales and aggravate the difficulty of the regression problem for BIQA. However, existing BIQA methods often fail to consider multi-scale distortion patterns and image content, and little research has been done on learning strategies that help the regression model perform better. In this paper, we propose a simple yet effective Progressive Multi-Task Image Quality Assessment (PMT-IQA) model, which contains a multi-scale feature extraction module (MS) and a progressive multi-task learning module (PMT), to help the model learn complex distortion patterns and to better optimize the regression problem in line with the easy-to-hard progression of human learning. To verify the effectiveness of the proposed PMT-IQA model, we conduct experiments on four widely used public datasets. The experimental results indicate that PMT-IQA outperforms the comparison approaches and that both the MS and PMT modules improve the model's performance.
The development of social media user stance detection and bot detection methods relies heavily on large-scale, high-quality benchmarks. However, in addition to low annotation quality, existing benchmarks generally have incomplete user relationships, which holds back graph-based account detection research. To address these issues, we propose a Multi-Relational Graph-Based Twitter Account Detection Benchmark (MGTAB), the first standardized graph-based benchmark for account detection. To our knowledge, MGTAB is built on the largest original data in the field, with over 1.55 million users and 130 million tweets. MGTAB contains 10,199 expert-annotated users and 7 types of relationships, ensuring high-quality annotation and diversified relations. In MGTAB, we extract the 20 user property features with the greatest information gain, together with user tweet features, as the user features. In addition, we perform a thorough evaluation of MGTAB and other public datasets. Our experiments show that graph-based approaches are generally more effective than feature-based approaches and perform better when multiple relations are introduced. By analyzing the experimental results, we identify effective approaches for account detection and suggest potential future research directions in this field. Our benchmark and standardized evaluation procedures are freely available at: https://github.com/GraphDetec/MGTAB.
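Selecting the features with the greatest information gain can be sketched as follows. This is a generic illustration, not the MGTAB code: it assumes the features are already discrete (for example binary or binned), and `entropy`, `information_gain`, and `top_k_features` are hypothetical helper names.

```python
import math
from collections import Counter
from typing import Dict, Hashable, List, Sequence


def entropy(labels: Sequence[Hashable]) -> float:
    """Shannon entropy (in bits) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature: Sequence[Hashable], labels: Sequence[Hashable]) -> float:
    """H(labels) - H(labels | feature) for a discrete feature."""
    n = len(labels)
    groups: Dict[Hashable, List[Hashable]] = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def top_k_features(columns: Dict[str, Sequence[Hashable]],
                   labels: Sequence[Hashable], k: int) -> List[str]:
    """Names of the k feature columns with the highest information gain."""
    ranked = sorted(columns,
                    key=lambda name: information_gain(columns[name], labels),
                    reverse=True)
    return ranked[:k]
```

With `k=20`, `top_k_features` would return the 20 highest-scoring property names. Continuous properties would need to be binned first, a step this sketch leaves out.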
Given the increasingly intricate forms of partial differential equations (PDEs) in physics and related fields, computationally solving PDEs without analytic solutions inevitably suffers from a trade-off between accuracy and efficiency. Recent advances in neural operators, a kind of mesh-independent, neural-network-based PDE solver, suggest that this challenge may be overcome. In this emerging direction, the Koopman neural operator (KNO) is a representative demonstration and outperforms other state-of-the-art alternatives in terms of accuracy and efficiency. Here we present KoopmanLab, a self-contained and user-friendly PyTorch module of the Koopman neural operator family for solving partial differential equations. Beyond the original version of KNO, we develop multiple new variants of KNO based on different neural network architectures to improve the general applicability of our module. These variants are validated by mesh-independent and long-term prediction experiments on representative PDEs (e.g., the Navier-Stokes equation and the Bateman-Burgers equation) and on ERA5 (i.e., one of the largest high-resolution datasets of global-scale climate fields). These demonstrations suggest the potential of KoopmanLab for diverse applications of partial differential equations.