恶意软件是跨越多个操作系统和各种文件格式的计算机的最损害威胁之一。为了防止不断增长的恶意软件的威胁,已经提出了巨大的努力来提出各种恶意软件检测方法,试图有效和有效地检测恶意软件。最近的研究表明,一方面,现有的ML和DL能够卓越地检测新出现和以前看不见的恶意软件。然而,另一方面,ML和DL模型本质上易于侵犯对抗性示例形式的对抗性攻击,这通过略微仔细地扰乱了合法输入来混淆目标模型来恶意地产生。基本上,在计算机视觉领域最初广泛地研究了对抗性攻击,并且一些快速扩展到其他域,包括NLP,语音识别甚至恶意软件检测。在本文中,我们专注于Windows操作系统系列中的便携式可执行文件(PE)文件格式的恶意软件,即Windows PE恶意软件,作为在这种对抗设置中研究对抗性攻击方法的代表性案例。具体而言,我们首先首先概述基于ML / DL的Windows PE恶意软件检测的一般学习框架,随后突出了在PE恶意软件的上下文中执行对抗性攻击的三个独特挑战。然后,我们进行全面和系统的审查,以对PE恶意软件检测以及增加PE恶意软件检测的稳健性的相应防御,对近最新的对手攻击进行分类。我们首先向Windows PE恶意软件检测的其他相关攻击结束除了对抗对抗攻击之外,然后对未来的研究方向和机遇脱落。
translated by 谷歌翻译
逆势培训可针对特异性对抗性扰动有用,但它们也证明旨在展示偏离用于培训的攻击的攻击。然而,我们观察到这种无效性是本质上与域的适应性,深度学习中的另一个关键问题似乎是一个有希望的解决方案。因此,我们提出了ADV-4-ADV作为一种新的逆势培训方法,旨在保持针对看不见的对抗性扰动的鲁棒性。基本上,ADV-4-ADV将攻击产生不同的扰动作为不同的域,并且通过利用逆势域适应的力量,它旨在消除域/攻击特定的功能。这迫使训练有素的模型来学习强大的域名不变的表示,这反过来增强了其泛化能力。对时尚 - MNIST,SVHN,CIFAR-10和CIFAR-100的广泛评估表明,基于由简单攻击(例如,FGSM)制备的样本训练的模型可以推广到更高级的攻击(例如, PGD​​),性能超过了这些数据集的最先进的提案。
translated by 谷歌翻译
对医疗保健和生物医学应用的关键,呼吸监测通常在实践中使用可穿戴传感器,由于它们与人体直接接触而导致不便。因此,研究人员一直在不断寻找免费的接触替代品。尽管如此,现有的无联系设计主要需要人类受试者保持静止,在正常环境中大大限制了身体运动不可避免的日常环境中的收养。幸运的是,透射频率(RF)使能无接触感测,但通过传统过滤不可分割的运动干扰,可以在深度学习的帮助下提供蒸馏呼吸波形的潜力。为了实现这一潜力,我们在身体运动下引入了更多内容以进行细粒度的呼吸监测。更多-fi利用IR-UWB雷达来实现无接触感测,并充分利用复杂的雷达信号进行数据增强。更多-Fi的核心是一种新颖的变分编码器解码器网络;它旨在单独用以非线性方式通过身体运动调节的呼吸波形。我们具有12个受试者和66小时数据的实验表明,尽管身体运动引起的干扰,但仍然需要更准确地恢复呼吸波。我们还讨论了肺部疾病诊断的潜在应用。
translated by 谷歌翻译
Weakly-supervised object localization aims to indicate the category as well as the scope of an object in an image given only the image-level labels. Most of the existing works are based on Class Activation Mapping (CAM) and endeavor to enlarge the discriminative area inside the activation map to perceive the whole object, yet ignore the co-occurrence confounder of the object and context (e.g., fish and water), which makes the model inspection hard to distinguish object boundaries. Besides, the use of CAM also brings a dilemma problem that the classification and localization always suffer from a performance gap and can not reach their highest accuracy simultaneously. In this paper, we propose a casual knowledge distillation method, dubbed KD-CI-CAM, to address these two under-explored issues in one go. More specifically, we tackle the co-occurrence context confounder problem via causal intervention (CI), which explores the causalities among image features, contexts, and categories to eliminate the biased object-context entanglement in the class activation maps. Based on the de-biased object feature, we additionally propose a multi-teacher causal distillation framework to balance the absorption of classification knowledge and localization knowledge during model training. Extensive experiments on several benchmarks demonstrate the effectiveness of KD-CI-CAM in learning clear object boundaries from confounding contexts and addressing the dilemma problem between classification and localization performance.
translated by 谷歌翻译
In this paper, a semantic communication framework for image transmission is developed. In the investigated framework, a set of servers cooperatively transmit images to a set of users utilizing semantic communication techniques. To evaluate the performance of studied semantic communication system, a multimodal metric is proposed to measure the correlation between the extracted semantic information and the original image. To meet the ISS requirement of each user, each server must jointly determine the semantic information to be transmitted and the resource blocks (RBs) used for semantic information transmission. We formulate this problem as an optimization problem aiming to minimize each server's transmission latency while reaching the ISS requirement. To solve this problem, a value decomposition based entropy-maximized multi-agent reinforcement learning (RL) is proposed, which enables servers to coordinate for training and execute RB allocation in a distributed manner to approach to a globally optimal performance with less training iterations. Compared to traditional multi-agent RL, the proposed RL improves the valuable action exploration of servers and the probability of finding a globally optimal RB allocation policy based on local observation. Simulation results show that the proposed algorithm can reduce the transmission delay by up to 16.1% compared to traditional multi-agent RL.
translated by 谷歌翻译
New architecture GPUs like A100 are now equipped with multi-instance GPU (MIG) technology, which allows the GPU to be partitioned into multiple small, isolated instances. This technology provides more flexibility for users to support both deep learning training and inference workloads, but efficiently utilizing it can still be challenging. The vision of this paper is to provide a more comprehensive and practical benchmark study for MIG in order to eliminate the need for tedious manual benchmarking and tuning efforts. To achieve this vision, the paper presents MIGPerf, an open-source tool that streamlines the benchmark study for MIG. Using MIGPerf, the authors conduct a series of experiments, including deep learning training and inference characterization on MIG, GPU sharing characterization, and framework compatibility with MIG. The results of these experiments provide new insights and guidance for users to effectively employ MIG, and lay the foundation for further research on the orchestration of hybrid training and inference workloads on MIGs. The code and results are released on https://github.com/MLSysOps/MIGProfiler. This work is still in progress and more results will be published soon.
translated by 谷歌翻译
With the development of technology and sharing economy, Airbnb as a famous short-term rental platform, has become the first choice for many young people to select. The issue of Airbnb's pricing has always been a problem worth studying. While the previous studies achieve promising results, there are exists deficiencies to solve. Such as, (1) the feature attributes of rental are not rich enough; (2) the research on rental text information is not deep enough; (3) there are few studies on predicting the rental price combined with the point of interest(POI) around the house. To address the above challenges, we proposes a multi-source information embedding(MSIE) model to predict the rental price of Airbnb. Specifically, we first selects the statistical feature to embed the original rental data. Secondly, we generates the word feature vector and emotional score combination of three different text information to form the text feature embedding. Thirdly, we uses the points of interest(POI) around the rental house information generates a variety of spatial network graphs, and learns the embedding of the network to obtain the spatial feature embedding. Finally, this paper combines the three modules into multi source rental representations, and uses the constructed fully connected neural network to predict the price. The analysis of the experimental results shows the effectiveness of our proposed model.
translated by 谷歌翻译
Domain adaptive detection aims to improve the generalization of detectors on target domain. To reduce discrepancy in feature distributions between two domains, recent approaches achieve domain adaption through feature alignment in different granularities via adversarial learning. However, they neglect the relationship between multiple granularities and different features in alignment, degrading detection. Addressing this, we introduce a unified multi-granularity alignment (MGA)-based detection framework for domain-invariant feature learning. The key is to encode the dependencies across different granularities including pixel-, instance-, and category-levels simultaneously to align two domains. Specifically, based on pixel-level features, we first develop an omni-scale gated fusion (OSGF) module to aggregate discriminative representations of instances with scale-aware convolutions, leading to robust multi-scale detection. Besides, we introduce multi-granularity discriminators to identify where, either source or target domains, different granularities of samples come from. Note that, MGA not only leverages instance discriminability in different categories but also exploits category consistency between two domains for detection. Furthermore, we present an adaptive exponential moving average (AEMA) strategy that explores model assessments for model update to improve pseudo labels and alleviate local misalignment problem, boosting detection robustness. Extensive experiments on multiple domain adaption scenarios validate the superiority of MGA over other approaches on FCOS and Faster R-CNN detectors. Code will be released at https://github.com/tiankongzhang/MGA.
translated by 谷歌翻译
Although deep learning has made remarkable progress in processing various types of data such as images, text and speech, they are known to be susceptible to adversarial perturbations: perturbations specifically designed and added to the input to make the target model produce erroneous output. Most of the existing studies on generating adversarial perturbations attempt to perturb the entire input indiscriminately. In this paper, we propose ExploreADV, a general and flexible adversarial attack system that is capable of modeling regional and imperceptible attacks, allowing users to explore various kinds of adversarial examples as needed. We adapt and combine two existing boundary attack methods, DeepFool and Brendel\&Bethge Attack, and propose a mask-constrained adversarial attack system, which generates minimal adversarial perturbations under the pixel-level constraints, namely ``mask-constraints''. We study different ways of generating such mask-constraints considering the variance and importance of the input features, and show that our adversarial attack system offers users good flexibility to focus on sub-regions of inputs, explore imperceptible perturbations and understand the vulnerability of pixels/regions to adversarial attacks. We demonstrate our system to be effective based on extensive experiments and user study.
translated by 谷歌翻译
Depression is a leading cause of death worldwide, and the diagnosis of depression is nontrivial. Multimodal learning is a popular solution for automatic diagnosis of depression, and the existing works suffer two main drawbacks: 1) the high-order interactions between different modalities can not be well exploited; and 2) interpretability of the models are weak. To remedy these drawbacks, we propose a multimodal multi-order factor fusion (MMFF) method. Our method can well exploit the high-order interactions between different modalities by extracting and assembling modality factors under the guide of a shared latent proxy. We conduct extensive experiments on two recent and popular datasets, E-DAIC-WOZ and CMDC, and the results show that our method achieve significantly better performance compared with other existing approaches. Besides, by analyzing the process of factor assembly, our model can intuitively show the contribution of each factor. This helps us understand the fusion mechanism.
translated by 谷歌翻译