智能论文笔记

ReAssigner: A Plug-and-Play Virtual Machine Scheduling Intensifier for Heterogeneous Requests

Haochuan Cui , Junjie Sheng , Bo Jin , Yiqiu Hu , Li Su , Lei Zhu , Wenli Zhou , Xiangfeng Wang

分类：人工智能

2022-11-29

With the rapid development of cloud computing, virtual machine scheduling has become one of the most important but challenging issues for the cloud computing community, especially for practical heterogeneous request sequences. By analyzing the impact of request heterogeneity on some popular heuristic schedulers, it can be found that existing scheduling algorithms can not handle the request heterogeneity properly and efficiently. In this paper, a plug-and-play virtual machine scheduling intensifier, called Resource Assigner (ReAssigner), is proposed to enhance the scheduling efficiency of any given scheduler for heterogeneous requests. The key idea of ReAssigner is to pre-assign roles to physical resources and let resources of the same role form a virtual cluster to handle homogeneous requests. ReAssigner can cooperate with arbitrary schedulers by restricting their scheduling space to virtual clusters. With evaluations on the real dataset from Huawei Cloud, the proposed ReAssigner achieves significant scheduling performance improvement compared with some state-of-the-art scheduling methods.

translated by 谷歌翻译

Hybrid Multimodal Feature Extraction, Mining and Fusion for Sentiment Analysis

Jia Li , Ziyang Zhang , Junjie Lang , Yueqi Jiang , Liuwei An , Peng Zou , Yangyang Xu , Sheng Gao , Jie Lin , Chunxiao Fan

分类：计算机视觉 | 自然语言处理

2022-08-05

在本文中，我们介绍了2022年多模式情感分析挑战（MUSE）的解决方案，其中包括Muse-Humor，Muse-Rection和Muse Surns Sub-Challenges。 2022年穆斯穆斯（Muse 2022）着重于幽默检测，情绪反应和多模式的情感压力，利用不同的方式和数据集。在我们的工作中，提取了不同种类的多模式特征，包括声学，视觉，文本和生物学特征。这些功能由Temma和Gru融合到自发机制框架中。在本文中，1）提取了一些新的音频功能，面部表达功能和段落级文本嵌入以进行准确的改进。 2）我们通过挖掘和融合多模式特征来显着提高多模式情感预测的准确性和可靠性。 3）在模型培训中应用有效的数据增强策略，以减轻样本不平衡问题并防止模型形成学习有偏见的主题字符。对于博物馆的子挑战，我们的模型获得了0.8932的AUC分数。对于Muse Rection子挑战，我们在测试集上的Pearson相关系数为0.3879，它的表现优于所有其他参与者。对于Muse Surst Sub-Challenge，我们的方法在测试数据集上的唤醒和价值都优于基线，达到了0.5151的最终综合结果。

translated by 谷歌翻译

Obtaining Dyadic Fairness by Optimal Transport

Moyi Yang , Junjie Sheng , Xiangfeng Wang , Wenyan Liu , Bo Jin , Jun Wang , Hongyuan Zha

分类：机器学习 | 人工智能

2022-02-09

Fairness has been taken as a critical metric in machine learning models, which is considered as an important component of trustworthy machine learning. In this paper, we focus on obtaining fairness for popular link prediction tasks, which are measured by dyadic fairness. A novel pre-processing methodology is proposed to establish dyadic fairness through data repairing based on optimal transport theory. With the well-established theoretical connection between the dyadic fairness for graph link prediction and a conditional distribution alignment problem, the dyadic repairing scheme can be equivalently transformed into a conditional distribution alignment problem. Furthermore, an optimal transport-based dyadic fairness algorithm called DyadicOT is obtained by efficiently solving the alignment problem, satisfying flexibility and unambiguity requirements. The proposed DyadicOT algorithm shows superior results in obtaining fairness compared to other fairness methods on two benchmark graph datasets.

translated by 谷歌翻译

VMAgent: Scheduling Simulator for Reinforcement Learning

Junjie Sheng , Shengliang Cai , Haochuan Cui , Wenhao Li , Yun Hua , Bo Jin , Wenli Zhou , Yiqiu Hu , Lei Zhu , Qian Peng

分类：机器学习 | 人工智能

2021-12-09

介绍了一种名为VMagent的新型模拟器，以帮助RL研究人员更好地探索新方法，特别是对于虚拟机调度。VMagent由实用虚拟机（VM）调度任务的启发，并提供了一个有效的仿真平台，可以反映云计算的实际情况。从实际云计算结束了三种情况（衰落，恢复和扩展），对应于许多强化学习挑战（高维度和行动空间，高于寿命和终身需求）。VMagent为RL研究人员提供了灵活的配置，以设计考虑不同的问题特征的定制调度环境。从VM调度角度来看，VMagent还有助于探索更好的基于学习的调度解决方案。

translated by 谷歌翻译

Cross Modal Transformer via Coordinates Encoding for 3D Object Dectection

Junjie Yan , Yingfei Liu , Jianjian Sun , Fan Jia , Shuailin Li , Tiancai Wang , Xiangyu Zhang

分类：计算机视觉

2023-01-03

In this paper, we propose a robust 3D detector, named Cross Modal Transformer (CMT), for end-to-end 3D multi-modal detection. Without explicit view transformation, CMT takes the image and point clouds tokens as inputs and directly outputs accurate 3D bounding boxes. The spatial alignment of multi-modal tokens is performed implicitly, by encoding the 3D points into multi-modal features. The core design of CMT is quite simple while its performance is impressive. CMT obtains 73.0% NDS on nuScenes benchmark. Moreover, CMT has a strong robustness even if the LiDAR is missing. Code will be released at https://github.com/junjie18/CMT.

translated by 谷歌翻译

Continual Treatment Effect Estimation: Challenges and Opportunities

Zhixuan Chu , Sheng Li

分类：机器学习 | (统计)机器学习

2023-01-03

A further understanding of cause and effect within observational data is critical across many domains, such as economics, health care, public policy, web mining, online advertising, and marketing campaigns. Although significant advances have been made to overcome the challenges in causal effect estimation with observational data, such as missing counterfactual outcomes and selection bias between treatment and control groups, the existing methods mainly focus on source-specific and stationary observational data. Such learning strategies assume that all observational data are already available during the training phase and from only one source. This practical concern of accessibility is ubiquitous in various academic and industrial applications. That's what it boiled down to: in the era of big data, we face new challenges in causal inference with observational data, i.e., the extensibility for incrementally available observational data, the adaptability for extra domain adaptation problem except for the imbalance between treatment and control groups, and the accessibility for an enormous amount of data. In this position paper, we formally define the problem of continual treatment effect estimation, describe its research challenges, and then present possible solutions to this problem. Moreover, we will discuss future research directions on this topic.

translated by 谷歌翻译

FedICT: Federated Multi-task Distillation for Multi-access Edge Computing

Zhiyuan Wu , Sheng Sun , Yuwei Wang , Min Liu , Xuefeng Jiang , Bo Gao

分类：机器学习

2023-01-01

The growing interest in intelligent services and privacy protection for mobile devices has given rise to the widespread application of federated learning in Multi-access Edge Computing (MEC). Diverse user behaviors call for personalized services with heterogeneous Machine Learning (ML) models on different devices. Federated Multi-task Learning (FMTL) is proposed to train related but personalized ML models for different devices, whereas previous works suffer from excessive communication overhead during training and neglect the model heterogeneity among devices in MEC. Introducing knowledge distillation into FMTL can simultaneously enable efficient communication and model heterogeneity among clients, whereas existing methods rely on a public dataset, which is impractical in reality. To tackle this dilemma, Federated MultI-task Distillation for Multi-access Edge CompuTing (FedICT) is proposed. FedICT direct local-global knowledge aloof during bi-directional distillation processes between clients and the server, aiming to enable multi-task clients while alleviating client drift derived from divergent optimization directions of client-side local models. Specifically, FedICT includes Federated Prior Knowledge Distillation (FPKD) and Local Knowledge Adjustment (LKA). FPKD is proposed to reinforce the clients' fitting of local data by introducing prior knowledge of local data distributions. Moreover, LKA is proposed to correct the distillation loss of the server, making the transferred local knowledge better match the generalized representation. Experiments on three datasets show that FedICT significantly outperforms all compared benchmarks in various data heterogeneous and model architecture settings, achieving improved accuracy with less than 1.2% training communication overhead compared with FedAvg and no more than 75% training communication round compared with FedGKT.

translated by 谷歌翻译

New Challenges in Reinforcement Learning: A Survey of Security and Privacy

Yunjiao Lei , Dayong Ye , Sheng Shen , Yulei Sui , Tianqing Zhu , Wanlei Zhou

分类：机器学习 | 人工智能

2022-12-31

Reinforcement learning (RL) is one of the most important branches of AI. Due to its capacity for self-adaption and decision-making in dynamic environments, reinforcement learning has been widely applied in multiple areas, such as healthcare, data markets, autonomous driving, and robotics. However, some of these applications and systems have been shown to be vulnerable to security or privacy attacks, resulting in unreliable or unstable services. A large number of studies have focused on these security and privacy problems in reinforcement learning. However, few surveys have provided a systematic review and comparison of existing problems and state-of-the-art solutions to keep up with the pace of emerging threats. Accordingly, we herein present such a comprehensive review to explain and summarize the challenges associated with security and privacy in reinforcement learning from a new perspective, namely that of the Markov Decision Process (MDP). In this survey, we first introduce the key concepts related to this area. Next, we cover the security and privacy issues linked to the state, action, environment, and reward function of the MDP process, respectively. We further highlight the special characteristics of security and privacy methodologies related to reinforcement learning. Finally, we discuss the possible future research directions within this area.

translated by 谷歌翻译

Peer Learning for Unbiased Scene Graph Generation

Liguang Zhou , Junjie Hu , Yuhongze Zhou , Tin Lun Lam , Yangsheng Xu

分类：计算机视觉

2022-12-31

In this paper, we propose a novel framework dubbed peer learning to deal with the problem of biased scene graph generation (SGG). This framework uses predicate sampling and consensus voting (PSCV) to encourage different peers to learn from each other, improving model diversity and mitigating bias in SGG. To address the heavily long-tailed distribution of predicate classes, we propose to use predicate sampling to divide and conquer this issue. As a result, the model is less biased and makes more balanced predicate predictions. Specifically, one peer may not be sufficiently diverse to discriminate between different levels of predicate distributions. Therefore, we sample the data distribution based on frequency of predicates into sub-distributions, selecting head, body, and tail classes to combine and feed to different peers as complementary predicate knowledge during the training process. The complementary predicate knowledge of these peers is then ensembled utilizing a consensus voting strategy, which simulates a civilized voting process in our society that emphasizes the majority opinion and diminishes the minority opinion. This approach ensures that the learned representations of each peer are optimally adapted to the various data distributions. Extensive experiments on the Visual Genome dataset demonstrate that PSCV outperforms previous methods. We have established a new state-of-the-art (SOTA) on the SGCls task by achieving a mean of \textbf{31.6}.

translated by 谷歌翻译

Attentional Graph Convolutional Network for Structure-aware Audio-Visual Scene Classification

Liguang Zhou , Yuhongze Zhou , Xiaonan Qi , Junjie Hu , Tin Lun Lam , Yangsheng Xu

分类：计算机视觉

2022-12-31

Audio-Visual scene understanding is a challenging problem due to the unstructured spatial-temporal relations that exist in the audio signals and spatial layouts of different objects and various texture patterns in the visual images. Recently, many studies have focused on abstracting features from convolutional neural networks while the learning of explicit semantically relevant frames of sound signals and visual images has been overlooked. To this end, we present an end-to-end framework, namely attentional graph convolutional network (AGCN), for structure-aware audio-visual scene representation. First, the spectrogram of sound and input image is processed by a backbone network for feature extraction. Then, to build multi-scale hierarchical information of input features, we utilize an attention fusion mechanism to aggregate features from multiple layers of the backbone network. Notably, to well represent the salient regions and contextual information of audio-visual inputs, the salient acoustic graph (SAG) and contextual acoustic graph (CAG), salient visual graph (SVG), and contextual visual graph (CVG) are constructed for the audio-visual scene representation. Finally, the constructed graphs pass through a graph convolutional network for structure-aware audio-visual scene recognition. Extensive experimental results on the audio, visual and audio-visual scene recognition datasets show that promising results have been achieved by the AGCN methods. Visualizing graphs on the spectrograms and images have been presented to show the effectiveness of proposed CAG/SAG and CVG/SVG that could focus on the salient and semantic relevant regions.

translated by 谷歌翻译