Smart Paper Notes

Dense-TNT: Efficient Vehicle Type Classification Neural Network Using Satellite Imagery

Ruikang Luo , Yaofeng Song , Han Zhao , Yicheng Zhang , Yi Zhang , Nanbin Zhao , Liping Huang , Rong Su

Categories: Computer Vision | Artificial Intelligence

2022-09-27

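Accurate vehicle type classification plays an important role in intelligent transportation systems. It is important for the authorities to understand road conditions, and such classification usually helps traffic light control systems respond accordingly to alleviate congestion. New technologies and comprehensive data sources, such as aerial photos and remote sensing data, provide richer, high-dimensional information. In addition, thanks to the rapid development of deep neural network technology, image-based vehicle classification methods can better extract underlying objective features when processing data. Recently, several deep learning models have been proposed to solve this problem. However, traditional purely convolutional approaches are limited in extracting global information, and complex environments, such as bad weather, seriously limit recognition capability. To improve vehicle type classification in complex environments, this study proposes a novel Densely Connected Convolutional Transformer in Transformer Neural Network (Dense-TNT) framework that stacks Densely Connected Convolutional Network (DenseNet) and Transformer in Transformer (TNT) layers. Vehicle data from three regions under four different weather conditions are used to evaluate recognition capability. Experiments show that the recognition ability of the proposed vehicle classification model degrades only slightly, even under heavy fog.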

Translated by Google Translate

AST-GIN: Attribute-Augmented Spatial-Temporal Graph Informer Network for Electric Vehicle Charging Station Availability Forecasting

Ruikang Luo , Yaofeng Song , Liping Huang , Yicheng Zhang , Rong Su

Category: Machine Learning

2022-09-07

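Forecasting electric vehicle (EV) charging demand and charging station availability is one of the challenges in intelligent transportation systems. With accurate prediction of EV station status, suitable charging behavior can be scheduled in advance to relieve range anxiety. Many existing deep learning methods have been proposed to address this problem. However, because of the complex road network structure and comprehensive external factors, such as points of interest (POIs) and weather effects, many commonly used algorithms can only extract historical usage information without considering the comprehensive influence of external factors. To improve prediction accuracy and interpretability, this study proposes the Attribute-Augmented Spatial-Temporal Graph Informer (AST-GIN) structure, which combines a Graph Convolutional Network (GCN) layer and an Informer layer to extract both the external and internal spatial-temporal dependence of relevant transportation data. The external factors are modeled as dynamic attributes by the attribute-augmented encoder for training. The AST-GIN model is tested on data collected in Dundee City, and experimental results show the effectiveness of our model, which accounts for the influence of external factors, over various horizon settings compared with other baselines.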
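As a rough illustration of the attribute-augmentation idea, the sketch below appends external attributes to each station's features and applies one standard graph-convolution step (symmetrically normalized adjacency with self-loops, followed by ReLU). The abstract does not say how the encoder combines attributes with usage data, so plain concatenation is an assumption here, the Informer layer is not modeled, and the names `augment`, `gcn_layer`, and the toy values are hypothetical.

```python
import math
from typing import List

Matrix = List[List[float]]


def normalized_adjacency(adj: Matrix) -> Matrix:
    """Return D^(-1/2) (A + I) D^(-1/2) for a square adjacency matrix."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of two row-major matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def augment(features: Matrix, attributes: Matrix) -> Matrix:
    """Assumed augmentation: concatenate external attributes to node features."""
    return [f + a for f, a in zip(features, attributes)]


def gcn_layer(adj: Matrix, features: Matrix, weights: Matrix) -> Matrix:
    """One graph-convolution step: propagate over neighbours, project, apply ReLU."""
    propagated = matmul(normalized_adjacency(adj), features)
    return [[max(0.0, v) for v in row] for row in matmul(propagated, weights)]


# Toy example: two connected charging stations.
adj = [[0.0, 1.0], [1.0, 0.0]]
usage = [[0.2], [0.8]]       # historical occupancy per station
weather = [[1.0], [0.0]]     # hypothetical external attribute (e.g. rain flag)
augmented = augment(usage, weather)
hidden = gcn_layer(adj, augmented, weights=[[1.0], [0.5]])
```

Each row of `hidden` mixes a station's own usage and attributes with those of its neighbour, which is the spatial part of the dependence the abstract refers to.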

Examining Political Rhetoric with Epistemic Stance Detection

Ankita Gupta , Su Lin Blodgett , Justin H Gross , Brendan O'Connor

Category: Natural Language Processing

2022-12-29

Participants in political discourse employ rhetorical strategies -- such as hedging, attributions, or denials -- to display varying degrees of belief commitments to claims proposed by themselves or others. Traditionally, political scientists have studied these epistemic phenomena through labor-intensive manual content analysis. We propose to help automate such work through epistemic stance prediction, drawn from research in computational semantics, to distinguish at the clausal level what is asserted, denied, or only ambivalently suggested by the author or other mentioned entities (belief holders). We first develop a simple RoBERTa-based model for multi-source stance predictions that outperforms more complex state-of-the-art modeling. Then we demonstrate its novel application to political science by conducting a large-scale analysis of the Mass Market Manifestos corpus of U.S. political opinion books, where we characterize trends in cited belief holders -- respected allies and opposed bogeymen -- across U.S. political ideologies.

Detection of Active Emergency Vehicles using Per-Frame CNNs and Output Smoothing

Meng Fan , Craig Bidstrup , Zhaoen Su , Jason Owens , Gary Yang , Nemanja Djuric

Category: Computer Vision

2022-12-28

While inferring common actor states (such as position or velocity) is an important and well-explored task of the perception system aboard a self-driving vehicle (SDV), it may not always provide sufficient information to the SDV. This is especially true in the case of active emergency vehicles (EVs), where light-based signals also need to be captured to provide a full context. We consider this problem and propose a sequential methodology for the detection of active EVs, using an off-the-shelf CNN model operating at a frame level and a downstream smoother that accounts for the temporal aspect of flashing EV lights. We also explore model improvements through data augmentation and training with additional hard samples.
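The two-stage idea described above (per-frame scores followed by a temporal smoother) can be sketched as follows. The abstract does not specify the smoother, so this sketch assumes a simple exponential moving average with a fixed threshold. `smooth_scores`, `detect_active`, `alpha`, and `threshold` are hypothetical names, and the hard-coded scores stand in for the output of any frame-level CNN.

```python
from typing import List


def smooth_scores(frame_scores: List[float], alpha: float = 0.3) -> List[float]:
    """Exponential moving average over per-frame scores (assumed smoother)."""
    smoothed: List[float] = []
    state = 0.0
    for i, score in enumerate(frame_scores):
        # Start from the first score, then blend each new score into the state.
        state = score if i == 0 else alpha * score + (1.0 - alpha) * state
        smoothed.append(state)
    return smoothed


def detect_active(frame_scores: List[float], alpha: float = 0.3,
                  threshold: float = 0.5) -> List[bool]:
    """Flag a frame as 'active EV' when its smoothed score reaches the threshold."""
    return [s >= threshold for s in smooth_scores(frame_scores, alpha)]


# Flashing lights make the raw per-frame score dip whenever the light is off.
raw = [0.9, 0.9, 0.2, 0.9, 0.9, 0.2]
per_frame = [s >= 0.5 for s in raw]   # flickers with the light
smoothed = detect_active(raw)         # stays stable across the dips
```

Thresholding the raw scores makes the decision flicker with the lights, while the smoothed decision stays constant over the same frames. A real system would tune the smoother to the flash rate of the lights.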

Social-Aware Clustered Federated Learning with Customized Privacy Preservation

Yuntao Wang , Zhou Su , Yanghe Pan , Tom H Luan , Ruidong Li , Shui Yu

Category: Machine Learning

2022-12-25

A key feature of federated learning (FL) is to preserve the data privacy of end users. However, there still exists potential privacy leakage in exchanging gradients under FL. As a result, recent research often explores differential privacy (DP) approaches that add noise to the computing results to address privacy concerns with low overheads, which, however, degrades model performance. In this paper, we strike a balance between data privacy and efficiency by utilizing the pervasive social connections between users. Specifically, we propose SCFL, a novel Social-aware Clustered Federated Learning scheme, where mutually trusted individuals can freely form a social cluster and aggregate their raw model updates (e.g., gradients) inside each cluster before uploading to the cloud for global aggregation. By mixing model updates in a social group, adversaries can only eavesdrop on the social-layer combined results, but not the private information of individuals. We unfold the design of SCFL in three steps. i) Stable social cluster formation. Considering users' heterogeneous training samples and data distributions, we formulate the optimal social cluster formation problem as a federation game and devise a fair revenue allocation mechanism to resist free-riders. ii) Differentiated trust-privacy mapping. For the clusters with low mutual trust, we design a customizable privacy preservation mechanism to adaptively sanitize participants' model updates depending on social trust degrees. iii) Distributed convergence. A distributed two-sided matching algorithm is devised to attain an optimized disjoint partition with Nash-stable convergence. Experiments on the Facebook network and MNIST/CIFAR-10 datasets validate that our SCFL can effectively enhance learning utility, improve user payoff, and enforce customizable privacy protection.
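The in-cluster mixing step can be illustrated with a minimal sketch: each cluster averages its members' updates locally, and only that combined vector is uploaded for global aggregation. The size-weighted averaging used here is an assumption (the abstract does not give the aggregation rule), the cluster-formation game and trust-dependent sanitization are not modeled, and all names are hypothetical.

```python
from typing import Dict, List

Vector = List[float]


def mean_update(updates: List[Vector]) -> Vector:
    """Average a list of equally sized model updates element-wise."""
    n = len(updates)
    return [sum(vals) / n for vals in zip(*updates)]


def cluster_then_global(clusters: Dict[str, List[Vector]]) -> Vector:
    """Aggregate inside each cluster first, then combine clusters by size."""
    dim = len(next(iter(clusters.values()))[0])
    total = [0.0] * dim
    count = 0
    for members in clusters.values():
        # Only this combined vector leaves the cluster; individual
        # updates are never sent to the cloud in this sketch.
        combined = mean_update(members)
        size = len(members)
        total = [t + size * c for t, c in zip(total, combined)]
        count += size
    return [t / count for t in total]


clusters = {
    "cluster_a": [[1.0, 2.0], [3.0, 4.0]],  # two mutually trusting users
    "cluster_b": [[5.0, 6.0]],              # a single-user cluster
}
global_update = cluster_then_global(clusters)
```

With size weighting, the global result equals the plain average over all users, so mixing inside clusters changes what an eavesdropper can observe without changing the aggregate. A single-user cluster gains no mixing protection in this sketch.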

Hybrid Representation Learning for Cognitive Diagnosis in Late-Life Depression Over 5 Years with Structural MRI

Lintao Zhang , Lihong Wang , Minhui Yu , Rong Wu , David C. Steffens , Guy G. Potter , Mingxia Liu

Category: Computer Vision

2022-12-24

Late-life depression (LLD) is a highly prevalent mood disorder occurring in older adults and is frequently accompanied by cognitive impairment (CI). Studies have shown that LLD may increase the risk of Alzheimer's disease (AD). However, the heterogeneity of presentation of geriatric depression suggests that multiple biological mechanisms may underlie it. Current biological research on LLD progression incorporates machine learning that combines neuroimaging data with clinical observations. There are few studies on incident cognitive diagnostic outcomes in LLD based on structural MRI (sMRI). In this paper, we describe the development of a hybrid representation learning (HRL) framework for predicting cognitive diagnosis over 5 years based on T1-weighted sMRI data. Specifically, we first extract prediction-oriented MRI features via a deep neural network, and then integrate them with handcrafted MRI features via a Transformer encoder for cognitive diagnosis prediction. Two tasks are investigated in this work, including (1) identifying cognitively normal subjects with LLD and never-depressed older healthy subjects, and (2) identifying LLD subjects who developed CI (or even AD) and those who stayed cognitively normal over five years. To the best of our knowledge, this is among the first attempts to study the complex heterogeneous progression of LLD based on task-oriented and handcrafted MRI features. We validate the proposed HRL on 294 subjects with T1-weighted MRIs from two clinically harmonized studies. Experimental results suggest that the HRL outperforms several classical machine learning and state-of-the-art deep learning methods in LLD identification and prediction tasks.

MURPHY: Relations Matter in Surgical Workflow Analysis

Shang Zhao , Yanzhe Liu , Qiyuan Wang , Dai Sun , Rong Liu , S. Kevin Zhou

Category: Computer Vision

2022-12-24

Autonomous robotic surgery has advanced significantly based on analysis of visual and temporal cues in surgical workflow, but relational cues from domain knowledge remain under investigation. Complex relations in surgical annotations can be divided into intra- and inter-relations, both valuable to autonomous systems for comprehending surgical workflows. Intra- and inter-relations describe the relevance of various categories within a particular annotation type and the relevance of different annotation types, respectively. This paper aims to systematically investigate the importance of relational cues in surgery. First, we contribute the RLLS12M dataset, a large-scale collection of robotic left lateral sectionectomy (RLLS), by curating 50 videos of 50 patients operated on by 5 surgeons and annotating a hierarchical workflow, which consists of 3 inter- and 6 intra-relations, 6 steps, 15 tasks, and 38 activities represented as the triplet of 11 instruments, 8 actions, and 16 objects, totaling 2,113,510 video frames and 12,681,060 annotation entities. Correspondingly, we propose a multi-relation purification hybrid network (MURPHY), which aptly incorporates novel relation modules to augment the feature representation by purifying relational features using the intra- and inter-relations embodied in annotations. The intra-relation module leverages an R-GCN to implant visual features in different graph relations, which are aggregated using a targeted relation purification with affinity information measuring label consistency and feature similarity. The inter-relation module is motivated by attention mechanisms to regularize the influence of relational features based on the hierarchy of annotation types from the domain knowledge. Extensive experimental results on the curated RLLS dataset confirm the effectiveness of our approach, demonstrating that relations matter in surgical workflow analysis.

DuAT: Dual-Aggregation Transformer Network for Medical Image Segmentation

Feilong Tang , Qiming Huang , Jinfeng Wang , Xianxu Hou , Jionglong Su , Jingxin Liu

Category: Computer Vision

2022-12-21

Transformer-based models have been widely demonstrated to be successful in computer vision tasks by modelling long-range dependencies and capturing global representations. However, they are often dominated by features of large patterns, leading to the loss of local details (e.g., boundaries and small objects), which are critical in medical image segmentation. To alleviate this problem, we propose a Dual-Aggregation Transformer Network called DuAT, which is characterized by two innovative designs, namely, the Global-to-Local Spatial Aggregation (GLSA) and Selective Boundary Aggregation (SBA) modules. The GLSA module aggregates and represents both global and local spatial features, which are beneficial for locating large and small objects, respectively. The SBA module aggregates the boundary characteristics from low-level features and semantic information from high-level features to better preserve boundary details and locate the re-calibration objects. Extensive experiments on six benchmark datasets demonstrate that our proposed model outperforms state-of-the-art methods in the segmentation of skin lesion images and polyps in colonoscopy images. In addition, our approach is more robust than existing methods in various challenging situations such as small object segmentation and ambiguous object boundaries.

Mining User-aware Multi-Relations for Fake News Detection in Large Scale Online Social Networks

Xing Su , Jian Yang , Jia Wu , Yuchen Zhang

Category: Artificial Intelligence

2022-12-21

Users' involvement in creating and propagating news is a vital aspect of fake news detection in online social networks. Intuitively, credible users are more likely to share trustworthy news, while untrusted users have a higher probability of spreading untrustworthy news. In this paper, we construct a dual-layer graph (i.e., the news layer and the user layer) to extract multiple relations of news and users in social networks to derive rich information for detecting fake news. Based on the dual-layer graph, we propose a fake news detection model named Us-DeFake. It learns the propagation features of news in the news layer and the interaction features of users in the user layer. Through the inter-layer in the graph, Us-DeFake fuses the user signals that contain credibility information into the news features, to provide distinctive user-aware embeddings of news for fake news detection. The training process is conducted on multiple dual-layer subgraphs obtained by a graph sampler to scale Us-DeFake to large-scale social networks. Extensive experiments on real-world datasets illustrate the superiority of Us-DeFake, which outperforms all baselines, and show that the users' credibility signals learned from interaction relations can notably improve the performance of our model.

Privacy-Preserving Domain Adaptation of Semantic Parsers

Fatemehsadat Mireshghallah , Richard Shin , Yu Su , Tatsunori Hashimoto , Jason Eisner

Category: Natural Language Processing

2022-12-20

Task-oriented dialogue systems often assist users with personal or confidential matters. For this reason, the developers of such a system are generally prohibited from observing actual usage. So how can they know where the system is failing and needs more training data or new functionality? In this work, we study ways in which realistic user utterances can be generated synthetically, to help increase the linguistic and functional coverage of the system, without compromising the privacy of actual users. To this end, we propose a two-stage Differentially Private (DP) generation method which first generates latent semantic parses, and then generates utterances based on the parses. Our proposed approach improves MAUVE by 3.8× and parse tree node-type overlap by 1.4× relative to current approaches for private synthetic data generation, improving on both fluency and semantic coverage. We further validate our approach on a realistic domain adaptation task of adding new functionality from private user data to a semantic parser, and show gains of 1.3× on its accuracy with the new feature.
