准确的车辆类型分类在智能运输系统中起重要作用。对于统治者而言,重要的是要了解道路状况,通常为交通灯控制系统的贡献,以相应地响应以减轻交通拥堵。新技术和全面数据源,例如航空照片和遥感数据,提供了更丰富,高维的信息。同样,由于深度神经网络技术的快速发展,基于图像的车辆分类方法可以在处理数据时更好地提取基本的客观特征。最近,已经提出了几种深度学习模型来解决该问题。但是,基于纯卷积的传统方法对全球信息提取有限制,而复杂的环境(例如恶劣的天气)严重限制了识别能力。为了在复杂环境下提高车辆类型的分类能力,本研究提出了一种新型连接的卷积变压器在变压器神经网络(密度TNT)框架中,通过堆叠密集连接的卷积网络(Densenet)和变压器(TNT)(TNT)(TNT)(TNT )层。部署了三个区域的数据和四个不同的天气条件以评估识别能力。实验发现,即使在严重的雾气天气条件下,我们提出的车辆分类模型的识别能力也很少。
translated by 谷歌翻译
Offline reinforcement learning (RL) enables the agent to effectively learn from logged data, which significantly extends the applicability of RL algorithms in real-world scenarios where exploration can be expensive or unsafe. Previous works have shown that extracting primitive skills from the recurring and temporally extended structures in the logged data yields better learning. However, these methods suffer greatly when the primitives have limited representation ability to recover the original policy space, especially in offline settings. In this paper, we give a quantitative characterization of the performance of offline hierarchical learning and highlight the importance of learning lossless primitives. To this end, we propose to use a \emph{flow}-based structure as the representation for low-level policies. This allows us to represent the behaviors in the dataset faithfully while keeping the expression ability to recover the whole policy space. We show that such lossless primitives can drastically improve the performance of hierarchical policies. The experimental results and extensive ablation studies on the standard D4RL benchmark show that our method has a good representation ability for policies and achieves superior performance in most tasks.
translated by 谷歌翻译
Credit assignment problem of neural networks refers to evaluating the credit of each network component to the final outputs. For an untrained neural network, approaches to tackling it have made great contributions to parameter update and model revolution during the training phase. This problem on trained neural networks receives rare attention, nevertheless, it plays an increasingly important role in neural network patch, specification and verification. Based on Koopman operator theory, this paper presents an alternative perspective of linear dynamics on dealing with the credit assignment problem for trained neural networks. Regarding a neural network as the composition of sub-dynamics series, we utilize step-delay embedding to capture snapshots of each component, characterizing the established mapping as exactly as possible. To circumvent the dimension-difference problem encountered during the embedding, a composition and decomposition of an auxiliary linear layer, termed minimal linear dimension alignment, is carefully designed with rigorous formal guarantee. Afterwards, each component is approximated by a Koopman operator and we derive the Jacobian matrix and its corresponding determinant, similar to backward propagation. Then, we can define a metric with algebraic interpretability for the credit assignment of each network component. Moreover, experiments conducted on typical neural networks demonstrate the effectiveness of the proposed method.
translated by 谷歌翻译
When facing changing environments in the real world, the lightweight model on client devices suffers from severe performance drops under distribution shifts. The main limitations of the existing device model lie in (1) unable to update due to the computation limit of the device, (2) the limited generalization ability of the lightweight model. Meanwhile, recent large models have shown strong generalization capability on the cloud while they can not be deployed on client devices due to poor computation constraints. To enable the device model to deal with changing environments, we propose a new learning paradigm of Cloud-Device Collaborative Continual Adaptation, which encourages collaboration between cloud and device and improves the generalization of the device model. Based on this paradigm, we further propose an Uncertainty-based Visual Prompt Adapted (U-VPA) teacher-student model to transfer the generalization capability of the large model on the cloud to the device model. Specifically, we first design the Uncertainty Guided Sampling (UGS) to screen out challenging data continuously and transmit the most out-of-distribution samples from the device to the cloud. Then we propose a Visual Prompt Learning Strategy with Uncertainty guided updating (VPLU) to specifically deal with the selected samples with more distribution shifts. We transmit the visual prompts to the device and concatenate them with the incoming data to pull the device testing distribution closer to the cloud training distribution. We conduct extensive experiments on two object detection datasets with continually changing environments. Our proposed U-VPA teacher-student framework outperforms previous state-of-the-art test time adaptation and device-cloud collaboration methods. The code and datasets will be released.
translated by 谷歌翻译
Multi-hop Question Answering over Knowledge Graph~(KGQA) aims to find the answer entities that are multiple hops away from the topic entities mentioned in a natural language question on a large-scale Knowledge Graph (KG). To cope with the vast search space, existing work usually adopts a two-stage approach: it firstly retrieves a relatively small subgraph related to the question and then performs the reasoning on the subgraph to accurately find the answer entities. Although these two stages are highly related, previous work employs very different technical solutions for developing the retrieval and reasoning models, neglecting their relatedness in task essence. In this paper, we propose UniKGQA, a novel approach for multi-hop KGQA task, by unifying retrieval and reasoning in both model architecture and parameter learning. For model architecture, UniKGQA consists of a semantic matching module based on a pre-trained language model~(PLM) for question-relation semantic matching, and a matching information propagation module to propagate the matching information along the edges on KGs. For parameter learning, we design a shared pre-training task based on question-relation matching for both retrieval and reasoning models, and then propose retrieval- and reasoning-oriented fine-tuning strategies. Compared with previous studies, our approach is more unified, tightly relating the retrieval and reasoning stages. Extensive experiments on three benchmark datasets have demonstrated the effectiveness of our method on the multi-hop KGQA task. Our codes and data are publicly available at https://github.com/RUCAIBox/UniKGQA.
translated by 谷歌翻译
We propose the tensorizing flow method for estimating high-dimensional probability density functions from the observed data. The method is based on tensor-train and flow-based generative modeling. Our method first efficiently constructs an approximate density in the tensor-train form via solving the tensor cores from a linear system based on the kernel density estimators of low-dimensional marginals. We then train a continuous-time flow model from this tensor-train density to the observed empirical distribution by performing a maximum likelihood estimation. The proposed method combines the optimization-less feature of the tensor-train with the flexibility of the flow-based generative models. Numerical results are included to demonstrate the performance of the proposed method.
translated by 谷歌翻译
Label Shift has been widely believed to be harmful to the generalization performance of machine learning models. Researchers have proposed many approaches to mitigate the impact of the label shift, e.g., balancing the training data. However, these methods often consider the underparametrized regime, where the sample size is much larger than the data dimension. The research under the overparametrized regime is very limited. To bridge this gap, we propose a new asymptotic analysis of the Fisher Linear Discriminant classifier for binary classification with label shift. Specifically, we prove that there exists a phase transition phenomenon: Under certain overparametrized regime, the classifier trained using imbalanced data outperforms the counterpart with reduced balanced data. Moreover, we investigate the impact of regularization to the label shift: The aforementioned phase transition vanishes as the regularization becomes strong.
translated by 谷歌翻译
Neural networks are susceptible to data inference attacks such as the membership inference attack, the adversarial model inversion attack and the attribute inference attack, where the attacker could infer useful information such as the membership, the reconstruction or the sensitive attributes of a data sample from the confidence scores predicted by the target classifier. In this paper, we propose a method, namely PURIFIER, to defend against membership inference attacks. It transforms the confidence score vectors predicted by the target classifier and makes purified confidence scores indistinguishable in individual shape, statistical distribution and prediction label between members and non-members. The experimental results show that PURIFIER helps defend membership inference attacks with high effectiveness and efficiency, outperforming previous defense methods, and also incurs negligible utility loss. Besides, our further experiments show that PURIFIER is also effective in defending adversarial model inversion attacks and attribute inference attacks. For example, the inversion error is raised about 4+ times on the Facescrub530 classifier, and the attribute inference accuracy drops significantly when PURIFIER is deployed in our experiment.
translated by 谷歌翻译
Cross-Lingual Summarization (CLS) aims at generating summaries in one language for the given documents in another language. CLS has attracted wide research attention due to its practical significance in the multi-lingual world. Though great contributions have been made, existing CLS works typically focus on short documents, such as news articles, short dialogues and guides. Different from these short texts, long documents such as academic articles and business reports usually discuss complicated subjects and consist of thousands of words, making them non-trivial to process and summarize. To promote CLS research on long documents, we construct Perseus, the first long-document CLS dataset which collects about 94K Chinese scientific documents paired with English summaries. The average length of documents in Perseus is more than two thousand tokens. As a preliminary study on long-document CLS, we build and evaluate various CLS baselines, including pipeline and end-to-end methods. Experimental results on Perseus show the superiority of the end-to-end baseline, outperforming the strong pipeline models equipped with sophisticated machine translation systems. Furthermore, to provide a deeper understanding, we manually analyze the model outputs and discuss specific challenges faced by current approaches. We hope that our work could benchmark long-document CLS and benefit future studies.
translated by 谷歌翻译
Discriminative pre-trained language models (PLMs) learn to predict original texts from intentionally corrupted ones. Taking the former text as positive and the latter as negative samples, the PLM can be trained effectively for contextualized representation. However, the training of such a type of PLMs highly relies on the quality of the automatically constructed samples. Existing PLMs simply treat all corrupted texts as equal negative without any examination, which actually lets the resulting model inevitably suffer from the false negative issue where training is carried out on pseudo-negative data and leads to less efficiency and less robustness in the resulting PLMs. In this work, on the basis of defining the false negative issue in discriminative PLMs that has been ignored for a long time, we design enhanced pre-training methods to counteract false negative predictions and encourage pre-training language models on true negatives by correcting the harmful gradient updates subject to false negative predictions. Experimental results on GLUE and SQuAD benchmarks show that our counter-false-negative pre-training methods indeed bring about better performance together with stronger robustness.
translated by 谷歌翻译