Generative models have been widely applied to solve extractive tasks, where parts of the input is extracted to form the desired output, and achieved significant success. For example, in extractive question answering (QA), generative models have constantly yielded state-of-the-art results. In this work, we identify the issue of tokenization inconsistency that is commonly neglected in training these models. This issue damages the extractive nature of these tasks after the input and output are tokenized inconsistently by the tokenizer, and thus leads to performance drop as well as hallucination. We propose a simple yet effective fix to this issue and conduct a case study on extractive QA. We show that, with consistent tokenization, the model performs better in both in-domain and out-of-domain datasets, with a notable average of +1.7 F2 gain when a BART model is trained on SQuAD and evaluated on 8 QA datasets. Further, the model converges faster, and becomes less likely to generate out-of-context answers. With these findings, we would like to call for more attention on how tokenization should be done when solving extractive tasks and recommend applying consistent tokenization during training.
translated by 谷歌翻译
There has been great progress in unifying various table-to-text tasks using a single encoder-decoder model trained via multi-task learning (Xie et al., 2022). However, existing methods typically encode task information with a simple dataset name as a prefix to the encoder. This not only limits the effectiveness of multi-task learning, but also hinders the model's ability to generalize to new domains or tasks that were not seen during training, which is crucial for real-world applications. In this paper, we propose compositional task configurations, a set of prompts prepended to the encoder to improve cross-task generalization of unified models. We design the task configurations to explicitly specify the task type, as well as its input and output types. We show that this not only allows the model to better learn shared knowledge across different tasks at training, but also allows us to control the model by composing new configurations that apply novel input-output combinations in a zero-shot manner. We demonstrate via experiments over ten table-to-text tasks that our method outperforms the UnifiedSKG baseline by noticeable margins in both in-domain and zero-shot settings, with average improvements of +0.5 and +12.6 from using a T5-large backbone, respectively.
translated by 谷歌翻译
This paper focuses on the prevalent performance imbalance in the stages of incremental learning. To avoid obvious stage learning bottlenecks, we propose a brand-new stage-isolation based incremental learning framework, which leverages a series of stage-isolated classifiers to perform the learning task of each stage without the interference of others. To be concrete, to aggregate multiple stage classifiers as a uniform one impartially, we first introduce a temperature-controlled energy metric for indicating the confidence score levels of the stage classifiers. We then propose an anchor-based energy self-normalization strategy to ensure the stage classifiers work at the same energy level. Finally, we design a voting-based inference augmentation strategy for robust inference. The proposed method is rehearsal free and can work for almost all continual learning scenarios. We evaluate the proposed method on four large benchmarks. Extensive results demonstrate the superiority of the proposed method in setting up new state-of-the-art overall performance. \emph{Code is available at} \url{https://github.com/iamwangyabin/ESN}.
translated by 谷歌翻译
Adversarial training is one of the most powerful methods to improve the robustness of pre-trained language models (PLMs). However, this approach is typically more expensive than traditional fine-tuning because of the necessity to generate adversarial examples via gradient descent. Delving into the optimization process of adversarial training, we find that robust connectivity patterns emerge in the early training phase (typically $0.15\sim0.3$ epochs), far before parameters converge. Inspired by this finding, we dig out robust early-bird tickets (i.e., subnetworks) to develop an efficient adversarial training method: (1) searching for robust tickets with structured sparsity in the early stage; (2) fine-tuning robust tickets in the remaining time. To extract the robust tickets as early as possible, we design a ticket convergence metric to automatically terminate the searching process. Experiments show that the proposed efficient adversarial training method can achieve up to $7\times \sim 13 \times$ training speedups while maintaining comparable or even better robustness compared to the most competitive state-of-the-art adversarial training methods.
translated by 谷歌翻译
基于学习的多视图立体声(MVS)方法取得了令人印象深刻的进步,并且近年来超越了传统方法。但是,它们的准确性和完整性仍在挣扎。在本文中,我们提出了一种新方法,以增强受对比度学习和功能匹配启发的现有网络的性能。首先,我们提出了一个对比匹配损失(CML),该损失将正确的匹配点视为正样品,将正确的匹配点视为正样本,并将其他点视为阴性样本,并根据特征的相似性计算对比度损失。我们进一步提出了一个加权局灶性损失(WFL),以提高分类能力,从而削弱了根据预测的置信度,在不重要的区域中低信任像素对损失的贡献。在DTU,坦克和寺庙和混合MVS数据集上进行的广泛实验表明,我们的方法可实现最先进的性能,并在基线网络上取得了重大改进。
translated by 谷歌翻译
大型语言模型在各种问题答案(QA)基准测试方面取得了高度的性能,但其产出的解释性仍然难以捉摸。最近建议将结构化的解释称为“综合树”,以解释和检查质量检查系统的答案。为了更好地生成此类树木,我们提出了一种称为迭代检索生成推理​​器(IRGR)的架构。我们的模型能够通过系统地生成文本前提的分步解释来解释给定的假设。 IRGR模型迭代地搜索合适的场所,一次构建单个零件步骤。与以前的方法相反,我们的方法结合了生成步骤和房屋的检索,允许模型利用中间结论,并减轻基线编码器模型的输入大小限制。我们使用IntailmentBank数据集进行实验,在该数据集中,我们在前提检索和索引树上的现有基准优于现有的基准,总体正确性增长了约300%。
translated by 谷歌翻译
光保护综合技术的快速进展达到了真实和操纵图像之间的边界开始模糊的临界点。最近,一个由Mega-Scale Deep Face Forgery DataSet,由290万个图像组成和221,247个视频的伪造网络已被释放。它是迄今为止的数据规模,操纵(7个图像级别方法,8个视频级别方法),扰动(36个独立和更混合的扰动)和注释(630万个分类标签,290万操纵区域注释和221,247个时间伪造段标签)。本文报告了Forgerynet-Face Forgery Analysis挑战2021的方法和结果,它采用了伪造的基准。模型评估在私人测试集上执行离线。共有186名参加比赛的参与者,11名队伍提交了有效的提交。我们将分析排名排名的解决方案,并展示一些关于未来工作方向的讨论。
translated by 谷歌翻译
与基于现代聚类算法的完全监督的REID方法相比,未经监督的人重新识别(U-Reid)最近达到了竞争性能。然而,这种基于聚类的方案对大规模数据集来说变得对计算方式。如何探讨如何有效利用具有有限计算资源的无限未标记的数据,以便更好地进行更好的U-Reid。在本文中,我们首次尝试大规模U-Reid并提出一个“大型任务的小数据”范式被称为Meta聚类学习(MCL)。 MCL仅通过群集伪标记整个未标记数据的子集,以节省第一期训练的计算。之后,被学习的集群中心称为我们的MCL中的元原型,被视为代理注释器,以便轻松注释其它未标记数据以进一步抛光模型。为了缓解抛光阶段的潜在嘈杂的标签问题,我们强制执行两个精心设计的损失限制,以保证境内统一的一致性和相互识别的强烈相关性。对于多个广泛使用的U-REID基准测试,我们的方法显着节省了计算成本,同时与先前作品相比,实现了可比或更好的性能。
translated by 谷歌翻译
布换人员重新识别(CC-REID)旨在在长时间匹配不同地点的同一个人,例如,超过日子,因此不可避免地满足换衣服的挑战。在本文中,我们专注于处理更具有挑战性的环境下的CC-Reid问题,即,只有一个图像,它可以实现高效和延迟的行人确定实时监控应用。具体而言,我们将步态识别作为辅助任务来驱动图像Reid模型来通过利用个人独特和独立布的步态信息来学习布不可知的表现,我们将此框架命名为Gi-Reid。 Gi-Reid采用两流架构,该架构由图像Reid-Stream和辅助步态识别流(步态流)组成。在推理的高计算效率中丢弃的步态流充当调节器,以鼓励在训练期间捕获捕获布不变的生物识别运动特征。为了从单个图像获取时间连续运动提示,我们设计用于步态流的步态序列预测(GSP)模块,以丰富步态信息。最后,为有效的知识正则化强制执行两个流的高级语义一致性。基于多种图像的布更换Reid基准测试的实验,例如LTCC,PRCC,Real28和VC衣服,证明了GI-REID对最先进的人来说。代码在https://github.com/jinx-ustc/gi -reid提供。
translated by 谷歌翻译
The ImageNet Large Scale Visual Recognition Challenge is a benchmark in object category classification and detection on hundreds of object categories and millions of images. The challenge has been run annually from 2010 to present, attracting participation from more than fifty institutions. This paper describes the creation of this benchmark dataset and the advances in object recognition that have been possible as a result. We discuss the chal-
translated by 谷歌翻译