Test-time augmentation -- the aggregation of predictions across augmented versions of a test input -- is an established technique for improving the performance of image classification models. Importantly, TTA can improve model performance post hoc, without additional training. Although test-time augmentation (TTA) can in principle be applied to any data modality, its adoption in NLP has been limited, in part due to the difficulty of identifying label-preserving transformations. In this paper, we present augmentation policies that yield significant accuracy improvements with language models. A key finding is that the design of the augmentation policy (e.g., the number of samples generated from a single non-deterministic augmentation) has a substantial effect on the benefit of TTA. Experiments across binary classification tasks and datasets show that test-time augmentation yields consistent improvements over current state-of-the-art methods.
Analysis of single-cell transcriptomics often relies on clustering cells and then performing differential gene expression (DGE) analysis to identify genes that vary between these clusters. These discrete analyses successfully determine cell types and markers; however, continuous variation within and between cell types may go undetected. We propose three topologically motivated mathematical methods for unsupervised feature selection that consider discrete and continuous transcriptional patterns simultaneously at multiple scales. Eigenscores ($\mathrm{eig}_i$) rank genes by their correspondence to low-frequency intrinsic patterning in the data, based on the spectral decomposition of the graph Laplacian. The multiscale Laplacian score (MLS) is an unsupervised method for locating relevant scales in the data and selecting genes that are coherently expressed at those scales. The persistent Rayleigh quotient (PRQ) takes data equipped with a filtration, allowing the separation of genes that play different roles in a bifurcation process (e.g., pseudotime). We demonstrate the utility of these techniques by applying them to published single-cell transcriptomics datasets. The methods validate previously identified genes and detect additional genes with coherent expression patterns. By studying the interaction between gene signals and the geometry of the underlying space, the three methods give multidimensional rankings of genes and visualizations of the relationships between them.
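The eigenscore idea -- scoring a gene by how much of its expression signal lies in the low-frequency eigenvectors of the cell graph's Laplacian -- can be sketched as follows. This is an assumed simplification for illustration, not the paper's exact formula; the adjacency matrix, gene matrix, and the energy-fraction score are all toy choices.

```python
import numpy as np

def eigenscores(expr, adj, k=3):
    """Score each gene (column of expr) by the fraction of its energy
    lying in the k lowest-frequency eigenvectors of the graph Laplacian.
    expr: (cells, genes) expression matrix; adj: symmetric cell-cell adjacency."""
    deg = np.diag(adj.sum(axis=1))
    lap = deg - adj                      # combinatorial graph Laplacian
    evals, evecs = np.linalg.eigh(lap)   # eigenvalues sorted ascending
    low = evecs[:, :k]                   # low-frequency (smooth) modes
    # Centre and normalise each gene's expression over cells.
    x = expr - expr.mean(axis=0)
    x = x / (np.linalg.norm(x, axis=0) + 1e-12)
    # Energy of each gene captured by the smooth modes.
    proj = low.T @ x                     # (k, genes)
    return (proj ** 2).sum(axis=0)
```

On a toy two-clique cell graph, a gene that tracks the cluster split scores higher than a gene oscillating cell-to-cell, matching the intuition that low-frequency patterns are biologically coherent.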
Incorporating persona information allows diverse and engaging responses in dialogue response generation. Unfortunately, prior work has mostly focused on self personas and has neglected the value of partner personas. Moreover, in practical applications, ground-truth partner personas are often unavailable. This paper attempts to tackle these issues with a novel framework that leverages automatic partner persona generation to enhance successive dialogue generation. We incorporate reinforcement learning with a dedicatedly designed critic network for reward judgment. Experimental results from automatic and human evaluation demonstrate that a) our framework is capable of generating relevant and informative partner personas, even compared to the ground-truth partner personas; b) the generated partner personas enhance subsequent response generation, outperforming our baselines and comparison models when partner personas are missing at the inference stage; c) our framework generates responses during inference that are more informative and engaging than our baseline conditioned on ground-truth partner personas; and d) our dedicatedly designed critic network reinforces our framework effectively. Finally, our framework offers better explainability and reduces the demand for external databases of partner personas.
Temporal reasoning is the task of predicting temporal relations of event pairs with corresponding contexts. While some temporal reasoning models perform reasonably well on in-domain benchmarks, we have little idea of the systems' generalizability due to existing datasets' limitations. In this work, we introduce a novel task named TODAY that bridges this gap with temporal differential analysis, which as the name suggests, evaluates if systems can correctly understand the effect of incremental changes. Specifically, TODAY makes slight context changes for given event pairs, and systems need to tell how this subtle contextual change will affect temporal relation distributions. To facilitate learning, TODAY also annotates human explanations. We show that existing models, including GPT-3, drop to random guessing on TODAY, suggesting that they heavily rely on spurious information rather than proper reasoning for temporal predictions. On the other hand, we show that TODAY's supervision style and explanation annotations can be used in joint learning, encouraging models to use more appropriate signals during training and to outperform across several benchmarks. TODAY can also be used to train models to solicit incidental supervision from noisy sources such as GPT-3, moving farther towards generic temporal reasoning systems.
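The core differential check -- does a model's relation distribution move in the annotated direction after a small context edit -- can be expressed as a few lines. This is a toy illustration of the idea, not TODAY's actual evaluation metric; the dict-based distribution format and the "largest gain" criterion are assumptions for the sketch.

```python
def differential_correct(p_before, p_after, annotated_shift):
    """Check whether a model's temporal-relation distributions move in the
    direction annotated for a context edit (toy differential analysis).
    Distributions are dicts over relations, e.g. {"before": .., "after": ..};
    annotated_shift names the relation the edit should make likelier."""
    delta = {r: p_after[r] - p_before[r] for r in p_before}
    # The annotated relation must gain the most probability mass.
    return max(delta, key=delta.get) == annotated_shift
```

A system that ignores the edited context produces identical distributions before and after, so it fails this check at chance rate, which is the failure mode the abstract reports for existing models.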
As the size of the dataset used in deep learning tasks increases, the noisy label problem, which is a task of making deep learning robust to the incorrectly labeled data, has become an important task. In this paper, we propose a method of learning noisy label data using the label noise selection with test-time augmentation (TTA) cross-entropy and classifier learning with the NoiseMix method. In the label noise selection, we propose TTA cross-entropy by measuring the cross-entropy to predict the test-time augmented training data. In the classifier learning, we propose the NoiseMix method based on MixUp and BalancedMix methods by mixing the samples from the noisy and the clean label data. In experiments on the ISIC-18 public skin lesion diagnosis dataset, the proposed TTA cross-entropy outperformed the conventional cross-entropy and the TTA uncertainty in detecting label noise data in the label noise selection process. Moreover, the proposed NoiseMix not only outperformed the state-of-the-art methods in the classification performance but also showed the most robustness to the label noise in the classifier learning.
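The label noise selection step can be sketched as follows: compute the cross-entropy of each training label against the model's prediction averaged over test-time augmented copies, and flag high-cross-entropy samples as likely noise. This is a minimal sketch, not the paper's pipeline; `model_predict`, `augment`, and the fixed threshold rule are hypothetical stand-ins.

```python
import math
import random

def tta_cross_entropy(model_predict, augment, x, label, num_aug=8):
    """Cross-entropy of a sample's given label against the model's
    prediction averaged over test-time augmented copies of the sample."""
    preds = [model_predict(augment(x)) for _ in range(num_aug)]
    mean = [sum(p[c] for p in preds) / num_aug for c in range(len(preds[0]))]
    return -math.log(mean[label] + 1e-12)

def select_noisy(samples, model_predict, augment, threshold):
    """Flag samples whose TTA cross-entropy exceeds a threshold as
    likely label noise (one simple selection rule)."""
    return [i for i, (x, y) in enumerate(samples)
            if tta_cross_entropy(model_predict, augment, x, y) > threshold]
```

A mislabeled sample keeps a high cross-entropy even after averaging over augmentations, while a correctly labeled one does not, which is the separation the selection step relies on.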
Uncertainty estimation of the trained deep learning network provides important information for improving the learning efficiency or evaluating the reliability of the network prediction. In this paper, we propose a method for the uncertainty estimation for multi-class image classification using test-time mixup augmentation (TTMA). To improve the discrimination ability between the correct and incorrect prediction of the existing aleatoric uncertainty, we propose the data uncertainty by applying the mixup augmentation on the test data and measuring the entropy of the histogram of predicted labels. In addition to the data uncertainty, we propose a class-specific uncertainty presenting the aleatoric uncertainty associated with the specific class, which can provide information on the class confusion and class similarity of the trained network. The proposed methods are validated on two public datasets, the ISIC-18 skin lesion diagnosis dataset, and the CIFAR-100 real-world image classification dataset. The experiments demonstrate that (1) the proposed data uncertainty better separates the correct and incorrect prediction than the existing uncertainty measures thanks to the mixup perturbation, and (2) the proposed class-specific uncertainty provides information on the class confusion and class similarity of the trained network for both datasets.
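The proposed data uncertainty -- mixup applied to a test sample, then the entropy of the histogram of predicted labels over the mixed copies -- can be sketched as follows. This is an assumed simplification of the TTMA idea; `model_predict_label`, the mixing pool, and the Beta parameter are illustrative choices, not the paper's configuration.

```python
import math
import random

def ttma_uncertainty(model_predict_label, x, pool, num_mix=32, alpha=0.3):
    """Data uncertainty via test-time mixup: mix the test sample with
    random pool samples and measure the entropy of the histogram of
    predicted labels over the mixed copies."""
    counts = {}
    for _ in range(num_mix):
        lam = random.betavariate(alpha, alpha)     # mixup coefficient
        partner = random.choice(pool)
        mixed = [lam * a + (1 - lam) * b for a, b in zip(x, partner)]
        y = model_predict_label(mixed)
        counts[y] = counts.get(y, 0) + 1
    probs = [c / num_mix for c in counts.values()]
    return -sum(p * math.log(p) for p in probs)
```

A sample whose mixed copies all receive the same label yields zero entropy; a sample near a decision boundary yields labels that flip under mixing, and hence higher entropy -- the separation between correct and incorrect predictions the abstract describes.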
This paper examines the encoding of analogy in large-scale pretrained language models, such as BERT and GPT-2. Existing analogy datasets typically focus on a limited set of analogical relations, with a high similarity of the two domains between which the analogy holds. As a more realistic setup, we introduce the Scientific and Creative Analogy dataset (SCAN), a novel analogy dataset containing systematic mappings of multiple attributes and relational structures across dissimilar domains. Using this dataset, we test the analogical reasoning capabilities of several widely-used pretrained language models (LMs). We find that state-of-the-art LMs achieve low performance on these complex analogy tasks, highlighting the challenges still posed by analogy understanding.
Rates of missing data often depend on record-keeping policies and thus may change across times and locations, even when the underlying features are comparatively stable. In this paper, we introduce the problem of Domain Adaptation under Missingness Shift (DAMS). Here, (labeled) source data and (unlabeled) target data would be exchangeable but for different missing data mechanisms. We show that when missing data indicators are available, DAMS can reduce to covariate shift. Focusing on the setting where missing data indicators are absent, we establish the following theoretical results for underreporting completely at random: (i) covariate shift is violated (adaptation is required); (ii) the optimal source predictor can perform worse on the target domain than a constant one; (iii) the optimal target predictor can be identified, even when the missingness rates themselves are not; and (iv) for linear models, a simple analytic adjustment yields consistent estimates of the optimal target parameters. In experiments on synthetic and semi-synthetic data, we demonstrate the promise of our methods when assumptions hold. Finally, we discuss a rich family of future extensions.
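The moment identity underlying analytic adjustments of this kind is easy to demonstrate: if a feature is underreported completely at random with known reporting rate $\rho$ (i.e., `x_obs = x * Bernoulli(rho)`), then `E[x_obs] = rho * E[x]` and `E[x_obs^2] = rho * E[x^2]`, so observed moments can be rescaled. This is an illustration of that identity, not the paper's estimator for the optimal target parameters.

```python
import random

def adjusted_moments(obs, rho):
    """Recover unbiased estimates of E[x] and E[x^2] from features
    underreported completely at random with known reporting rate rho,
    using E[x_obs] = rho * E[x] and E[x_obs^2] = rho * E[x^2]."""
    n = len(obs)
    m1 = sum(obs) / n / rho
    m2 = sum(v * v for v in obs) / n / rho
    return m1, m2
```

When the source and target rates differ, correcting moments on each side with its own rate is what lets a linear model fitted on one missingness regime be mapped to the other.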
It has been experimentally demonstrated that humans are able to learn in a manner that allows them to make predictions on categories for which they have not seen any examples (Malaviya et al., 2022). Sucholutsky and Schonlau (2020) have recently presented a machine learning approach that aims to do the same. They utilise synthetically generated data and demonstrate that it is possible to achieve sub-linear scaling and develop models that can learn to recognise N classes from M training samples where M is less than N - aka less-than-one shot learning. Their method was, however, defined for univariate or simple multivariate data (Sucholutsky et al., 2021). We extend it to work on large, high-dimensional and real-world datasets and empirically validate it in this new and challenging setting. We apply this method to learn previously unseen NLP tasks from very few examples (4, 8 or 16). We first generate compact, sophisticated less-than-one shot representations called soft-label prototypes which are fitted on training data, capturing the distribution of different classes across the input domain space. We then use a modified k-Nearest Neighbours classifier to demonstrate that soft-label prototypes can classify data competitively, even outperforming much more computationally complex few-shot learning methods.
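The classification step -- combining the soft label distributions of the nearest prototypes -- can be sketched as follows. This is a minimal sketch of the idea, assuming prototypes given as `(location, soft_label)` pairs and inverse-distance weighting; it is not the authors' fitting procedure, which learns the prototypes from training data.

```python
def soft_knn_classify(x, prototypes, k=2):
    """Classify x from soft-label prototypes: find the k nearest
    prototype locations and combine their soft-label distributions,
    weighted by inverse distance (a modified k-NN rule)."""
    def dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b)) ** 0.5
    nearest = sorted(prototypes, key=lambda p: dist(x, p[0]))[:k]
    num_classes = len(nearest[0][1])
    scores = [0.0] * num_classes
    for loc, soft_label in nearest:
        w = 1.0 / (dist(x, loc) + 1e-9)
        for c in range(num_classes):
            scores[c] += w * soft_label[c]
    total = sum(scores)
    return [s / total for s in scores]
```

Because each prototype carries a distribution over classes rather than a single hard label, fewer prototypes than classes can still induce a decision over every class -- the property that enables less-than-one shot learning.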
The development of new AUV technology has increased the range of tasks that AUVs can tackle and the length of their operations. As a result, AUVs are capable of handling highly complex operations. However, these missions do not fit easily into the traditional approach of defining a mission as a sequence of pre-planned waypoints, because it is impossible to know in advance everything that may occur during the mission. This leads to a gap between the operator's expectations and the actual mission performance, which in turn can produce a diminished level of trust between the operator and the AUV, resulting in unnecessary mission interruptions. To bridge this gap between robot behavior and operator expectations, this work aims to provide a framework that explains, in an easily understandable manner, the decisions and actions taken by an autonomous vehicle during a mission. Additionally, the goal is to have an explainability system that can be added as an extra layer on top of any autonomy architecture. To make the approach applicable to different autonomous systems equipped with different autonomy frameworks, this work decouples the internal workings of the autonomy from the decision points and the resulting executed actions by applying knowledge distillation. Finally, to present the explanations to the operator in a more natural way, the output of the distilled decision tree is combined with natural language explanations and reported to the operator as sentences. To this end, an additional step called Concept2Text generation is added at the end of the explanation pipeline.
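The distill-then-verbalize pipeline can be sketched in miniature: fit an interpretable surrogate to logged decisions of the opaque autonomy stack, then render the surrogate's rule as a sentence. This is a heavily simplified sketch, assuming a single sensor feature, a binary action set, and a depth-1 threshold "tree"; the function names, the log format, and the sentence template are all illustrative, not the authors' implementation.

```python
def distill_threshold_rule(logs):
    """Distill (sensor_value, action) logs from an opaque autonomy stack
    into a single threshold rule -- a depth-1 decision tree -- by picking
    the split that best separates the two observed actions."""
    actions = sorted(set(a for _, a in logs))      # assumes two actions
    best = None
    for t in sorted(v for v, _ in logs):           # candidate thresholds
        correct = sum(1 for v, a in logs
                      if a == (actions[1] if v > t else actions[0]))
        if best is None or correct > best[1]:
            best = (t, correct)
    return best[0], actions

def rule_to_sentence(threshold, actions, feature="depth"):
    """Concept2Text-style verbalisation of the distilled rule."""
    return (f"When {feature} exceeds {threshold:.1f}, the vehicle chooses "
            f"'{actions[1]}'; otherwise it chooses '{actions[0]}'.")
```

Because the surrogate is trained only on observed decision points and actions, the same two steps apply unchanged on top of any autonomy architecture, which is the layering property the abstract emphasizes.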