Multi-hop question answering (QA) requires reasoning over multiple documents to answer a complex question while providing interpretable supporting evidence. However, providing supporting evidence is not enough to demonstrate that a model has performed the required reasoning to reach the correct answer. Most existing multi-hop QA methods also fail to answer a large fraction of sub-questions, even if the corresponding parent questions are answered correctly. In this paper, we propose the Prompt-based Conservation Learning (PCL) framework for multi-hop QA, which acquires new knowledge from multi-hop QA tasks while conserving the old knowledge learned on single-hop QA tasks, thereby mitigating forgetting. Specifically, we first train a model on existing single-hop QA tasks, then freeze this model and expand it by allocating additional sub-networks for the multi-hop QA task. Moreover, to condition the pre-trained language model to stimulate the kind of reasoning required for specific multi-hop questions, we learn soft prompts for the novel sub-networks to perform type-specific reasoning. Experimental results on the HotpotQA benchmark show that PCL is competitive on multi-hop QA and retains good performance on the corresponding single-hop sub-questions, demonstrating the efficacy of PCL in mitigating the knowledge loss caused by forgetting.
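As a concrete illustration of the prompting idea, below is a minimal PyTorch sketch of type-specific soft prompts learned on top of a frozen backbone; the class name, the `reasoning_type` switch, and the assumption of a HuggingFace-style encoder exposing `get_input_embeddings()` and accepting `inputs_embeds` are all illustrative, not the authors' released code.

```python
import torch
import torch.nn as nn

class SoftPromptQA(nn.Module):
    """Sketch: frozen backbone plus learnable, reasoning-type-specific prompts."""

    def __init__(self, backbone, prompt_len=20, num_reasoning_types=4):
        super().__init__()
        self.backbone = backbone
        for p in self.backbone.parameters():  # freeze: conserve single-hop knowledge
            p.requires_grad = False
        emb_dim = backbone.get_input_embeddings().embedding_dim
        # One learnable prompt per reasoning type (e.g. bridge vs. comparison).
        self.prompts = nn.Parameter(
            torch.randn(num_reasoning_types, prompt_len, emb_dim) * 0.02)

    def forward(self, input_ids, attention_mask, reasoning_type):
        tok_emb = self.backbone.get_input_embeddings()(input_ids)
        prompt = self.prompts[reasoning_type]                    # (L_p, D)
        prompt = prompt.unsqueeze(0).expand(tok_emb.size(0), -1, -1)
        inputs = torch.cat([prompt, tok_emb], dim=1)             # prepend prompt
        pad = torch.ones(inputs.size(0), prompt.size(1),
                         dtype=attention_mask.dtype, device=attention_mask.device)
        mask = torch.cat([pad, attention_mask], dim=1)
        return self.backbone(inputs_embeds=inputs, attention_mask=mask)
```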
Automatic question quality rating (AQQR) aims to evaluate question quality through computational means, thereby addressing emerging challenges in online learner-sourced question repositories. Existing methods for AQQR rely solely on explicitly defined criteria such as readability and word count, while not fully utilising the power of state-of-the-art deep-learning techniques. We propose DeepQR, a novel neural-network model for AQQR that is trained using multiple-choice-question (MCQ) datasets collected from PeerWise, a widely used learner-sourcing platform. Along with designing DeepQR, we investigate models based on explicitly defined features, on semantic features, or on both. We also introduce a self-attention mechanism to capture semantic correlations between MCQ components, and a contrastive-learning approach to acquire question representations using quality ratings. Extensive experiments on datasets collected from eight university-level courses illustrate that DeepQR achieves superior performance over six comparative models.
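A hedged sketch of how such component-level self-attention could look in PyTorch; the module and all names are hypothetical placeholders for the idea, not the DeepQR implementation.

```python
import torch
import torch.nn as nn

class ComponentSelfAttention(nn.Module):
    """Sketch: self-attention over per-component embeddings of an MCQ
    (e.g. stem, options, explanation), pooled into one quality score."""

    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.score = nn.Linear(dim, 1)

    def forward(self, comp_emb):
        # comp_emb: (batch, n_components, dim), one vector per MCQ component.
        attended, _ = self.attn(comp_emb, comp_emb, comp_emb)
        return self.score(attended.mean(dim=1)).squeeze(-1)  # quality rating
```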
Latent semantic analysis (LSA) and correspondence analysis (CA) are two techniques that use singular value decomposition (SVD) for dimensionality reduction. LSA has been extensively used to obtain low-dimensional representations that capture relationships among documents and terms. In this paper, we present a theoretical analysis and comparison of the two techniques on the document-term matrix. We show that CA has some attractive properties as compared to LSA; for instance, CA effectively removes the margin effects caused by document length and term frequency, so that the CA solution is optimally suited to capture the relationships between documents and terms. A unified framework is proposed that includes both CA and LSA as special cases. We empirically compare CA with LSA on text classification in English and on authorship attribution for historical Dutch texts, and find that CA performs significantly better. We also apply CA to a long-standing question regarding the authorship of the Dutch national anthem, the Wilhelmus, and provide further support for attributing it to one author among several contenders.
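The contrast between the two decompositions is easy to state in code. Below is a small NumPy sketch under the usual textbook definitions: LSA factors the raw document-term matrix, while CA factors the standardized residuals, which divides out the row and column margins (document length and term frequency) first.

```python
import numpy as np

def lsa(X, k):
    """LSA: truncated SVD of the raw document-term matrix X."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U[:, :k] * s[:k]                    # k-dim document coordinates

def ca(X, k):
    """CA: SVD of the standardized residuals, removing margin effects."""
    P = X / X.sum()                            # correspondence matrix
    r, c = P.sum(axis=1), P.sum(axis=0)        # row / column margins
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, s, Vt = np.linalg.svd(S, full_matrices=False)
    return (U[:, :k] * s[:k]) / np.sqrt(r)[:, None]  # principal row coordinates

# Toy comparison on a random count matrix (add-one keeps all margins positive).
X = np.random.default_rng(0).poisson(2.0, size=(20, 50)).astype(float) + 1.0
docs_lsa, docs_ca = lsa(X, 2), ca(X, 2)
```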
Mapping with uncertainty representation is required in many research domains, such as localization and sensor fusion. Although there have been many explorations of uncertainty in the pose estimation of an ego-robot given map information, the quality of the reference maps themselves is often neglected. To avoid the potential problems caused by map errors and a lack of uncertainty quantification, an adequate uncertainty measure for maps is required. In this paper, uncertain building models that abstract the map surface using a Gaussian Process (GP) are proposed to measure map uncertainty in a probabilistic way. To reduce redundant computation for simple planar objects, facets extracted from a Gaussian Mixture Model (GMM) are combined with the implicit GP map, and local GP-block techniques are used as well. The proposed method is evaluated on LiDAR point clouds of city buildings collected by a mobile mapping system. Compared to other methods such as Octomap, Gaussian Process Occupancy Map (GPOM) and Bayesian Generalized Kernel Inference (BGKOctomap), our method achieves higher Precision-Recall AUC for the evaluated buildings.
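As a rough illustration of GP-based surface uncertainty (not the paper's pipeline), the scikit-learn sketch below fits a GP to noisy depth offsets of points on one synthetic facade block and reads the predictive standard deviation as a per-location map-uncertainty measure.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Toy stand-in for a building facade: model its depth offset as a GP
# over (horizontal, vertical) coordinates of LiDAR hits on one local block.
rng = np.random.default_rng(0)
uv = rng.uniform(0, 5, size=(200, 2))                         # facade-plane coords
depth = 0.05 * np.sin(uv[:, 0]) + rng.normal(0, 0.02, 200)    # noisy offsets

gp = GaussianProcessRegressor(kernel=RBF(1.0) + WhiteKernel(1e-3),
                              normalize_y=True).fit(uv, depth)

# The predictive standard deviation serves as a per-point uncertainty measure.
query = np.array([[2.5, 2.5], [4.9, 0.1]])
mean, std = gp.predict(query, return_std=True)
```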
How can we solve the data scarcity problem for end-to-end speech-to-text translation (ST)? It is well known that data augmentation is an efficient method to improve performance for many tasks by enlarging the dataset. In this paper, we propose the Mix at three levels for Speech Translation (M^3ST) method to increase the diversity of the augmented training corpus. Specifically, we conduct two phases of fine-tuning based on a pre-trained model, using external machine translation (MT) data. In the first stage of fine-tuning, we mix the training corpus at three levels, including the word level, sentence level and frame level, and fine-tune the entire model on the mixed data. In the second stage of fine-tuning, we feed both the original speech sequences and the original text sequences in parallel into the model to fine-tune the network, and use the Jensen-Shannon divergence to regularize their outputs. Experiments on the MuST-C speech translation benchmark and further analysis show that M^3ST outperforms current strong baselines and achieves state-of-the-art results across eight directions with an average BLEU of 29.9.
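For the second-stage regularizer, a plausible PyTorch sketch of the Jensen-Shannon term between the speech-branch and text-branch output distributions is shown below; the function name and smoothing constant are assumptions, not the authors' exact implementation.

```python
import torch
import torch.nn.functional as F

def js_regularizer(speech_logits, text_logits, eps=1e-8):
    """Sketch: JS divergence between decoder distributions obtained from
    the speech input and from the parallel text input."""
    p = F.softmax(speech_logits, dim=-1)
    q = F.softmax(text_logits, dim=-1)
    m = 0.5 * (p + q)
    kl_pm = (p * (p.clamp_min(eps).log() - m.clamp_min(eps).log())).sum(-1)
    kl_qm = (q * (q.clamp_min(eps).log() - m.clamp_min(eps).log())).sum(-1)
    return (0.5 * kl_pm + 0.5 * kl_qm).mean()
```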
Positive-Unlabeled (PU) learning aims to learn binary classifiers from a few labeled positive examples together with many unlabeled ones. Compared with ordinary semi-supervised learning, this task is much more challenging due to the absence of any known negative labels. While existing cost-sensitive methods have achieved state-of-the-art performance, they explicitly minimize the risk of classifying unlabeled data as negative samples, which might result in a negative-prediction preference of the classifier. To alleviate this issue, we resort to a label-distribution perspective for PU learning in this paper. Noticing that the label distribution of unlabeled data is fixed when the class prior is known, it can naturally be used as learning supervision for the model. Motivated by this, we propose to pursue label distribution consistency between the predicted and ground-truth label distributions, which is formulated by aligning their expectations. Moreover, we adopt entropy minimization and Mixup regularization to avoid the trivial solution of label distribution consistency on unlabeled data and to mitigate the consequent confirmation bias. Experiments on three benchmark datasets validate the effectiveness of the proposed method. Code is available at: https://github.com/Ray-rui/Dist-PU-Positive-Unlabeled-Learning-from-a-Label-Distribution-Perspective.
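A minimal PyTorch sketch of the expectation-alignment idea, assuming a binary sigmoid head and a known class prior; the released code at the link above is the authoritative implementation, and this is only an illustration.

```python
import torch
import torch.nn.functional as F

def dist_consistency_loss(unlabeled_logits, class_prior, ent_weight=0.1):
    """Sketch: align the expected positive proportion on unlabeled data
    with the known class prior; entropy minimization keeps predictions
    confident and helps avoid the trivial solution."""
    p = torch.sigmoid(unlabeled_logits)        # P(y=1|x) for unlabeled x
    align = F.l1_loss(p.mean(), p.new_tensor(class_prior))
    entropy = -(p * p.clamp_min(1e-8).log()
                + (1 - p) * (1 - p).clamp_min(1e-8).log()).mean()
    return align + ent_weight * entropy
```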
We address the problem of synthesizing novel views from a monocular video depicting a complex dynamic scene. State-of-the-art methods based on temporally varying Neural Radiance Fields (aka dynamic NeRFs) have shown impressive results on this task. However, for long videos with complex object motions and uncontrolled camera trajectories, these methods can produce blurry or inaccurate renderings, hampering their use in real-world applications. Instead of encoding the entire dynamic scene within the weights of an MLP, we present a new approach that addresses these limitations by adopting a volumetric image-based rendering framework that synthesizes new viewpoints by aggregating features from nearby views in a scene-motion-aware manner. Our system retains the advantages of prior methods in its ability to model complex scenes and view-dependent effects, but also enables synthesizing photo-realistic novel views from long videos featuring complex scene dynamics with unconstrained camera trajectories. We demonstrate significant improvements over state-of-the-art methods on dynamic scene datasets, and also apply our approach to in-the-wild videos with challenging camera and object motion, where prior methods fail to produce high-quality renderings. Our project webpage is at dynibar.github.io.
Traditional machine learning follows a close-set assumption that the training and test sets share the same label space. However, in many practical scenarios, it is inevitable that some test samples belong to unknown classes (open-set). To fix this issue, Open-Set Recognition (OSR), whose goal is to make correct predictions on both close-set samples and open-set samples, has attracted rising attention. In this direction, the vast majority of literature focuses on the pattern of open-set samples. However, how to evaluate model performance in this challenging task is still unsolved. In this paper, a systematic analysis reveals that most existing metrics are essentially inconsistent with the aforementioned goal of OSR: (1) For metrics extended from close-set classification, such as Open-set F-score, Youden's index, and Normalized Accuracy, a poor open-set prediction can escape from a low performance score with a superior close-set prediction. (2) Novelty detection AUC, which measures the ranking performance between close-set and open-set samples, ignores the close-set performance. To fix these issues, we propose a novel metric named OpenAUC. Compared with existing metrics, OpenAUC enjoys a concise pairwise formulation that evaluates open-set performance and close-set performance in a coupling manner. Further analysis shows that OpenAUC is free from the aforementioned inconsistency properties. Finally, an end-to-end learning method is proposed to minimize the OpenAUC risk, and the experimental results on popular benchmark datasets speak to its effectiveness.
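The pairwise formulation admits a one-line empirical estimate. The NumPy sketch below follows the description above (a pair counts only if the close-set sample is both correctly classified and ranked as more "known" than the open-set sample); the function and argument names are ours, and higher scores are assumed to indicate open-set.

```python
import numpy as np

def open_auc(close_pred, close_true, close_open_score, open_score):
    """Sketch of empirical OpenAUC: a (close-set, open-set) pair scores 1
    only if the close-set sample is classified correctly AND its open-set
    score is lower than the open-set sample's."""
    correct = (close_pred == close_true).astype(float)          # (Nc,)
    ranked = (close_open_score[:, None] < open_score[None, :])  # (Nc, No)
    return (correct[:, None] * ranked).mean()
```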
Stochastic optimization of the Area Under the Precision-Recall Curve (AUPRC) is a crucial problem for machine learning. Although various algorithms have been extensively studied for AUPRC optimization, generalization is only guaranteed in the multi-query case. In this work, we present the first trial of single-query generalization for stochastic AUPRC optimization. To obtain sharper generalization bounds, we focus on algorithm-dependent generalization. There are both algorithmic and theoretical obstacles on the way to this destination. From an algorithmic standpoint, we notice that most existing stochastic estimators are biased only when the sampling strategy is biased, and are unstable due to non-decomposability. To address these issues, we propose a sampling-rate-invariant unbiased stochastic estimator with superior stability. On top of that, AUPRC optimization is formulated as a composition optimization problem, and a stochastic algorithm is proposed to solve this problem. From a theoretical standpoint, standard techniques of algorithm-dependent generalization analysis cannot be directly applied to such a listwise compositional optimization problem. To fill this gap, we extend model stability from instance-wise losses to listwise losses and bridge the corresponding generalization and stability. In addition, we construct state transition matrices to describe the recurrence of the stability, and simplify the calculations via the matrix spectrum. Practically, experimental results on three image retrieval datasets speak to the effectiveness and soundness of our framework.
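To convey the compositional structure (soft ranks nested inside a ratio), here is a generic differentiable average-precision surrogate in PyTorch. It is a textbook sigmoid relaxation for illustration only; it is neither the paper's sampling-rate-invariant estimator nor its stochastic algorithm.

```python
import torch

def soft_average_precision_loss(scores, labels, tau=0.1):
    """Sketch: relax the 0/1 indicators in
    AP = (1/|P|) * sum_{i in P} rank+(i) / rank(i) with sigmoids."""
    pos = labels.bool()
    diff = (scores[None, :] - scores[:, None]) / tau     # (s_j - s_i) / tau
    soft_gt = torch.sigmoid(diff)                        # ~ 1[s_j > s_i]
    eye = torch.eye(scores.numel(), dtype=torch.bool, device=scores.device)
    soft_gt = soft_gt.masked_fill(eye, 0.0)              # drop self-comparisons
    rank_all = 1.0 + soft_gt.sum(dim=1)                  # soft rank among all
    rank_pos = 1.0 + soft_gt[:, pos].sum(dim=1)          # soft rank among positives
    return 1.0 - (rank_pos[pos] / rank_all[pos]).mean()  # loss to minimize
```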
In recent years, great progress has been made in overcoming the inefficiency of supervised learning by incorporating unlabeled data through semi-supervised learning (SSL). Most state-of-the-art models are based on the idea of pursuing consistent model predictions over unlabeled data that has been perturbed with input noise, which is called consistency regularization. Nonetheless, there is a lack of theoretical insight into the reasons for its success. To bridge the gap between theory and practical results, we propose a worst-case consistency regularization technique for SSL in this paper. Specifically, we first present a generalization bound for SSL consisting of empirical loss terms observed on the labeled and unlabeled training data, respectively. Motivated by this bound, we derive an SSL objective that minimizes the largest inconsistency between an original unlabeled sample and its multiple augmented variants. We then provide a simple but effective algorithm to solve the proposed minimax problem, and theoretically prove that it converges to a stationary point. Experiments on five popular benchmark datasets validate the effectiveness of our proposed method.
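A compact PyTorch sketch of the resulting objective, under the assumption of a user-supplied stochastic `augment` function and KL divergence as the inconsistency measure (the paper's exact divergence may differ):

```python
import torch
import torch.nn.functional as F

def worst_case_consistency(model, x, augment, k=4):
    """Sketch of the min-max objective: penalize the LARGEST divergence
    between the prediction on an unlabeled sample and the predictions on
    k of its augmented variants."""
    with torch.no_grad():
        p = F.softmax(model(x), dim=-1)                 # reference prediction
    divs = []
    for _ in range(k):
        log_q = F.log_softmax(model(augment(x)), dim=-1)
        divs.append(F.kl_div(log_q, p, reduction='none').sum(-1))  # per-sample KL
    return torch.stack(divs, dim=0).max(dim=0).values.mean()       # worst case
```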