Performance of classifiers is often measured in terms of average accuracy on test data. Despite being a standard measure, average accuracy fails to characterize how well the model fits the underlying conditional law of labels given the feature vector ($Y|X$), e.g. due to model misspecification, overfitting, and high dimensionality. In this paper, we consider the fundamental problem of assessing the goodness-of-fit of a general binary classifier. Our framework makes no parametric assumption on the conditional law $Y|X$ and treats it as a black-box oracle model that can only be accessed through queries. We formulate the goodness-of-fit assessment problem as a hypothesis test of the form \[ H_0: \mathbb{E}\big[D_f\big({\sf Bern}(\eta(X)) \,\|\, {\sf Bern}(\hat{\eta}(X))\big)\big] \leq \tau\,, \] where $D_f$ represents an $f$-divergence function, and $\eta(x)$, $\hat{\eta}(x)$ respectively denote the true and estimated likelihood of a positive label for a feature vector $x$. We propose a novel test, called \grasp, for testing $H_0$, which works in finite-sample settings no matter the distribution of the features (distribution-free). We also propose model-X \grasp, designed for model-X settings where the joint distribution of the feature vector is known. Model-X \grasp uses this distributional information to achieve better power. We evaluate the performance of our tests through extensive numerical experiments.
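To make the quantity under test concrete, the minimal sketch below estimates $\mathbb{E}\big[D_f\big({\sf Bern}(\eta(X)) \,\|\, {\sf Bern}(\hat{\eta}(X))\big)\big]$ by Monte Carlo in a toy simulation where the true $\eta$ is assumed known and the $f$-divergence is taken to be KL; it only illustrates the null hypothesis, not the \grasp test itself, which accesses $\eta$ solely through oracle queries.

```python
import numpy as np

def bern_kl(p, q, eps=1e-12):
    """KL divergence D_KL(Bern(p) || Bern(q)) -- one choice of f-divergence."""
    p = np.clip(p, eps, 1 - eps)
    q = np.clip(q, eps, 1 - eps)
    return p * np.log(p / q) + (1 - p) * np.log((1 - p) / (1 - q))

def mean_divergence(eta_true, eta_hat):
    """Monte Carlo estimate of E[D_f(Bern(eta(X)) || Bern(eta_hat(X)))]."""
    return float(np.mean(bern_kl(eta_true, eta_hat)))

# Toy simulation: a miscalibrated classifier inflates the divergence.
rng = np.random.default_rng(0)
x = rng.normal(size=5000)
eta_true = 1 / (1 + np.exp(-x))          # true P(Y=1 | X=x)
eta_hat = 1 / (1 + np.exp(-1.5 * x))     # overconfident estimate
tau = 0.01
print(mean_divergence(eta_true, eta_hat) <= tau)  # does H_0 plausibly hold?
```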
Successful deep learning models often involve training neural network architectures that contain more parameters than the number of training samples. Such overparameterized models have been studied extensively in recent years, and the virtues of overparameterization have been established from both statistical and computational perspectives, through the double-descent phenomenon and through structural properties of the optimization landscape. Despite the remarkable success of deep learning architectures in the overparameterized regime, it is also well known that these models are highly vulnerable to small adversarial perturbations of their inputs. Even when adversarially trained, their performance on perturbed inputs (robust generalization) is considerably worse than the best attainable performance on benign inputs (standard generalization). It is therefore imperative to understand how overparameterization fundamentally affects robustness. In this paper, we provide a precise characterization of the role of overparameterization in robustness by focusing on random features regression models (two-layer neural networks with random first-layer weights). We consider a regime where the sample size, the input dimension, and the number of parameters grow in proportion to each other, and we derive an asymptotically exact formula for the robust generalization error when the model is adversarially trained. Our developed theory reveals a nontrivial effect of overparameterization on robustness, showing that for adversarially trained random features models, high overparameterization can hurt robust generalization.
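For readers unfamiliar with the model class, the sketch below fits a random features regression model: a two-layer network whose first-layer weights are drawn at random and frozen, with only the output layer learned. The ReLU feature map, the ridge-regression fit, and the specific dimensions are illustrative assumptions, and the sketch shows standard rather than adversarial training.

```python
import numpy as np

def random_features_fit(X, y, width, lam=1e-3, rng=None):
    """First-layer weights are random and frozen; only the output weights
    are learned (here by ridge regression, an assumed choice)."""
    rng = np.random.default_rng(rng)
    d = X.shape[1]
    W = rng.normal(scale=1 / np.sqrt(d), size=(d, width))   # frozen first layer
    Phi = np.maximum(X @ W, 0.0)                             # ReLU random features
    a = np.linalg.solve(Phi.T @ Phi + lam * np.eye(width), Phi.T @ y)
    return W, a

def random_features_predict(X, W, a):
    return np.maximum(X @ W, 0.0) @ a

# Overparameterized regime: width exceeds both sample size and input dimension.
rng = np.random.default_rng(1)
n, d, width = 200, 50, 1000
X = rng.normal(size=(n, d))
y = np.sign(X[:, 0] + 0.1 * rng.normal(size=n))
W, a = random_features_fit(X, y, width, rng=2)
print(np.mean(np.sign(random_features_predict(X, W, a)) == y))  # training accuracy
```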
In this paper, we study the problem of learning a shallow artificial neural network that best fits a training dataset. We study this problem in the overparameterized regime, where the number of observations is fewer than the number of parameters in the model. We show that with quadratic activations, the optimization landscape of training such shallow neural networks has certain favorable characteristics that allow globally optimal models to be found efficiently using a variety of local search heuristics. This result holds for arbitrary training data of input/output pairs. For differentiable activation functions, we also show that appropriately initialized gradient descent converges at a linear rate to a globally optimal model. This result focuses on a realizable model in which the inputs are drawn from a Gaussian distribution and the labels are generated according to planted weight coefficients.
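To make the planted setting concrete, here is a minimal sketch of gradient descent on a two-layer network with quadratic activations, Gaussian inputs, and labels generated by planted weights. The unit output layer, the near-planted initialization, and all hyperparameters are assumptions for illustration rather than the paper's exact setup.

```python
import numpy as np

# Planted model: Gaussian inputs, labels generated by ground-truth weights W_star,
# quadratic activation, output weights fixed to one (an assumed simplification).
rng = np.random.default_rng(0)
n, d, k = 80, 20, 8                       # n observations < d * k parameters
X = rng.normal(size=(n, d))
W_star = rng.normal(size=(k, d))
y = ((X @ W_star.T) ** 2).sum(axis=1)     # y_i = sum_j (w*_j . x_i)^2

W = W_star + 0.1 * rng.normal(size=(k, d))   # "appropriate" init: near the planted weights (assumed)
lr = 1e-3
for t in range(2000):
    pred = ((X @ W.T) ** 2).sum(axis=1)
    resid = pred - y
    # Gradient of 0.5 * mean(resid^2) with respect to W.
    grad = 2 * (resid[:, None] * (X @ W.T)).T @ X / n
    W -= lr * grad
    if t % 500 == 0:
        print(t, 0.5 * np.mean(resid ** 2))   # loss should decrease steadily
```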
Transformer variants dominate the state of the art in natural language processing tasks such as translation, reading comprehension and summarization. Our paper focuses on adding general memory slots to the inputs and studying the effect of these slots. It is a follow-up study of the general memory slot mechanism added to the input of the proposed model in previous work. We consider two main tasks: 1) a pretraining task using masked language modeling and 2) a fine-tuning task using HotpotQA. This study aims to verify the ability of the proposed model to handle separate chunks as if they were one chunk, compared with the base model. As a baseline we use the T5 transformer. We study the role of memory slots augmented to each input chunk and evaluate the model's performance without a selector. We find that adding memory to input chunks helps the proposed model outperform the baseline on the masked language modeling task with specific training parameters. An ablation study shows that compressed input chunks can be used, at the cost of some degradation in performance.
Modern deep neural networks tend to be evaluated on static test sets. One shortcoming of this is the fact that these deep neural networks cannot be easily evaluated for robustness issues with respect to specific scene variations. For example, it is hard to study the robustness of these networks to variations of object scale, object pose, scene lighting and 3D occlusions. The main reason is that collecting real datasets with fine-grained naturalistic variations of sufficient scale can be extremely time-consuming and expensive. In this work, we present Counterfactual Simulation Testing, a counterfactual framework that allows us to study the robustness of neural networks with respect to some of these naturalistic variations by building realistic synthetic scenes that allow us to ask counterfactual questions to the models, ultimately providing answers to questions such as "Would your classification still be correct if the object were viewed from the top?" or "Would your classification still be correct if the object were partially occluded by another object?". Our method allows for a fair comparison of the robustness of recently released, state-of-the-art Convolutional Neural Networks and Vision Transformers, with respect to these naturalistic variations. We find evidence that ConvNext is more robust to pose and scale variations than Swin, that ConvNext generalizes better to our simulated domain and that Swin handles partial occlusion better than ConvNext. We also find that robustness for all networks improves with network scale and with data scale and variety. We release the Naturalistic Variation Object Dataset (NVD), a large simulated dataset of 272k images of everyday objects with naturalistic variations such as object pose, scale, viewpoint, lighting and occlusions. Project page: https://counterfactualsimulation.github.io
Continual Learning is a step towards lifelong intelligence where models continuously learn from recently collected data without forgetting previous knowledge. Existing continual learning approaches mostly focus on image classification in the class-incremental setup with clear task boundaries and unlimited computational budget. This work explores Online Domain-Incremental Continual Segmentation~(ODICS), a real-world problem that arises in many applications, \eg, autonomous driving. In ODICS, the model is continually presented with batches of densely labeled images from different domains; computation is limited and no information about the task boundaries is available. In autonomous driving, this may correspond to the realistic scenario of training a segmentation model over time on a sequence of cities. We analyze several existing continual learning methods and show that they do not perform well in this setting despite working well in class-incremental segmentation. We propose SimCS, a parameter-free method complementary to existing ones that leverages simulated data as a continual learning regularizer. Extensive experiments show consistent improvements over different types of continual learning methods that use regularizers and even replay.
Deep learning models for vision tasks are trained on large datasets under the assumption that there exists a universal representation that can be used to make predictions for all samples. Whereas high-complexity models have proven capable of learning such representations, a mixture of experts, each trained on a specific subset of the data, can infer the labels more efficiently. However, using a mixture of experts raises two new problems, namely (i) assigning the correct expert when a new unseen sample is presented, and (ii) finding the optimal partitioning of the training data such that the experts rely the least on common features. In Dynamic Routing (DR), a novel architecture is proposed where each layer is composed of a set of experts; however, without addressing these two challenges, we show that the model reverts to using the same subset of experts. In our method, Diversified Dynamic Routing (DivDR), the model is explicitly trained to solve the challenge of finding a relevant partitioning of the data and assigning the correct experts in an unsupervised manner. We conduct several experiments on semantic segmentation on Cityscapes and object detection and instance segmentation on MS-COCO, showing improved performance over several baselines.
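The building block referred to here can be sketched as a layer of experts combined by a learned gate. The dense (soft) gating, the dimensions, and the initialization below are assumptions for illustration; neither the training procedure nor DivDR's diversity objective is shown.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

class MoELayer:
    """One layer built from several experts plus a gate that weights each
    expert per sample -- the kind of block a dynamic-routing architecture
    stacks at every layer. Forward pass only."""
    def __init__(self, d_in, d_out, n_experts, rng=None):
        rng = np.random.default_rng(rng)
        self.W_gate = rng.normal(scale=0.1, size=(d_in, n_experts))
        self.W_experts = rng.normal(scale=0.1, size=(n_experts, d_in, d_out))

    def forward(self, x):
        gate = softmax(x @ self.W_gate)                       # (batch, n_experts)
        expert_out = np.einsum('bi,eio->beo', x, self.W_experts)
        return np.einsum('be,beo->bo', gate, expert_out)      # gate-weighted mixture

layer = MoELayer(d_in=16, d_out=8, n_experts=4, rng=0)
x = np.random.default_rng(1).normal(size=(32, 16))
print(layer.forward(x).shape)   # (32, 8)
```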
We propose a new sensitivity analysis model that combines copulas and normalizing flows for causal inference under unobserved confounding. We refer to the new model as $\rho$-GNF ($\rho$-graphical normalizing flow), where $\rho \in [-1,+1]$ is a bounded sensitivity parameter representing the backdoor non-causal association due to unobserved confounding, modeled using the well-studied and widely popular Gaussian copula. Specifically, $\rho$-GNF enables us to estimate and analyse the frontdoor causal effect, or average causal effect (ACE), as a function of $\rho$. We call this the $\rho_{curve}$. The $\rho_{curve}$ enables us to specify the confounding strength required to nullify the ACE. We call this the $\rho_{value}$. Further, the $\rho_{curve}$ also enables us to provide bounds for the ACE given an interval of $\rho$ values. We illustrate the benefits of $\rho$-GNF with experiments showing that our empirical ACE bounds are narrower than other popular ACE bounds.
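How such a fitted model would be used downstream can be sketched as follows: scan $\rho$ over $[-1,+1]$ to trace the $\rho_{curve}$, read off the $\rho_{value}$ where the ACE is nullified, and bound the ACE over an assumed interval of plausible $\rho$ values. The `fake_ace` function below is a purely illustrative stand-in for a fitted $\rho$-GNF.

```python
import numpy as np

def rho_curve(ace_of_rho, n_grid=201):
    """Evaluate the ACE over the sensitivity range rho in [-1, +1]."""
    rhos = np.linspace(-1.0, 1.0, n_grid)
    return rhos, np.array([ace_of_rho(r) for r in rhos])

def rho_value(rhos, aces):
    """Smallest confounding strength |rho| at which the ACE crosses zero,
    or None if it never does on the grid."""
    cross = np.where(np.diff(np.sign(aces)) != 0)[0]
    if cross.size == 0:
        return None
    return min(rhos[cross], key=abs)

def ace_bounds(rhos, aces, rho_lo, rho_hi):
    """Bounds on the ACE for an assumed interval of plausible rho values."""
    mask = (rhos >= rho_lo) & (rhos <= rho_hi)
    return aces[mask].min(), aces[mask].max()

# Placeholder curve standing in for a fitted rho-GNF (purely illustrative).
fake_ace = lambda rho: 0.3 - 0.8 * rho
rhos, aces = rho_curve(fake_ace)
print(rho_value(rhos, aces))             # confounding strength needed to nullify the ACE
print(ace_bounds(rhos, aces, -0.2, 0.2)) # ACE bounds on an assumed rho interval
```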
Crop diseases significantly affect the quantity and quality of agricultural production. In precision agriculture, where the goal is to minimize or even avoid the use of pesticides, weather and remote sensing data combined with deep learning can play a pivotal role in detecting crop diseases, allowing localized treatment of crops. However, combining heterogeneous data such as weather and images remains a hot topic and a challenging task. Recent developments in transformer architectures have shown the possibility of fusing data from different domains (e.g. text and images). The current trend is to customize a single transformer to create a multimodal fusion model. In contrast, we propose a new approach that achieves data fusion using three transformers. In this paper, we first address the missing satellite image problem by interpolating with a ConvLSTM model. Then, we propose a multimodal fusion architecture that jointly learns to process visual and weather information. The architecture is built from three main components, a vision transformer and two transformer encoders, which fuse the image and weather modalities. The results of the proposed method are promising, achieving an overall accuracy of 97\%.
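A rough PyTorch sketch of such a two-branch design is given below: one transformer encodes image patches, a second encodes the weather sequence, and a third fuses the concatenated token streams. All dimensions, depths, and input shapes are illustrative assumptions rather than the paper's configuration.

```python
import torch
import torch.nn as nn

class ThreeTransformerFusion(nn.Module):
    """Sketch: a vision transformer branch for image patches, a transformer
    encoder for the weather sequence, and a fusion transformer over the
    concatenated tokens. Hyperparameters are illustrative only."""
    def __init__(self, d_model=128, n_classes=4):
        super().__init__()
        layer = lambda: nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.vision_encoder = nn.TransformerEncoder(layer(), num_layers=2)
        self.weather_encoder = nn.TransformerEncoder(layer(), num_layers=2)
        self.fusion_encoder = nn.TransformerEncoder(layer(), num_layers=2)
        self.patch_proj = nn.Linear(16 * 16 * 3, d_model)    # flattened RGB patches (assumed)
        self.weather_proj = nn.Linear(5, d_model)            # e.g. 5 weather variables per day (assumed)
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, patches, weather):
        img_tokens = self.vision_encoder(self.patch_proj(patches))
        met_tokens = self.weather_encoder(self.weather_proj(weather))
        fused = self.fusion_encoder(torch.cat([img_tokens, met_tokens], dim=1))
        return self.head(fused.mean(dim=1))

model = ThreeTransformerFusion()
patches = torch.randn(2, 196, 16 * 16 * 3)   # batch of 2 images, 14x14 patches
weather = torch.randn(2, 30, 5)              # 30 days of weather features
print(model(patches, weather).shape)         # torch.Size([2, 4])
```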
Monitoring seed maturity is an increasing challenge in agriculture due to climate change and more restrictive practices. Seed monitoring in the field is essential to optimize the agricultural process and to guarantee yield quality through high germination rates. Traditional methods are based on limited sampling in the field and laboratory analysis. Moreover, they are time-consuming and only allow monitoring of a sub-section of the crop field. This leads to a lack of accuracy about the overall crop condition due to intra-field heterogeneity. Multispectral imagery from UAVs allows uniform scanning of fields and better captures crop maturity information. On the other hand, deep learning methods have shown great potential for estimating agronomic parameters, especially maturity. However, they require large labeled datasets. Although large sets of aerial images are available, labeling them with ground truth is a tedious, if not impossible, task. In this paper, we propose a method for estimating parsley seed maturity from multispectral UAV imagery, with a new automatic data labeling approach. This approach is based on parametric and non-parametric models to provide weak labels. We also consider the data acquisition protocol and the performance evaluation of the different steps of the method. Results show good performance, and the non-parametric kernel density estimator model improves the generalization of neural networks when used as a labeling method, leading to more robust and better-performing deep neural models.
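One plausible reading of the non-parametric weak-labeling step is sketched below: fit a kernel density estimator per maturity class on a scalar feature (e.g. a vegetation index) from the labeled samples, then weakly label new samples by the class with the highest density. The 1-D feature and the toy data are assumptions for illustration, not the paper's pipeline.

```python
import numpy as np
from scipy.stats import gaussian_kde

def fit_class_kdes(values, labels):
    """Fit one kernel density estimator per maturity class on a 1-D feature
    (e.g. a vegetation index computed from the multispectral bands)."""
    return {c: gaussian_kde(values[labels == c]) for c in np.unique(labels)}

def weak_label(kdes, values):
    """Weak labels: assign each unlabeled sample to the class whose KDE
    gives it the highest density."""
    classes = sorted(kdes)
    dens = np.vstack([kdes[c](values) for c in classes])
    return np.array(classes)[dens.argmax(axis=0)]

# Toy example with two maturity stages; real features would come from UAV imagery.
rng = np.random.default_rng(0)
labeled_vals = np.concatenate([rng.normal(0.3, 0.05, 50), rng.normal(0.6, 0.05, 50)])
labeled_cls = np.array([0] * 50 + [1] * 50)
kdes = fit_class_kdes(labeled_vals, labeled_cls)
print(weak_label(kdes, np.array([0.28, 0.55, 0.62])))
```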