This article studies the unsupervised clustering of large partially observed graphs. We propose a scalable and provable randomized framework for clustering graphs generated from the stochastic block model. The clustering is first applied to a sub-matrix of the graph's adjacency matrix associated with a reduced graph sketch constructed using random sampling. The clusters of the full graph are then inferred from the clusters extracted from the sketch using a correlation-based retrieval step. Uniform random node sampling is shown to reduce the computational complexity relative to clustering the full graph when the cluster sizes are balanced. A new random degree-based node sampling algorithm is presented that significantly improves the performance of the clustering algorithm even when clusters are unbalanced. This framework improves the phase transitions for matrix-decomposition-based clustering with respect to computational complexity and minimum cluster size, which are shown to be nearly dimension-free in the low inter-cluster connectivity regime. A third sampling technique is shown to improve balance by randomly sampling nodes based on spatial distribution. We provide analysis and numerical results using a convex clustering algorithm based on matrix completion.
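The sample-then-retrieve pipeline described in this abstract can be sketched end to end on a toy stochastic block model. The snippet below is a minimal illustration under assumed parameters (two balanced clusters, a 60-node uniform sketch, and a spectral clustering stand-in for the paper's convex, matrix-completion-based step), not the paper's actual algorithm:

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Toy stochastic block model: 2 balanced clusters, 200 nodes ---
n, k = 200, 2
labels = np.repeat(np.arange(k), n // k)
P_in, P_out = 0.6, 0.05
prob = np.where(labels[:, None] == labels[None, :], P_in, P_out)
A = (rng.random((n, n)) < prob).astype(float)
A = np.triu(A, 1); A = A + A.T                      # symmetric, no self-loops

# --- Step 1: uniform random node sampling builds the sketch ---
m = 60                                              # sketch size (assumed)
S = rng.choice(n, size=m, replace=False)
A_sketch = A[np.ix_(S, S)]

# --- Step 2: cluster the sketch (spectral stand-in) ---
vals, vecs = np.linalg.eigh(A_sketch)
v = vecs[:, -2]                                     # second-leading eigenvector
sketch_labels = (v > np.median(v)).astype(int)

# --- Step 3: correlation-based retrieval for the full graph ---
# Each node joins the cluster whose sampled members it connects to most,
# a simple proxy for correlating edge patterns with the sketch clusters.
centroids = np.stack([A[:, S][:, sketch_labels == c].mean(axis=1)
                      for c in range(k)], axis=1)   # (n, k)
full_labels = centroids.argmax(axis=1)
```

Only the m-by-m sketch is ever clustered; the retrieval step touches each remaining node once, which is the source of the complexity savings the abstract claims for balanced clusters.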
Transformer variants dominate the state of the art in natural language processing tasks such as translation, reading comprehension, and summarization. This paper is a follow-up study of the general memory slots that were added to the input of the model proposed in previous work; we focus on augmenting inputs with these memory slots and studying the effect of adding them. We consider two main tasks: 1) a pretraining task using masked language modeling, and 2) a fine-tuning task using HotpotQA. The study aims to verify the ability of the proposed model to handle multiple chunks as if they were a single chunk, compared with the base model. As a baseline we use the T5 transformer. We study the role of the memory slots augmented to each input chunk, and we study the model's performance without the selector. We find that adding memory to input chunks helps the proposed model outperform the baseline on the masked language modeling task under specific training parameters. An ablation study shows that compressed input chunks can be used, at the cost of some degradation in performance.
Modern deep neural networks tend to be evaluated on static test sets. One shortcoming of this is that these networks cannot easily be evaluated for robustness with respect to specific scene variations. For example, it is hard to study their robustness to variations of object scale, object pose, scene lighting, and 3D occlusions. The main reason is that collecting real datasets with fine-grained naturalistic variations at sufficient scale can be extremely time-consuming and expensive. In this work, we present Counterfactual Simulation Testing, a counterfactual framework for studying the robustness of neural networks with respect to some of these naturalistic variations. It builds realistic synthetic scenes that allow us to pose counterfactual questions to the models, ultimately providing answers to questions such as "Would your classification still be correct if the object were viewed from the top?" or "Would your classification still be correct if the object were partially occluded by another object?". Our method allows for a fair comparison of the robustness of recently released, state-of-the-art Convolutional Neural Networks and Vision Transformers with respect to these naturalistic variations. We find evidence that ConvNext is more robust to pose and scale variations than Swin, that ConvNext generalizes better to our simulated domain, and that Swin handles partial occlusion better than ConvNext. We also find that robustness for all networks improves with network scale and with data scale and variety. We release the Naturalistic Variation Object Dataset (NVD), a large simulated dataset of 272k images of everyday objects with naturalistic variations such as object pose, scale, viewpoint, lighting, and occlusions. Project page: https://counterfactualsimulation.github.io
Continual learning is a step towards lifelong intelligence, where models continuously learn from recently collected data without forgetting previous knowledge. Existing continual learning approaches mostly focus on image classification in the class-incremental setup with clear task boundaries and an unlimited computational budget. This work explores Online Domain-Incremental Continual Segmentation (ODICS), a real-world problem that arises in many applications, e.g., autonomous driving. In ODICS, the model is continually presented with batches of densely labeled images from different domains; computation is limited, and no information about the task boundaries is available. In autonomous driving, this may correspond to the realistic scenario of training a segmentation model over time on a sequence of cities. We analyze several existing continual learning methods and show that they do not perform well in this setting despite working well in class-incremental segmentation. We propose SimCS, a parameter-free method complementary to existing ones that leverages simulated data as a continual learning regularizer. Extensive experiments show consistent improvements over different types of continual learning methods that use regularizers and even replay.
Deep learning models for vision tasks are trained on large datasets under the assumption that there exists a universal representation that can be used to make predictions for all samples. Whereas high-complexity models have proven capable of learning such representations, a mixture of experts trained on specific subsets of the data can infer the labels more efficiently. However, using a mixture of experts poses two new problems, namely (i) assigning the correct expert when a new unseen sample is presented, and (ii) finding the optimal partitioning of the training data such that the experts rely the least on common features. In Dynamic Routing (DR), a novel architecture is proposed where each layer is composed of a set of experts; however, without addressing these two challenges, we demonstrate that the model can collapse to using the same subset of experts. In our method, Diversified Dynamic Routing (DivDR), the model is explicitly trained to solve the challenge of finding a data-dependent partitioning and assigning the correct experts in an unsupervised approach. We conduct several experiments on semantic segmentation on Cityscapes and object detection and instance segmentation on MS-COCO, showing improved performance over several baselines.
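The routing idea underlying this abstract can be sketched as a single gated mixture-of-experts layer. The code below is a generic illustration with random weights and hypothetical sizes, not the DR or DivDR architecture; the final line shows the per-expert load one would monitor to detect the expert-collapse problem the abstract describes:

```python
import numpy as np

rng = np.random.default_rng(1)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# One routing layer: a learned gate assigns each sample a mixture over experts.
d, n_experts, batch = 8, 4, 5
W_gate  = rng.normal(size=(d, n_experts)) * 0.1
experts = [rng.normal(size=(d, d)) * 0.1 for _ in range(n_experts)]

x = rng.normal(size=(batch, d))
gate = softmax(x @ W_gate)                        # (batch, n_experts) routing weights
y = sum(gate[:, e:e + 1] * (x @ experts[e]) for e in range(n_experts))

# Expert collapse shows up as the average load concentrating on few experts;
# a diversification objective would push this distribution toward uniform.
load = gate.mean(axis=0)                          # sums to 1 over experts
```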
We propose a new sensitivity analysis model that combines copulas and normalizing flows for causal inference under unobserved confounding. We refer to the new model as $\rho$-GNF ($\rho$-graphical normalizing flow), where $\rho \in [-1, +1]$ is a bounded sensitivity parameter representing the back-door non-causal association due to unobserved confounding, modeled using the most well-studied and widely popular Gaussian copula. Specifically, $\rho$-GNF enables us to estimate and analyze the front-door causal effect, or average causal effect (ACE), as a function of $\rho$. We call this the $\rho_{curve}$. The $\rho_{curve}$ enables us to specify the confounding strength required to nullify the ACE; we call this the $\rho_{value}$. Moreover, the $\rho_{curve}$ also enables us to provide bounds for the ACE given an interval of $\rho$ values. We illustrate the benefits of $\rho$-GNF with experiments showing that our empirical ACE bounds are narrower than other popular ACE bounds.
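The Gaussian-copula ingredient of $\rho$-GNF can be illustrated in isolation: a single parameter $\rho$ controls the strength of the hidden association between two variables with fixed uniform marginals. The sketch below shows only the copula sampler (not the normalizing flow or the ACE estimation), with sample sizes chosen for illustration:

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(2)

# Standard normal CDF, vectorized over arrays
Phi = np.vectorize(lambda z: 0.5 * (1.0 + erf(z / sqrt(2.0))))

def gaussian_copula_uniforms(rho, size):
    """Uniform pair (U1, U2) whose dependence is a Gaussian copula with parameter rho."""
    z1 = rng.normal(size=size)
    z2 = rho * z1 + sqrt(1.0 - rho**2) * rng.normal(size=size)
    return Phi(z1), Phi(z2)

# rho = 0: no hidden association (no unobserved confounding);
# |rho| -> 1: maximal association, while both marginals stay Uniform(0, 1).
u1_ind, u2_ind = gaussian_copula_uniforms(0.0, 20_000)
u1_dep, u2_dep = gaussian_copula_uniforms(0.9, 20_000)
corr_ind = np.corrcoef(u1_ind, u2_ind)[0, 1]   # near 0
corr_dep = np.corrcoef(u1_dep, u2_dep)[0, 1]   # strongly positive
```

Sweeping $\rho$ over a grid and re-estimating the causal effect at each value is what traces out the $\rho_{curve}$ described above.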
Crop diseases significantly affect the quantity and quality of agricultural production. In precision agriculture, whose goal is to minimize or even avoid the use of pesticides, weather and remote sensing data combined with deep learning can play a pivotal role in detecting crop diseases, allowing localized treatment of crops. However, combining heterogeneous data such as weather and images remains a hot topic and a challenging task. Recent developments in transformer architectures have shown the possibility of fusing data from different domains (e.g., text and images). The current trend is to customize only one transformer to create a multimodal fusion model. In contrast, we propose a new approach that achieves data fusion using three transformers. In this paper, we first solve the missing satellite image problem by interpolating with a ConvLSTM model. Then, a multimodal fusion architecture is proposed that jointly learns to process visual and weather information. The architecture is built from three main components, a Vision Transformer and two transformer encoders, allowing the fusion of both image and weather modalities. The results of the proposed method are promising, achieving an overall accuracy of 97%.
The performance of a classifier is often measured in terms of its average accuracy on test data. Despite being a standard measure, average accuracy fails to characterize the fit of the model to the underlying conditional law of the labels given the feature vectors ($Y|X$), e.g., due to model misspecification, overfitting, and high dimensionality. In this paper, we consider the fundamental problem of assessing the goodness-of-fit of a general binary classifier. Our framework does not make any parametric assumption on the conditional law $Y|X$ and treats it as a black-box oracle model that can only be accessed through queries. We formulate the goodness-of-fit assessment problem as a tolerance hypothesis test of the form \[ H_0: \mathbb{E}\big[D_f\big({\sf Bern}(\eta(X)) \,\|\, {\sf Bern}(\hat{\eta}(X))\big)\big] \leq \tau \,, \] where $D_f$ represents an $f$-divergence function, and $\eta(x)$, $\hat{\eta}(x)$, respectively, denote the true and estimated likelihood of a positive label given the feature vector $x$. We propose a novel test, called GRASP, for testing $H_0$, which works in finite sample settings no matter the features (distribution-free). We also propose Model-X GRASP, designed for model-X settings where the joint distribution of the feature vectors is known. Model-X GRASP uses this distributional information to achieve better power. We evaluate the performance of our tests through extensive numerical experiments.
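The quantity inside $H_0$ has a simple plug-in Monte-Carlo estimate when both the true and estimated likelihoods are available. The sketch below computes it with $D_f$ taken as the KL divergence; it is a toy illustration of the tested quantity only, not the GRASP test itself (which is distribution-free and never observes $\eta$ directly):

```python
import numpy as np

def bern_kl(p, q, eps=1e-12):
    """KL divergence D(Bern(p) || Bern(q)), elementwise."""
    p = np.clip(p, eps, 1 - eps)
    q = np.clip(q, eps, 1 - eps)
    return p * np.log(p / q) + (1 - p) * np.log((1 - p) / (1 - q))

def plug_in_statistic(eta_true, eta_hat):
    """Monte-Carlo estimate of E[ D_f( Bern(eta(X)) || Bern(eta_hat(X)) ) ]."""
    return float(np.mean(bern_kl(eta_true, eta_hat)))

# Toy check: a perfectly calibrated model gives statistic 0,
# a systematically biased one gives a larger statistic.
rng = np.random.default_rng(3)
eta = rng.uniform(0.1, 0.9, size=10_000)
good = plug_in_statistic(eta, eta)
bad  = plug_in_statistic(eta, np.clip(eta + 0.2, 0, 1))
tau = 0.01   # an assumed tolerance; H0 would be rejected only for `bad`
```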
Across universities, faculties must select a number of qualified professors each semester to teach the respective courses. In this regard, factors such as teaching experience, academic training, and competence are considered. This task is usually carried out by an expert (e.g., the head of the faculty). So far, several semi-automatic systems have been proposed to assist the head. In this paper, a fully automatic rule-based expert system is developed. The proposed expert system consists of three main stages. First, the knowledge of human experts is elicited and encoded as decision trees. In the second step, the expert system is designed based on the rules provided by the generated decision trees. In the third step, an algorithm is proposed to weight the outcome of each tree according to the quality of the corresponding expert. To improve the performance of the expert system, a majority voting algorithm is developed as a post-processing step to select the qualified instructor who satisfies the most specialized decision trees for each course. The quality of the proposed expert system is evaluated using real data from an Iranian university. The computed accuracy of 85.55% demonstrates the robustness and accuracy of the proposed system. Compared with related works, the proposed system has little computational complexity. Moreover, simple implementation and transparency (a white-box model) are further features of the proposed system.
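The weighted majority-voting post-processing step described above can be sketched in a few lines; the candidate names and expert-quality weights below are hypothetical:

```python
from collections import Counter

def weighted_majority(votes, weights):
    """Combine per-tree candidate picks; each vote counts with its expert's quality weight."""
    tally = Counter()
    for candidate, w in zip(votes, weights):
        tally[candidate] += w
    return tally.most_common(1)[0][0]

# Three expert decision trees propose an instructor for one course;
# the first expert is judged most reliable but is outvoted.
picks   = ["Dr. A", "Dr. B", "Dr. B"]
weights = [0.9, 0.7, 0.6]
winner = weighted_majority(picks, weights)   # "Dr. B" (0.7 + 0.6 > 0.9)
```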
Monitoring seed maturity is an increasing challenge in agriculture due to climate change and more restrictive practices. In-field seed monitoring is essential to optimize agricultural processes and to guarantee yield quality through high germination. Traditional methods are based on limited sampling in the field and laboratory analysis. Moreover, they are time-consuming and only allow monitoring of sub-sections of the crop field. This leads to a lack of accuracy regarding the overall crop condition, due to intra-field heterogeneity. Multispectral imagery from UAVs can scan fields uniformly and better capture crop maturity information. On the other hand, deep learning methods have shown great potential in estimating agronomic parameters, especially maturity. However, they require large labeled datasets. Although large sets of aerial images are available, labeling them with ground truth is a tedious, if not impossible, task. In this paper, we propose a method for estimating parsley seed maturity using multispectral UAV imagery, with a new automatic data labeling approach. This labeling approach is based on parametric and non-parametric models to provide weak labels. We also consider the data acquisition protocol and the performance evaluation of the different steps of the method. Results show good performance, and the non-parametric kernel density estimator model can improve the generalization of neural networks when used as a labeling method, leading to more robust and better-performing deep neural models.
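The non-parametric weak-labeling idea can be sketched with a one-dimensional kernel density estimator: class-conditional densities are fit on a few ground-truth samples, and each unlabeled pixel receives the label of the denser class. The feature values (a vegetation-index-like quantity) and the bandwidth below are assumptions for illustration, not the paper's protocol:

```python
import numpy as np

def kde_pdf(samples, x, bandwidth=0.1):
    """1-D Gaussian kernel density estimate fit on `samples`, evaluated at `x`."""
    diffs = (x[:, None] - samples[None, :]) / bandwidth
    return np.exp(-0.5 * diffs**2).mean(axis=1) / (bandwidth * np.sqrt(2 * np.pi))

rng = np.random.default_rng(4)
# Hypothetical index values from a few ground-truth field samples per class
mature   = rng.normal(0.30, 0.05, size=50)
immature = rng.normal(0.60, 0.05, size=50)

# Weak-label unlabeled pixels by comparing the class-conditional densities
pixels = np.array([0.28, 0.33, 0.58, 0.62])
weak_labels = np.where(kde_pdf(mature, pixels) > kde_pdf(immature, pixels),
                       "mature", "immature")
```

The weak labels produced this way can then serve as training targets for a deep model, which is the role the kernel density estimator plays in the pipeline above.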