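Compared with the bottom-up framework, the top-down instance segmentation framework has shown its superiority in object detection. While it addresses over-segmentation effectively, top-down instance segmentation suffers from the over-crop problem. A complete segmentation mask is nevertheless crucial for biological image analysis, because it carries important morphological properties such as shape and volume. In this paper, we propose a region proposal rectification (RPR) module to address this challenging incomplete-segmentation problem. In particular, we offer a progressive ROIAlign module that gradually introduces neighbor information into a series of ROIs. The ROI features are fed into an attentive feed-forward network (FFN) for proposal box regression. With the additional neighbor information, the proposed RPR module significantly improves the correction of region proposal locations and thereby achieves favorable instance segmentation performance on three biological image datasets compared with state-of-the-art baseline methods. Experimental results show that the proposed RPR module is effective in both anchor-based and anchor-free top-down instance segmentation approaches, which suggests that the method can be applied to general top-down instance segmentation of biological images. Code is available.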
Translated by Google Translate
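Data-free knowledge distillation (DFKD) aims to train a lightweight student network from a teacher network without any training data. Existing approaches mainly follow the paradigm of generating informative samples and progressively updating the student model by targeting data priors, boundary samples, or memory samples. However, previous DFKD methods struggle to adjust the generation strategy dynamically across training stages, which in turn makes efficient and stable training hard to achieve. In this paper, we explore how to teach the student model from a curriculum learning (CL) perspective and propose a new approach, "CuDFKD", short for "Data-Free Knowledge Distillation with Curriculum". It learns gradually from easy samples to difficult ones, which is similar to the way humans learn. In addition, we provide a theoretical analysis based on the majorization-minimization (MM) algorithm and explain the convergence of CuDFKD. Experiments on benchmark datasets show that, with a simple curriculum design strategy, CuDFKD achieves the best performance among state-of-the-art DFKD methods across different benchmarks, for example 95.28% top-1 accuracy for the ResNet18 model on CIFAR10, which is better than training from scratch with data. Training is fast, reaching an accuracy of 90% within 30 epochs, and the variance during training is stable. The applicability of CuDFKD is also analyzed and discussed in this paper.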
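Semantic segmentation is a key technique in the automatic interpretation of high-resolution remote sensing (HRS) imagery and has drawn wide attention in the remote sensing community. Owing to their hierarchical representation ability, deep convolutional neural networks (DCNNs) have been applied successfully to the HRS image semantic segmentation task. However, their heavy dependence on large amounts of training data and their sensitivity to changes in the data distribution severely restrict the potential application of DCNNs to the semantic segmentation of HRS imagery. This study proposes a novel unsupervised domain adaptation semantic segmentation network (MemoryAdaptNet) for the semantic segmentation of HRS imagery. MemoryAdaptNet constructs an output-space adversarial learning scheme to bridge the distribution discrepancy between the source domain and the target domain and to narrow the influence of domain shift. Specifically, we embed an invariant feature memory module to store invariant domain-level context information, because the features obtained from adversarial learning tend to represent only the variant features of the current limited inputs. A category attention-driven invariant domain-level context aggregation module integrates the contents of this memory into the current pseudo-invariant features to further augment the pixel representations. An entropy-based pseudo-label filtering strategy is used to update the memory module with the high-confidence pseudo-invariant features of the current target images. Extensive experiments on three cross-domain tasks indicate that the proposed MemoryAdaptNet is remarkably superior to state-of-the-art methods.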
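The entropy-based pseudo-label filtering step mentioned in the MemoryAdaptNet abstract can be illustrated with a minimal sketch. It is an assumption-laden illustration, not the authors' implementation: the function names, the per-pixel list representation, and the `max_entropy` threshold are all hypothetical, and the abstract does not state how the threshold is chosen.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of one class-probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def filter_pseudo_labels(pixel_probs, max_entropy):
    """Return (pixel_index, pseudo_label) pairs for low-entropy predictions.

    pixel_probs: list of per-pixel class-probability vectors (each sums to 1).
    max_entropy: hypothetical confidence threshold, not taken from the paper.
    """
    kept = []
    for idx, probs in enumerate(pixel_probs):
        if entropy(probs) <= max_entropy:
            # The pseudo label is the most probable class for this pixel.
            label = max(range(len(probs)), key=probs.__getitem__)
            kept.append((idx, label))
    return kept

pixel_probs = [
    [0.97, 0.02, 0.01],  # confident prediction
    [0.40, 0.35, 0.25],  # ambiguous prediction
    [0.05, 0.90, 0.05],  # confident prediction
]
confident = filter_pseudo_labels(pixel_probs, max_entropy=0.5)
```

Only the pixels that pass the filter would be allowed to update the memory; ambiguous predictions are left out so that they do not contaminate the stored context.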
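Load forecasting is essential in the analysis and grid planning of power systems. We therefore propose a household load forecasting method based on federated deep learning and non-intrusive load monitoring (NILM). To the best of our knowledge, this is the first study of federated learning (FL) in NILM-based household load forecasting. In this method, the aggregate power is decomposed into the power of individual appliances by non-intrusive load monitoring, and the power of each appliance is forecast separately with a federated deep learning model. Finally, the predicted power values of the individual appliances are aggregated to form the total power forecast. Specifically, forecasting each appliance separately avoids the errors caused by the strong dependence in the power signal of a single appliance. In the federated deep learning forecasting model, the households that hold the power data share the parameters of their local models rather than the local power data, which guarantees the privacy of household user data. The case results show that the proposed method gives better forecasts than the traditional approach of directly forecasting the aggregate signal as a whole. In addition, experiments in various federated learning settings are designed and carried out to validate the effectiveness of the method.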
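The two aggregation steps described in the load forecasting abstract can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not name the server-side aggregation rule, so sample-weighted parameter averaging (FedAvg-style) is assumed here, and the function names, appliance names, and numbers are hypothetical.

```python
def federated_average(client_params, client_sizes):
    """Sample-weighted average of client parameter vectors (FedAvg-style).

    Assumed aggregation rule; only parameters are exchanged, never raw data.
    """
    total = sum(client_sizes)
    dim = len(client_params[0])
    return [
        sum(params[i] * n for params, n in zip(client_params, client_sizes)) / total
        for i in range(dim)
    ]

def total_load_forecast(per_appliance_forecasts):
    """Sum the per-appliance forecasts, time step by time step."""
    return [sum(step) for step in zip(*per_appliance_forecasts.values())]

# Two households share local model parameters; their power data stays local.
global_params = federated_average(
    client_params=[[1.0, 2.0], [3.0, 4.0]],
    client_sizes=[1, 3],
)

# Hypothetical per-appliance forecasts in watts for three time steps.
forecast = total_load_forecast({
    "fridge": [100.0, 100.0, 50.0],
    "kettle": [0.0, 1500.0, 0.0],
})
```

In this sketch the household with three times as many samples contributes three times the weight to the global parameters, and the total forecast is simply the sum of the appliance-level forecasts.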
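Despite the recent success of long-tailed object detection, almost all long-tailed object detectors are developed on the two-stage paradigm. In practice, one-stage detectors are more prevalent in industry because their simple and fast pipeline is easy to deploy. In the long-tailed scenario, however, this line of work has not been explored so far. In this paper, we investigate whether one-stage detectors can perform well in this setting. We find that the main obstacle preventing one-stage detectors from achieving excellent performance is that, under a long-tailed data distribution, categories suffer from different degrees of positive-negative imbalance. The conventional focal loss balances the training process with the same modulating factor for all categories and therefore fails to handle the long-tailed problem. To address this issue, we propose the Equalized Focal Loss (EFL), which rebalances the loss contributions of positive and negative samples of different categories independently, according to their degree of imbalance. Specifically, EFL adopts a category-relevant modulating factor that can be adjusted dynamically according to the training status of each category. Extensive experiments on the challenging LVIS v1 benchmark demonstrate the effectiveness of the proposed method. With an end-to-end training pipeline, EFL achieves an overall AP of 29.2% and brings significant performance improvements on rare categories, surpassing all existing state-of-the-art methods. Code is available at https://github.com/modeltc/eod.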
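To make the idea of a category-relevant modulating factor concrete, here is a minimal sketch of a focal loss whose focusing parameter differs per category. It is not the EFL formulation itself, which the abstract does not spell out: the rule in `category_gamma`, the `balance` statistic in [0, 1], and all numeric values are hypothetical.

```python
import math

def focal_loss(p, is_positive, gamma):
    """Binary focal loss for one category; p is the predicted probability."""
    p_t = p if is_positive else 1.0 - p
    return -((1.0 - p_t) ** gamma) * math.log(p_t)

def category_gamma(base_gamma, scale, balance):
    """Hypothetical rule: a less balanced category (balance near 0) gets a larger gamma."""
    return base_gamma + scale * (1.0 - balance)

def category_focal_loss(probs, target, gammas):
    """Sum binary focal losses over categories, using one gamma per category.

    probs:  predicted probability per category for one sample.
    target: index of the ground-truth category (None for background).
    gammas: per-category focusing parameters.
    """
    return sum(
        focal_loss(p, j == target, gammas[j])
        for j, p in enumerate(probs)
    )

# Category 0 is well balanced, category 1 is heavily imbalanced (assumed values).
gammas = [category_gamma(2.0, 4.0, b) for b in (1.0, 0.25)]
loss = category_focal_loss([0.8, 0.1], target=0, gammas=gammas)
```

With a single shared gamma this reduces to the conventional focal loss, and with gamma equal to zero each term is plain binary cross-entropy. The per-category gamma is what lets easy negatives be down-weighted more strongly for some categories than for others.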
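Deep learning based models dominate the current landscape of production recommender systems. Moreover, recent years have seen exponential growth in model scale, from Google's 2016 model with 1 billion parameters to the latest Facebook model with 12 trillion parameters. Each jump in model capacity has brought a significant quality boost, which leads us to believe that the era of 100 trillion parameters is around the corner. However, training such models is challenging even within industrial-scale data centers. The difficulty is inherited from the staggering heterogeneity of the training computation: the embedding layer can account for 99.99% of the total model size and is extremely memory-intensive, while the rest of the neural network is increasingly computation-intensive. An efficient distributed training system is urgently needed to support the training of such huge models. In this paper, we address this challenge by carefully co-designing the optimization algorithm and the distributed system architecture. Specifically, to ensure both training efficiency and training accuracy, we design a novel hybrid training algorithm in which the embedding layer and the dense neural network are handled by different synchronization mechanisms. We then build a system called Persia (short for parallel recommendation training system with hybrid acceleration) to support this hybrid training algorithm. Both a theoretical demonstration and empirical studies at up to 100 trillion parameters are conducted to justify the system design and implementation of Persia. We make Persia publicly available (at https://github.com/persiamml/persia) so that anyone can easily train a recommender model at the scale of 100 trillion parameters.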
Object recognition techniques using convolutional neural networks (CNN) have achieved great success. However, state-of-the-art object detection methods still perform poorly on large-vocabulary, long-tailed datasets such as LVIS. In this work, we analyze this problem from a novel perspective: each positive sample of one category can be seen as a negative sample for the other categories, so the tail categories receive more discouraging gradients. Based on this observation, we propose a simple but effective loss, named equalization loss, which tackles the problem of long-tailed rare categories by simply ignoring those gradients for rare categories. The equalization loss protects the learning of rare categories from being at a disadvantage during network parameter updates, so the model is able to learn better discriminative features for objects of rare classes. Without any bells and whistles, our method achieves AP gains of 4.1% and 4.8% for the rare and common categories, respectively, on the challenging LVIS benchmark compared with the Mask R-CNN baseline. Using the equalization loss, we won 1st place in the LVIS Challenge 2019. Code is available at https://github.com/tztztztztz/eql.detectron2
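The core mechanism in the equalization loss abstract, ignoring the discouraging gradients that rare categories receive as negatives, can be sketched as a sigmoid cross-entropy with some negative terms dropped. This is a simplified illustration rather than the released implementation: the `freqs` values and the `rare_threshold` cutoff are hypothetical, and the sketch handles a single foreground proposal only.

```python
import math

def equalization_loss(logits, target, freqs, rare_threshold):
    """Sigmoid cross-entropy that skips negative terms for rare categories.

    logits: raw score per category for one foreground proposal.
    target: index of the ground-truth category.
    freqs:  hypothetical per-category frequency in the training set.
    rare_threshold: hypothetical cutoff below which a category counts as rare.
    """
    loss = 0.0
    for j, z in enumerate(logits):
        p = 1.0 / (1.0 + math.exp(-z))
        if j == target:
            loss -= math.log(p)          # the positive term is always kept
        elif freqs[j] >= rare_threshold:
            loss -= math.log(1.0 - p)    # negative term for a frequent category
        # Otherwise the category is rare and acts as a negative here:
        # its term is dropped, so it receives no discouraging gradient.
    return loss

freqs = [0.5, 0.4, 0.001]  # category 2 is rare (assumed values)
loss = equalization_loss([0.0, 0.0, 0.0], target=0, freqs=freqs, rare_threshold=0.01)
```

Because the rare category's negative term is omitted, changing its logit leaves the loss for this sample unchanged, which is exactly the sense in which its gradient is ignored.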
Weakly-supervised object localization aims to indicate the category as well as the extent of an object in an image given only image-level labels. Most existing works are based on Class Activation Mapping (CAM) and endeavor to enlarge the discriminative area inside the activation map so as to perceive the whole object, yet they ignore the co-occurrence confounder of object and context (e.g., fish and water), which makes it hard for the model to distinguish object boundaries. In addition, the use of CAM creates a dilemma: classification and localization always suffer from a performance gap and cannot reach their highest accuracy simultaneously. In this paper, we propose a causal knowledge distillation method, dubbed KD-CI-CAM, to address these two under-explored issues in one go. More specifically, we tackle the co-occurrence context confounder problem via causal intervention (CI), which explores the causalities among image features, contexts, and categories to eliminate the biased object-context entanglement in the class activation maps. Based on the de-biased object feature, we additionally propose a multi-teacher causal distillation framework to balance the absorption of classification knowledge and localization knowledge during model training. Extensive experiments on several benchmarks demonstrate the effectiveness of KD-CI-CAM in learning clear object boundaries from confounding contexts and in resolving the dilemma between classification and localization performance.
Knowledge graph embedding (KGE), which maps entities and relations in a knowledge graph into continuous vector spaces, has achieved great success in predicting missing links in knowledge graphs. However, knowledge graphs often contain incomplete triples that are difficult for KGEs to infer inductively. To address this challenge, we resort to analogical inference and propose AnKGE, a novel and general self-supervised framework that enhances KGE models with analogical inference capability. We propose an analogical object retriever that retrieves appropriate analogical objects at the entity level, relation level, and triple level. In AnKGE, we train an analogy function for each level of analogical inference; it takes the original element embedding from a well-trained KGE model as input and outputs the analogical object embedding. To combine the inductive inference capability of the original KGE model with the analogical inference capability added by AnKGE, we interpolate the analogy score with the base model score and introduce adaptive weights in the score function for prediction. Through extensive experiments on the FB15k-237 and WN18RR datasets, we show that AnKGE achieves competitive results on the link prediction task and performs analogical inference well.
When robots learn reward functions using high-capacity models that take raw state directly as input, they need to learn both a representation for what matters in the task (the task "features") and how to combine these features into a single objective. If they try to do both at once from input designed to teach the full reward function, it is easy to end up with a representation that contains spurious correlations in the data and fails to generalize to new settings. Instead, our ultimate goal is to enable robots to identify and isolate the causal features that people actually care about and use when they represent states and behavior. Our idea is that we can tune into this representation by asking users which behaviors they consider similar: behaviors will be similar if the features that matter are similar, even if the low-level behavior is different; conversely, behaviors will be different if even one of the features that matter differs. This, in turn, is what enables the robot to disambiguate what needs to go into the representation from what is spurious, and which aspects of behavior can be compressed together from which cannot. The notion of learning representations based on similarity has a nice parallel in contrastive learning, a self-supervised representation learning technique that maps visually similar data points to similar embeddings, where similarity is defined by a designer through data augmentation heuristics. By contrast, in order to learn the representations that people use, so that we can learn their preferences and objectives, we use their definition of similarity. In simulation as well as in a user study, we show that learning through such similarity queries leads to representations that, while far from perfect, are indeed more generalizable than self-supervised and task-input alternatives.