Smart Paper Notes

Reducing Domain Gap in Frequency and Spatial domain for Cross-modality Domain Adaptation on Medical Image Segmentation

Shaolei Liu , Siqi Yin , Linhao Qu , Manning Wang

Category: Computer Vision

2022-11-28

Unsupervised domain adaptation (UDA) aims to learn a model on a source domain that performs well on an unlabeled target domain. In medical image segmentation, most existing UDA methods depend on adversarial learning to address the domain gap between different image modalities, which is ineffective due to its complicated training process. In this paper, we propose a simple yet effective UDA method based on frequency and spatial domain transfer under a multi-teacher distillation framework. In the frequency domain, we first introduce the non-subsampled contourlet transform to identify domain-invariant and domain-variant frequency components (DIFs and DVFs), and then keep the DIFs unchanged while replacing the DVFs of the source domain images with those of the target domain images to narrow the domain gap. In the spatial domain, we propose a batch momentum update-based histogram matching strategy to reduce the domain-variant image style bias. Experiments on two cross-modality medical image segmentation datasets (cardiac, abdominal) show that our proposed method achieves superior performance compared to state-of-the-art methods.
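
The spatial-domain idea can be illustrated with a minimal sketch. The abstract gives no implementation details, so everything here is an assumption made for illustration: integer intensities in a fixed number of bins, a running target histogram blended across batches with a momentum coefficient, and classic CDF-based histogram matching. The function names `histogram`, `momentum_update`, and `match_histogram` are hypothetical and are not taken from the authors' code.

```python
from bisect import bisect_left
from itertools import accumulate


def histogram(values, bins=256):
    """Normalized histogram of integer intensities in [0, bins)."""
    counts = [0.0] * bins
    for v in values:
        counts[v] += 1.0
    total = sum(counts)
    return [c / total for c in counts]


def momentum_update(running, batch_hist, momentum=0.9):
    """Blend the running target histogram with the current batch histogram."""
    return [momentum * r + (1.0 - momentum) * b
            for r, b in zip(running, batch_hist)]


def match_histogram(source, target_hist, bins=256):
    """Map each source intensity to the target intensity at the same quantile."""
    src_cdf = list(accumulate(histogram(source, bins)))
    tgt_cdf = list(accumulate(target_hist))
    # Lookup table: for each source bin, the first target bin whose CDF
    # reaches the source CDF (small tolerance guards against float noise).
    lut = [min(bisect_left(tgt_cdf, c - 1e-12), bins - 1) for c in src_cdf]
    return [lut[v] for v in source]


# Toy usage with 4 intensity levels (assumed data, not from the paper).
running = histogram([2, 3, 3, 3], bins=4)
running = momentum_update(running, histogram([2, 2, 2, 3], bins=4), momentum=0.5)
matched = match_histogram([0, 0, 1, 1], running, bins=4)
print(matched)  # [2, 2, 3, 3]
```

Keeping a running histogram rather than matching against a single target image is one plausible reading of "batch momentum update": it smooths the target style estimate over many batches instead of tying it to one sample.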

Translated by Google Translate

Robust Point Cloud Registration Framework Based on Deep Graph Matching (TPAMI Version)

Kexue Fu , Jiazheng Luo , Xiaoyuan Luo , Shaolei Liu , Chenxi Zhang , Manning Wang

Category: Computer Vision

2022-11-09

3D point cloud registration is a fundamental problem in computer vision and robotics. Recently, learning-based point cloud registration methods have made great progress. However, these methods are sensitive to outliers, which lead to more incorrect correspondences. In this paper, we propose a novel deep graph matching-based framework for point cloud registration. Specifically, we first transform point clouds into graphs and extract deep features for each point. Then, we develop a module based on deep graph matching to calculate a soft correspondence matrix. By using graph matching, not only the local geometry of each point but also its structure and topology in a larger range are considered in establishing correspondences, so that more correct correspondences are found. We train the network with a loss directly defined on the correspondences, and in the test stage the soft correspondences are transformed into hard one-to-one correspondences so that registration can be performed by a correspondence-based solver. Furthermore, we introduce a transformer-based method to generate edges for graph construction, which further improves the quality of the correspondences. Extensive experiments on object-level and scene-level benchmark datasets show that the proposed method achieves state-of-the-art performance. The code is available at: https://github.com/fukexue/RGM.
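
The test-stage step of turning soft correspondences into hard one-to-one correspondences can be sketched as follows. The abstract does not say which assignment procedure is used, so this example substitutes a simple greedy rule (repeatedly take the highest remaining score); an optimal assignment solver such as the Hungarian algorithm would be an alternative. The function name `soft_to_hard` and the toy matrix are hypothetical.

```python
def soft_to_hard(soft):
    """Greedy one-to-one assignment from a soft correspondence matrix.

    soft[i][j] is the matching score between source point i and target
    point j. Returns a dict mapping source index -> target index in which
    every source and every target index appears at most once.
    """
    n_rows, n_cols = len(soft), len(soft[0])
    # Visit all (score, i, j) triples from highest to lowest score.
    candidates = sorted(
        ((soft[i][j], i, j) for i in range(n_rows) for j in range(n_cols)),
        reverse=True,
    )
    used_rows, used_cols, matches = set(), set(), {}
    for _score, i, j in candidates:
        if i not in used_rows and j not in used_cols:
            matches[i] = j
            used_rows.add(i)
            used_cols.add(j)
    return matches


# Toy soft correspondence matrix (assumed values for illustration).
soft = [
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.1, 0.1, 0.8],
]
hard = soft_to_hard(soft)
print(sorted(hard.items()))  # [(0, 0), (1, 1), (2, 2)]
```

Note that source point 1 prefers target 0 on its own row, but the one-to-one constraint forces it onto target 1 because source point 0 has the stronger claim. The resulting pairs are what a correspondence-based solver would then consume.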

PointCLM: A Contrastive Learning-based Framework for Multi-instance Point Cloud Registration

Mingzhi Yuan , Zhihao Li , Qiuye Jin , Xinrong Chen , Manning Wang

Category: Computer Vision

2022-09-01

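Multi-instance point cloud registration is the problem of estimating multiple poses of source point cloud instances within a target point cloud. Solving this problem is challenging because the inlier correspondences of one instance constitute outliers for all the other instances. Existing methods often rely on time-consuming hypothesis sampling or on features that leverage spatial consistency, resulting in limited performance. In this paper, we propose PointCLM, a contrastive learning-based framework for multi-instance point cloud registration. We first use contrastive learning to learn well-distributed deep representations for the input putative correspondences. Then, based on these representations, we propose an outlier pruning strategy and a clustering strategy to efficiently remove outliers and assign the remaining correspondences to the correct instances. Our method outperforms state-of-the-art methods on both synthetic and real datasets.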

Towards Label-efficient Automatic Diagnosis and Analysis: A Comprehensive Survey of Advanced Deep Learning-based Weakly-supervised, Semi-supervised and Self-supervised Techniques in Histopathological Image Analysis

Linhao Qu , Siyu Liu , Xiaoyu Liu , Manning Wang , Zhijian Song

Category: Computer Vision

2022-08-18

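Histopathological images contain abundant phenotypic information and pathological patterns; they are the gold standard for disease diagnosis and are essential for predicting patient prognosis and treatment outcome. In recent years, computer-automated analysis techniques for histopathological images have been urgently needed in clinical practice, and deep learning methods represented by convolutional neural networks have gradually become the mainstream in digital pathology. However, obtaining large amounts of fine-grained annotated data in this field is a very expensive and difficult task, which hinders the further development of traditional supervised algorithms that depend on large amounts of annotated data. Recent studies have started to move away from the traditional supervised paradigm; the most representative are studies on the weakly supervised learning paradigm based on weak annotation, the semi-supervised learning paradigm based on limited annotation, and the self-supervised learning paradigm based on pathological image representation learning. These new methods have led a new wave of automatic pathological image diagnosis and analysis aimed at annotation efficiency. Based on a survey of 130 papers, we present a comprehensive and systematic review of the latest studies on weakly supervised, semi-supervised, and self-supervised learning in computational pathology from both technical and methodological perspectives. Finally, we present the key challenges and future trends for these techniques.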

Point-McBert: A Multi-choice Self-supervised Framework for Point Cloud Pre-training

Kexue Fu , Mingzhi Yuan , Manning Wang

Category: Computer Vision

2022-07-27

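Masked language modeling (MLM) has become one of the most successful self-supervised pre-training tasks. Inspired by its success, Point-BERT, a pioneering work for point clouds, proposed masked point modeling (MPM) to pre-train point transformers on large-scale unannotated datasets. Despite its strong performance, we find that the inherent difference between language and point clouds tends to cause ambiguous tokenization for point clouds: there is no gold standard for point cloud tokenization. Although Point-BERT introduces a discrete variational autoencoder (dVAE) as a tokenizer to assign token IDs to local patches, it tends to generate ambiguous token IDs for local patches. We find that this imperfect tokenizer may generate different token IDs for semantically similar patches and the same token IDs for semantically dissimilar patches. To tackle this problem, we propose Point-McBert, a pre-training framework with eased and refined supervision signals. Specifically, we ease the previous single-choice constraint on patches and provide multi-choice token IDs for each patch as supervision. Moreover, we use the high-level semantics learned by the transformer to further refine our supervision signals. Extensive experiments on point cloud classification, few-shot classification, and part segmentation tasks demonstrate the superiority of our method; for example, the pre-trained transformer achieves 94.1% accuracy on ModelNet40, 84.28% accuracy on ScanObjectNN, and new state-of-the-art performance on few-shot learning. We also demonstrate that our method not only improves the performance of Point-BERT on all downstream tasks, but also incurs almost no extra computational overhead.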
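
The contrast between single-choice and multi-choice supervision can be illustrated with a small sketch. The abstract does not specify how the multi-choice targets are built, so this example assumes one simple possibility: taking a softmax over the tokenizer's per-token scores instead of a one-hot argmax. All function names and the toy scores are hypothetical and are not the authors' implementation.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of scores."""
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return [x - log_norm for x in logits]


def single_choice_target(tokenizer_logits):
    """One-hot target: only the top-scoring token ID counts as correct."""
    best = max(range(len(tokenizer_logits)), key=tokenizer_logits.__getitem__)
    return [1.0 if i == best else 0.0 for i in range(len(tokenizer_logits))]


def multi_choice_target(tokenizer_logits, temperature=1.0):
    """Soft target: several plausible token IDs share the probability mass."""
    scaled = [x / temperature for x in tokenizer_logits]
    return [math.exp(lp) for lp in log_softmax(scaled)]


def cross_entropy(pred_logits, target_probs):
    """Cross-entropy between model predictions and a (soft or hard) target."""
    return -sum(t * lp for t, lp in zip(target_probs, log_softmax(pred_logits)))


# A patch whose two best token IDs score almost equally (assumed values).
tokenizer_logits = [2.0, 1.9, -3.0]
hard = single_choice_target(tokenizer_logits)
soft = multi_choice_target(tokenizer_logits)

# A model that predicts token 1 is penalized far less under the soft target.
pred = [0.0, 4.0, 0.0]
print(cross_entropy(pred, hard) > cross_entropy(pred, soft))  # True
```

With the one-hot target, a prediction of the near-tied second token is treated as fully wrong; with the soft target, it is treated as a nearly equally valid choice, which is the kind of ambiguity the eased constraint is meant to absorb.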

DGMIL: Distribution Guided Multiple Instance Learning for Whole Slide Image Classification

Linhao Qu , Xiaoyuan Luo , Shaolei Liu , Manning Wang , Zhijian Song

Category: Computer Vision

2022-06-17

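Multiple instance learning (MIL) is widely used to analyze histopathological whole slide images (WSIs). However, existing MIL methods do not explicitly model the data distribution; instead, they only learn a bag-level or instance-level decision boundary discriminatively by training a classifier. In this paper, we propose DGMIL, a feature distribution guided deep MIL framework for WSI classification and positive patch localization. Instead of designing complex discriminative network architectures, we reveal that the inherent feature distribution of histopathological image data can serve as a very effective guide for classification. We propose a cluster-conditioned feature distribution modeling method and a pseudo label-based iterative feature space refinement strategy so that, in the final feature space, positive and negative instances can be easily separated. Experiments on the CAMELYON16 dataset and the TCGA Lung Cancer dataset show that our method achieves new state-of-the-art results on both global classification and positive patch localization tasks.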

Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

Aarohi Srivastava , Abhinav Rastogi , Abhishek Rao , Abu Awal Md Shoeb , Abubakar Abid , Adam Fisch , Adam R. Brown , Adam Santoro , Aditya Gupta , Adrià Garriga-Alonso

Category: Natural Language Processing | Artificial Intelligence | Machine Learning | Machine Learning (Statistics)

2022-06-09

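Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. To inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-future capabilities and limitations of language models. To address this challenge, we introduce the Beyond the Imitation Game benchmark (BIG-bench). BIG-bench currently consists of 204 tasks, contributed by 442 authors across 132 institutions. Task topics are diverse, drawing problems from linguistics, childhood development, math, common-sense reasoning, biology, physics, social bias, software development, and beyond. BIG-bench focuses on tasks that are believed to be beyond the capabilities of current language models. We evaluate the behavior of OpenAI's GPT models, Google-internal dense transformer architectures, and Switch-style sparse transformers on BIG-bench, across model sizes spanning millions to billions of parameters. In addition, a team of human expert raters performed all tasks to provide a strong baseline. Findings include: model performance and calibration both improve with scale, but are poor in absolute terms (and when compared with rater performance); performance is remarkably similar across model classes, though with benefits from sparsity; tasks that improve gradually and predictably commonly involve a large knowledge or memorization component, whereas tasks that exhibit "breakthrough" behavior at a critical scale often involve multiple steps or components, or brittle metrics; social bias typically increases with scale in settings with ambiguous context, but this can be improved with prompting.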

TransMEF: A Transformer-Based Multi-Exposure Image Fusion Framework using Self-Supervised Multi-Task Learning

Linhao Qu , Shaolei Liu , Manning Wang , Zhijian Song

Category: Computer Vision

2021-12-02

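In this paper, we propose TransMEF, a transformer-based multi-exposure image fusion framework that uses self-supervised multi-task learning. The framework is based on an encoder-decoder network, which can be trained on large natural image datasets and does not require ground-truth fusion images. We design three self-supervised reconstruction tasks according to the characteristics of multi-exposure images and carry out these tasks simultaneously using multi-task learning; through this process, the network can learn the characteristics of multi-exposure images and extract more generalized features. In addition, to compensate for the weakness of CNN-based architectures in establishing long-range dependencies, we design an encoder that combines a CNN module with a transformer module. This combination enables the network to focus on both local and global information. We evaluated our method and compared it with 11 competitive traditional and deep learning-based methods on the latest released multi-exposure image fusion benchmark dataset, and our method achieved the best performance in both subjective and objective evaluations.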

On the Opportunities and Risks of Foundation Models

Rishi Bommasani , Drew A. Hudson , Ehsan Adeli , Russ Altman , Simran Arora , Sydney von Arx , Michael S. Bernstein , Jeannette Bohg , Antoine Bosselut , Emma Brunskill

Category: Machine Learning | Artificial Intelligence

2021-08-16

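AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are trained on broad data at scale and are adaptable to a wide range of downstream tasks. We call these models foundation models to underscore their critically central yet incomplete character. This report provides a thorough account of the opportunities and risks of foundation models, ranging from their capabilities (e.g., language, vision, robotics, reasoning, human interaction) and technical principles (e.g., model architectures, training procedures, data, systems, security, evaluation, theory) to their applications (e.g., law, healthcare, education) and societal impact (e.g., inequity, misuse, economic and environmental impact, legal and ethical considerations). Though foundation models are based on standard deep learning and transfer learning, their scale results in new emergent capabilities, and their effectiveness across so many tasks incentivizes homogenization. Homogenization provides powerful leverage but demands caution, as the defects of the foundation model are inherited by all the adapted models downstream. Despite the impending widespread deployment of foundation models, we currently lack a clear understanding of how they work, when they fail, and what they are even capable of due to their emergent properties. To tackle these questions, we believe much of the critical research on foundation models will require deep interdisciplinary collaboration commensurate with their fundamentally sociotechnical nature.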

Localization & Mapping Requirements for Level 2+ Autonomous Vehicles

Tyler G. R. Reid , Andrew Neish , Brian Manning

Category: Robotics

2022-12-16

Autonomous vehicles are being deployed with a spectrum of capability, extending from driver assistance features for the highway in personal vehicles (SAE Level 2+) to fully autonomous fleet ride sharing services operating in complex city environments (SAE Level 4+). This spectrum of autonomy often operates in different physical environments with different degrees of assumed driver in-the-loop oversight and hence has very different system and subsystem requirements. At the heart of SAE Level 2 to 5 systems is localization and mapping, which ranges from road determination for feature geofencing or high-level routing, through lane determination for advanced driver assistance, to where-in-lane positioning for full vehicle control. We assess localization and mapping requirements for different levels of autonomy and supported features. This work provides a framework for system decomposition, including the level of redundancy needed to achieve the target level of safety. We examine several representative autonomous and assistance features and make recommendations on positioning requirements as well as map georeferencing and information integrity.
