Smart Paper Notes

Swin MAE: Masked Autoencoders for Small Datasets

Zi'an Xu, Yin Dai, Fayu Liu, Weibing Chen, Yue Liu, Lifu Shi, Sheng Liu, Yuhang Zhou

Categories: Computer Vision | Artificial Intelligence

2022-12-28

The development of deep learning models in medical image analysis is largely limited by the lack of large, well-annotated datasets. Unsupervised learning does not require labels and is therefore better suited to medical image analysis problems. However, most current unsupervised learning methods need to be applied to large datasets. To make unsupervised learning applicable to small datasets, we propose Swin MAE, a masked autoencoder with Swin Transformer as its backbone. Even on a dataset of only a few thousand medical images and without using any pre-trained models, Swin MAE is still able to learn useful semantic features purely from images. In transfer learning on downstream tasks, it equals or even slightly outperforms the supervised Swin Transformer model trained on ImageNet. The code will be publicly available soon.
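
To illustrate the masked-autoencoder idea mentioned in this abstract, here is a minimal sketch of random patch masking, the step that hides part of an image so the network must reconstruct it. The function name `random_patch_mask`, the 14 x 14 patch grid, and the 0.75 mask ratio are illustrative assumptions, not values taken from the abstract; the masking strategy actually used by Swin MAE may differ.

```python
import random


def random_patch_mask(num_patches, mask_ratio, seed=None):
    """Split patch indices into (visible, masked) lists for one image.

    Hypothetical helper: the encoder would see only the visible patches,
    and the decoder would be trained to reconstruct the masked ones.
    """
    if not 0.0 <= mask_ratio < 1.0:
        raise ValueError("mask_ratio must be in [0, 1)")
    rng = random.Random(seed)
    order = list(range(num_patches))
    rng.shuffle(order)  # random permutation of patch indices
    num_masked = int(num_patches * mask_ratio)
    masked = sorted(order[:num_masked])
    visible = sorted(order[num_masked:])
    return visible, masked


# Assumed setup: a 14 x 14 grid of patches with 75% of them hidden.
visible, masked = random_patch_mask(num_patches=14 * 14, mask_ratio=0.75, seed=0)
print(len(visible), len(masked))
```

Fixing the seed makes the split reproducible; in training, a fresh mask would normally be drawn for every image and every step.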


Parotid Gland MR Image Segmentation Based on Contrastive Learning

Zi'an Xu, Yin Dai, Fayu Liu, Boyuan Wu, Weibing Chen, Lifu Shi

Categories: Computer Vision | Artificial Intelligence

2022-08-26

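Compared with natural images, medical images are difficult to acquire and costly to label. As an unsupervised learning method, contrastive learning can make more effective use of unlabeled medical images. In this paper, we use a Transformer-based contrastive learning method and, as a novel step, train the contrastive learning network with transfer learning. The resulting model is then transferred to the downstream parotid gland segmentation task, where it improves the performance of the parotid segmentation model on the test set. The improved DSC is 89.60%, MPA is 99.36%, MIoU is 85.11%, and HD is 2.98. All four metrics show significant improvement over the results obtained when a supervised learning model is used as the pre-trained model for the parotid segmentation network. In addition, we find that the contrastive learning model improves the segmentation network mainly in the encoder part, so this paper also attempts to build a contrastive learning network for the decoder part and discusses the problems encountered in building it.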


Parotid Gland MRI Segmentation Based on Swin-Unet and Multimodal Images

Yin Dai, Zi'an Xu, Fayu Liu, Siqi Li, Sheng Liu, Lifu Shi, Jun Fu

Categories: Computer Vision | Machine Learning

2022-06-07

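Parotid gland tumors account for approximately 2% to 10% of head and neck tumors. Preoperative tumor localization, differential diagnosis, and the subsequent selection of appropriate treatment for parotid gland tumors are critical. However, the relative rarity of these tumors and their highly dispersed tissue types have left an unmet need for subtle differential diagnosis of such neoplastic lesions based on preoperative radiomics. Recently, deep learning methods have developed rapidly, and Transformers in particular have outperformed traditional convolutional neural networks in computer vision. Many new Transformer-based networks have been proposed for computer vision tasks. In this study, multicenter, multimodal parotid gland MRI images were collected, and the Transformer-based Swin-Unet was used. MRI images of the STIR, T1, and T2 modalities were combined into three-channel data to train the network. We achieved segmentation of the regions of interest for the parotid gland and tumor. On the test set, the model's DSC was 88.63%, MPA was 99.31%, MIoU was 83.99%, and HD was 3.04. A series of comparison experiments was then designed to further validate the segmentation performance of the algorithm.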
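
As a rough illustration of two ideas in this abstract, the sketch below stacks three co-registered slices into three-channel data and computes the Dice similarity coefficient (DSC) and intersection over union (IoU) for a binary mask. The function names and the toy 2 x 3 masks are assumptions made for illustration, and the authors' preprocessing and evaluation code may differ. MIoU as reported would be the mean of IoU over classes; only the single-class value is shown here.

```python
def stack_modalities(stir, t1, t2):
    """Combine three same-sized 2D slices into one H x W x 3 nested list."""
    if not (len(stir) == len(t1) == len(t2)):
        raise ValueError("all modalities must have the same number of rows")
    stacked = []
    for row_s, row_1, row_2 in zip(stir, t1, t2):
        if not (len(row_s) == len(row_1) == len(row_2)):
            raise ValueError("all modalities must have the same row length")
        stacked.append([[s, a, b] for s, a, b in zip(row_s, row_1, row_2)])
    return stacked


def dice_and_iou(pred, target):
    """Return (DSC, IoU) for two binary masks given as 2D lists of 0/1."""
    inter = sum(p & t for rp, rt in zip(pred, target) for p, t in zip(rp, rt))
    pred_sum = sum(sum(row) for row in pred)
    target_sum = sum(sum(row) for row in target)
    if pred_sum + target_sum == 0:
        return 1.0, 1.0  # both masks empty: treat as perfect agreement
    union = pred_sum + target_sum - inter
    return 2.0 * inter / (pred_sum + target_sum), inter / union


# Toy masks (hypothetical data): 2 overlapping pixels, 3 foreground pixels each.
pred = [[1, 1, 0], [0, 1, 0]]
target = [[1, 0, 0], [0, 1, 1]]
dsc, iou = dice_and_iou(pred, target)
print(round(dsc, 4), iou)  # 0.6667 0.5
```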


Retire: Robust Expectile Regression in High Dimensions

Rebeka Man, Kean Ming Tan, Zian Wang, Wen-Xin Zhou

Categories: Machine Learning (Statistics)

2022-12-11

High-dimensional data can often display heterogeneity due to heteroscedastic variance or inhomogeneous covariate effects. Penalized quantile and expectile regression methods offer useful tools to detect heteroscedasticity in high-dimensional data. The former is computationally challenging due to the non-smooth nature of the check loss, and the latter is sensitive to heavy-tailed error distributions. In this paper, we propose and study (penalized) robust expectile regression (retire), with a focus on iteratively reweighted $\ell_1$-penalization which reduces the estimation bias from $\ell_1$-penalization and leads to oracle properties. Theoretically, we establish the statistical properties of the retire estimator under two regimes: (i) low-dimensional regime in which $d \ll n$; (ii) high-dimensional regime in which $s\ll n\ll d$ with $s$ denoting the number of significant predictors. In the high-dimensional setting, we carefully characterize the solution path of the iteratively reweighted $\ell_1$-penalized retire estimation, adapted from the local linear approximation algorithm for folded-concave regularization. Under a mild minimum signal strength condition, we show that after as many as $\log(\log d)$ iterations the final iterate enjoys the oracle convergence rate. At each iteration, the weighted $\ell_1$-penalized convex program can be efficiently solved by a semismooth Newton coordinate descent algorithm. Numerical studies demonstrate the competitive performance of the proposed procedure compared with either non-robust or quantile regression based alternatives.
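
To make the contrast between expectile regression and a robust variant concrete, the sketch below compares the asymmetric squared loss with a Huberized version of it. The abstract does not state the loss itself, so the specific form used here (the asymmetric weight $|\tau - 1(u<0)|$ multiplied by a Huber loss with robustification parameter `gamma`, with a 1/2 scaling on the quadratic part) is an assumption, and all function names are hypothetical.

```python
def huber(u, gamma):
    """Huber loss: quadratic for |u| <= gamma, linear beyond that."""
    a = abs(u)
    if a <= gamma:
        return 0.5 * u * u
    return gamma * a - 0.5 * gamma * gamma


def expectile_loss(u, tau):
    """Asymmetric squared loss for a residual u at expectile level tau."""
    weight = tau if u >= 0 else 1.0 - tau
    return weight * 0.5 * u * u


def robust_expectile_loss(u, tau, gamma):
    """Asymmetric Huber loss (assumed form): same weight, Huberized residual."""
    weight = tau if u >= 0 else 1.0 - tau
    return weight * huber(u, gamma)


# Small residuals are treated identically; a large residual is penalized
# linearly rather than quadratically by the robust version.
for u in (0.5, -3.0, 10.0):
    print(u, expectile_loss(u, 0.8), robust_expectile_loss(u, 0.8, gamma=1.0))
```

The linear growth for large residuals is what limits the influence of heavy-tailed errors, while the loss stays smooth, unlike the check loss used in quantile regression.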


Shoupa: An AI System for Early Diagnosis of Parkinson's Disease

Jingwei Li, Ruitian Wu, Tzu-liang Huang, Zian Pan, Ming-chun Huang

Categories: Artificial Intelligence

2022-11-28

Parkinson's Disease (PD) is a progressive nervous system disorder that has affected more than 5.8 million people, especially the elderly. Due to the complexity of its symptoms and its similarity to other neurological disorders, early detection requires neurologists or PD specialists to be involved, which is not accessible to most elderly people. Therefore, we integrate smart mobile devices with AI technologies. In this paper, we introduce the framework of our PD early detection system, which combines different tasks evaluating both motor and non-motor symptoms. With the developed model, we help users detect PD in a timely manner in non-clinical settings and identify their most severe symptoms. The results are expected to be further used for PD rehabilitation guidance and detection of other neurological disorders.


GET3D: A Generative Model of High Quality 3D Textured Shapes Learned from Images

Jun Gao, Tianchang Shen, Zian Wang, Wenzheng Chen, Kangxue Yin, Daiqing Li, Or Litany, Zan Gojcic, Sanja Fidler

Categories: Computer Vision

2022-09-22

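As several industries move towards modeling massive 3D virtual worlds, the need for content creation tools that can scale in terms of the quantity, quality, and diversity of 3D content is becoming evident. In our work, we aim to train performant 3D generative models that synthesize textured meshes which can be directly consumed by 3D rendering engines and are thus immediately usable in downstream applications. Prior work on 3D generative modeling either lacks geometric detail, is limited in the mesh topology it can produce, typically does not support textures, or uses neural renderers in the synthesis process, which makes its use in common 3D software non-trivial. In this work, we introduce GET3D, a generative model that directly generates explicit textured 3D meshes with complex topology, rich geometric details, and high-fidelity textures. We bridge recent successes in differentiable surface modeling, differentiable rendering, and 2D generative adversarial networks to train our model from 2D image collections. GET3D is able to generate high-quality 3D textured meshes, ranging from cars, chairs, animals, motorbikes, and human characters to buildings, achieving significant improvements over previous methods.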


Neural Light Field Estimation for Street Scenes with Differentiable Virtual Object Insertion

Zian Wang, Wenzheng Chen, David Acuna, Jan Kautz, Sanja Fidler

Categories: Computer Vision

2022-08-19

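We consider the challenging problem of outdoor lighting estimation for the goal of photorealistic virtual object insertion into photographs. Existing works on outdoor lighting estimation typically simplify the scene lighting into an environment map, which cannot capture the spatially varying lighting effects in outdoor scenes. In this work, we propose a neural approach that estimates the 5D HDR light field from a single image, together with a differentiable object insertion formulation that enables end-to-end training with image-based losses that encourage realism. Specifically, we design a hybrid lighting representation tailored to outdoor scenes, which contains an HDR sky dome that handles the extreme intensity of the sun and a volumetric lighting representation that models the spatially varying appearance of the surrounding scene. With the estimated lighting, our shadow-aware object insertion is fully differentiable, which enables adversarial training over the composited image to provide an additional supervisory signal to the lighting prediction. We experimentally demonstrate that our hybrid lighting representation is more performant than existing outdoor lighting estimation methods. We further show the benefits of our AR object insertion in an autonomous driving application, where we obtain performance gains for a 3D object detector when it is trained on our augmented data.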


Learning Intrinsic Images for Clothing

Kuo Jiang, Zian Wang, Xiaodong Yang

Categories: Computer Vision

2021-11-16

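Reconstruction of human clothing is an important task and often relies on intrinsic image decomposition. Because of a lack of domain-specific data and coarse evaluation metrics, existing models fail to produce results that are satisfactory for graphics applications. In this paper, we focus on intrinsic image decomposition for clothing images and make comprehensive improvements. We collect CloIntrinsics, a clothing intrinsic image dataset that includes a synthetic training set and a real-world test set. A more interpretable edge-aware metric and an annotation scheme are designed for the test set, which allows diagnostic evaluation of intrinsic models. Finally, we propose the ClothInNet model with carefully designed loss terms and an adversarial module. It uses easy-to-acquire labels to learn from real-world shading and significantly improves performance with only minor additional annotation effort. We show that our proposed model significantly reduces texture-copying artifacts while retaining surprisingly fine details, outperforming existing state-of-the-art methods.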


Reference Twice: A Simple and Unified Baseline for Few-Shot Instance Segmentation

Yue Han, Jiangning Zhang, Zhucun Xue, Chao Xu, Xintian Shen, Yabiao Wang, Chengjie Wang, Yong Liu, Xiangtai Li

Categories: Computer Vision

2023-01-03

Few Shot Instance Segmentation (FSIS) requires models to detect and segment novel classes from only a few support examples. In this work, we explore a simple yet unified solution for FSIS as well as its incremental variants, and introduce a new framework named Reference Twice (RefT) to fully explore the relationship between support and query features based on a Transformer-like framework. Our key insights are twofold. First, with the aid of support masks, we can generate dynamic class centers more appropriately to re-weight query features. Second, we find that support object queries have already encoded key factors after base training. In this way, the query features can be enhanced twice, at the feature level and at the instance level. In particular, we first design a mask-based dynamic weighting module to enhance support features and then propose to link object queries for better calibration via cross-attention. After these steps, performance on novel classes improves significantly over our strong baseline. Additionally, our new framework can be easily extended to incremental FSIS with minor modification. When benchmarking on the COCO dataset for the FSIS, gFSIS, and iFSIS settings, our method achieves competitive performance compared to existing approaches across different shots, e.g., we boost nAP by +8.2/+9.4 over the current state-of-the-art FSIS method for 10/30-shot. We further demonstrate the superiority of our approach on Few Shot Object Detection. Code and model will be available.


AI in HCI Design and User Experience

Wei Xu

Categories: Artificial Intelligence

2023-01-03

In this chapter, we review and discuss the transformation of AI technology in HCI/UX work and assess how AI technology will change how we do the work. We first discuss how AI can be used to enhance the result of user research and design evaluation. We then discuss how AI technology can be used to enhance HCI/UX design. Finally, we discuss how AI-enabled capabilities can improve UX when users interact with computing systems, applications, and services.
