Smart Paper Notes

PolyU-BPCoMa: A Dataset and Benchmark Towards Mobile Colorized Mapping Using a Backpack Multisensorial System

Wenzhong Shi , Pengxin Chen , Muyang Wang , Sheng Bao , Haodong Xiang , Yue Yu , Daping Yang

Category: Computer Vision

2022-06-15

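Constructing colorized point clouds from mobile laser scanning and images is a fundamental task in surveying and mapping. It is also an essential prerequisite for building digital twins of smart cities. However, existing public datasets are either relatively small in scale or lack accurate geometric and color ground truth. This paper documents a multisensorial dataset named PolyU-BPCoMa, which is distinctively positioned towards mobile colorized mapping. The dataset incorporates 3D LiDAR, spherical imaging, GNSS, and IMU resources on a backpack platform. Color checker boards are pasted in each surveyed area as targets, and ground truth data are collected by an advanced terrestrial laser scanner (TLS). 3D geometric and color information can be recovered from the colorized point clouds produced by the backpack system and the TLS, respectively. Accordingly, we provide an opportunity to benchmark mapping and colorization accuracy simultaneously for a mobile multisensorial system. The dataset is approximately 800 GB in size and covers both indoor and outdoor environments. The dataset and development kits are available at https://github.com/chenpengxin/polyu-bpcoma.git.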

translated by Google Translate

Deep Fair Clustering via Maximizing and Minimizing Mutual Information

Pengxin Zeng , Yunfan Li , Peng Hu , Dezhong Peng , Jiancheng Lv , Xi Peng

Category: Machine Learning

2022-09-26

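Fair clustering aims to divide data into distinct clusters while preventing sensitive attributes (e.g., gender, race, RNA sequencing technique) from dominating the clustering. Although many works have been conducted and have achieved great success recently, most of them are heuristic, and a unified theory for algorithm design is lacking. In this work, we fill this gap by developing a mutual information theory for deep fair clustering and accordingly designing a novel algorithm, dubbed FCMI. In brief, by maximizing and minimizing mutual information, FCMI is designed to achieve four characteristics highly expected of deep fair clustering: compact, balanced, and fair clusters, as well as informative features. Besides the contributions to theory and algorithm, another contribution of this work is a novel fair clustering metric that is also built upon information theory. Unlike existing evaluation metrics, our metric measures clustering quality and fairness as a whole rather than separately. To verify the effectiveness of the proposed FCMI, we conduct experiments on six benchmarks, including a single-cell RNA-seq atlas, and compare against 11 state-of-the-art methods in terms of five metrics. The code will be released upon acceptance.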
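
The quantity at the center of this abstract, mutual information between cluster assignments and a sensitive attribute, can be estimated from label counts. The sketch below is not FCMI itself; it is a minimal plug-in estimator with a hypothetical function name and toy labels, meant only to show why low mutual information corresponds to fair clusters.

```python
import math
from collections import Counter

def mutual_information(clusters, groups):
    """Empirical mutual information I(C; G) in nats between two label lists."""
    n = len(clusters)
    joint = Counter(zip(clusters, groups))   # counts of (cluster, group) pairs
    cluster_counts = Counter(clusters)
    group_counts = Counter(groups)
    mi = 0.0
    for (c, g), count in joint.items():
        p_cg = count / n
        # p(c, g) / (p(c) * p(g)) written in terms of raw counts
        ratio = count * n / (cluster_counts[c] * group_counts[g])
        mi += p_cg * math.log(ratio)
    return mi

# Toy labels (illustrative only): each cluster mixes both groups evenly.
fair = mutual_information([0, 0, 1, 1], ["a", "b", "a", "b"])
# Toy labels: clusters coincide with the sensitive groups.
unfair = mutual_information([0, 0, 1, 1], ["a", "a", "b", "b"])
print(fair, unfair)
```

In the first toy case the estimate is zero because the cluster label carries no information about the group; in the second it reaches ln 2, the maximum for two balanced groups.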

MMMNA-Net for Overall Survival Time Prediction of Brain Tumor Patients

Wen Tang , Haoyue Zhang , Pengxin Yu , Han Kang , Rongguo Zhang

Category: Computer Vision

2022-06-13

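Overall survival (OS) time is one of the most important evaluation indices for glioma cases. Multimodal magnetic resonance imaging (MRI) scans play an important role in studying OS time for glioma prognosis. Several deep learning-based methods have been proposed for OS time prediction from multimodal MRI. However, these methods usually fuse multimodal information at the beginning or at the end of the network and lack fusion of features from different scales. In addition, fusion at the end of the network always combines global with global features (e.g., a fully connected layer after concatenating global average pooling outputs) or local with local features (e.g., bilinear pooling), which loses local-with-global information. In this paper, we propose a novel method for multimodal OS time prediction of brain tumor patients, which contains an improved non-local feature fusion module introduced at different scales. Our method obtains a relative 8.76% improvement over the current state-of-the-art method (0.6989 vs. 0.6426 in accuracy). Extensive testing demonstrates that our method can adapt to situations with missing modalities. The code is available at https://github.com/tangwen920812/mmmna-net.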
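
The 8.76% figure is a relative gain over the baseline accuracy, not a difference in percentage points. A minimal check of that arithmetic, using only the two accuracies quoted in the abstract (the helper name is hypothetical):

```python
def relative_improvement(new, baseline):
    """Gain of `new` over `baseline`, expressed as a fraction of the baseline."""
    return (new - baseline) / baseline

gain = relative_improvement(0.6989, 0.6426)
points = (0.6989 - 0.6426) * 100  # absolute gap in percentage points

print(f"relative: {gain:.2%}")          # relative: 8.76%
print(f"absolute: {points:.2f} points")  # absolute: 5.63 points
```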

RPLHR-CT Dataset and Transformer Baseline for Volumetric Super-Resolution from CT Scans

Pengxin Yu , Haoyue Zhang , Han Kang , Wen Tang , Corey W. Arnold , Rongguo Zhang

Category: Computer Vision | Machine Learning

2022-06-13

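In clinical practice, anisotropic volumetric medical images with low through-plane resolution are commonly used because of their short acquisition time and lower storage cost. Nevertheless, the coarse resolution may make diagnosis difficult for physicians and for computer-aided diagnosis algorithms alike. Deep learning-based volumetric super-resolution (SR) methods, with convolutional neural networks (CNNs) at their core, are a feasible way to improve resolution. Despite recent progress, these methods are limited by inherent properties of convolution operators, which ignore content relevance and cannot effectively model long-range dependencies. In addition, most existing methods use pseudo-paired volumes for training and evaluation, where the pseudo low-resolution (LR) volumes are generated by simple degradation of their high-resolution (HR) counterparts. However, the domain gap between pseudo-LR and real LR volumes leads to poor performance of these methods in practice. In this paper, we build the first public real-paired dataset, RPLHR-CT, as a benchmark for volumetric SR, and provide baseline results by re-implementing four state-of-the-art CNN-based methods. Considering the inherent shortcomings of CNNs, we also propose the Transformer Volumetric Super-Resolution Network (TVSRN), which is based on attention mechanisms and dispenses with convolutions entirely. This is the first study to use a pure transformer for CT volumetric SR. Experimental results show that TVSRN significantly outperforms all baselines on both PSNR and SSIM. Moreover, TVSRN achieves a better trade-off between image quality, number of parameters, and running time. Data and code are available at https://github.com/smilenaxx/rplhr-ct.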

Transformer Lesion Tracker

Wen Tang , Han Kang , Haoyue Zhang , Pengxin Yu , Corey W. Arnold , Rongguo Zhang

Category: Computer Vision

2022-06-13

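Evaluating lesion progression and treatment response via longitudinal lesion tracking plays a critical role in clinical practice. Automated approaches to this task are motivated by the prohibitive labor cost and time consumption of manual lesion matching. Previous methods typically lack integration of local and global information. In this work, we propose a transformer-based approach, termed Transformer Lesion Tracker (TLT). Specifically, we design a Cross Attention-based Transformer (CAT) to capture and combine global and local information to enhance feature extraction. We also develop a Registration-based Anatomical Attention Module (RAAM) to introduce anatomical information to CAT so that it can focus on useful feature knowledge. A Sparse Selection Strategy (SSS) is presented for selecting features and reducing the memory footprint of transformer training. In addition, we use global regression to further improve model performance. We conduct experiments on a public dataset to show the superiority of our method and find that our model improves the average Euclidean center error by at least 14.3% (6 mm vs. 7 mm) compared with the state of the art (SOTA). Code is available at https://github.com/tangwen920812/tlt.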

Safe Multi-Task Learning

Pengxin Guo , Feiyang Ye , Yu Zhang

Category: Machine Learning

2021-11-20

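In recent years, Multi-Task Learning (MTL) has attracted much attention due to its good performance in many applications. However, many existing MTL models cannot guarantee that their performance is no worse than that of their single-task counterparts on each task. Although this phenomenon has been observed empirically in some works, little work aims to handle the resulting problem, which is formally defined as negative sharing in this paper. To achieve safe multi-task learning, in which no negative sharing occurs, we propose a Safe Multi-Task Learning (SMTL) model, which consists of a public encoder shared by all tasks, private encoders, gates, and private decoders. Specifically, each task has a private encoder, a gate, and a private decoder, where the gate learns how to combine the private encoder and the public encoder for the downstream private decoder. To reduce the storage cost during the inference stage, a lite version of SMTL is proposed that allows the gate to choose either the public encoder or the corresponding private encoder. Moreover, we propose a variant of SMTL that places all the gates after the decoders of all tasks. Experiments on several benchmark datasets demonstrate the effectiveness of the proposed methods.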
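
The gating idea described above can be illustrated with a small sketch. The abstract does not specify the functional form of the gate, so the convex combination, the 0.5 threshold for the lite variant, and all function names below are assumptions made for illustration, not the SMTL implementation.

```python
def gated_combine(private_feat, public_feat, gate):
    """Blend per-task features (assumed form: convex combination).

    gate = 1.0 uses only the private encoder output,
    gate = 0.0 uses only the public (shared) encoder output.
    """
    if not 0.0 <= gate <= 1.0:
        raise ValueError("gate must lie in [0, 1]")
    return [gate * p + (1.0 - gate) * s
            for p, s in zip(private_feat, public_feat)]

def lite_select(private_feat, public_feat, gate, threshold=0.5):
    """Lite-style hard choice (assumed threshold): keep exactly one encoder's
    output, so the unused encoder would not need to be stored at inference."""
    return list(private_feat) if gate >= threshold else list(public_feat)

# Toy feature vectors standing in for encoder outputs.
private = [1.0, 2.0]
public = [3.0, 4.0]

blended = gated_combine(private, public, gate=0.25)
chosen = lite_select(private, public, gate=0.25)
print(blended, chosen)
```

With a gate of 1.0 the task falls back to its private encoder alone, which is one way to read the "no worse than single-task" goal stated in the abstract.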

Generative appearance replay for continual unsupervised domain adaptation

Boqi Chen , Kevin Thandiackal , Pushpak Pati , Orcun Goksel

Category: Computer Vision | Artificial Intelligence

2023-01-03

Deep learning models can achieve high accuracy when trained on large amounts of labeled data. However, real-world scenarios often involve several challenges: Training data may become available in installments, may originate from multiple different domains, and may not contain labels for training. Certain settings, for instance medical applications, often involve further restrictions that prohibit retention of previously seen data due to privacy regulations. In this work, to address such challenges, we study unsupervised segmentation in continual learning scenarios that involve domain shift. To that end, we introduce GarDA (Generative Appearance Replay for continual Domain Adaptation), a generative-replay based approach that can adapt a segmentation model sequentially to new domains with unlabeled data. In contrast to single-step unsupervised domain adaptation (UDA), continual adaptation to a sequence of domains enables leveraging and consolidation of information from multiple domains. Unlike previous approaches in incremental UDA, our method does not require access to previously seen data, making it applicable in many practical scenarios. We evaluate GarDA on two datasets with different organs and modalities, where it substantially outperforms existing techniques.

MGTAB: A Multi-Relational Graph-Based Twitter Account Detection Benchmark

Shuhao Shi , Kai Qiao , Jian Chen , Shuai Yang , Jie Yang , Baojie Song , Linyuan Wang , Bin Yan

Category: Computer Vision

2023-01-03

The development of social media user stance detection and bot detection methods relies heavily on large-scale, high-quality benchmarks. However, in addition to low annotation quality, existing benchmarks generally have incomplete user relationships, which hampers graph-based account detection research. To address these issues, we propose a Multi-Relational Graph-Based Twitter Account Detection Benchmark (MGTAB), the first standardized graph-based benchmark for account detection. To our knowledge, MGTAB was built on the largest original data in the field, with over 1.55 million users and 130 million tweets. MGTAB contains 10,199 expert-annotated users and 7 types of relationships, ensuring high-quality annotation and diversified relations. In MGTAB, we extracted the 20 user property features with the greatest information gain, together with user tweet features, as the user features. In addition, we performed a thorough evaluation of MGTAB and other public datasets. Our experiments found that graph-based approaches are generally more effective than feature-based approaches and perform better when multiple relations are introduced. By analyzing the experimental results, we identify effective approaches for account detection and provide potential future research directions in this field. Our benchmark and standardized evaluation procedures are freely available at: https://github.com/GraphDetec/MGTAB.

Explaining Imitation Learning through Frames

Boyuan Zheng , Jianlong Zhou , Chunjie Liu , Yiqiao Li , Fang Chen

Category: Machine Learning | Computer Vision

2023-01-03

As one of the prevalent methods for building automation systems, Imitation Learning (IL) shows promising performance in a wide range of domains. However, despite the considerable improvement in policy performance, research on the explainability of IL models is still limited. Inspired by recent approaches in explainable artificial intelligence, we propose a model-agnostic explanation framework for IL models called R2RISE. R2RISE aims to explain the overall policy performance with respect to the frames in demonstrations. It iteratively retrains the black-box IL model on randomly masked demonstrations and uses the conventional evaluation outcome, the environment return, as the coefficient to build an importance map. We also conduct experiments to investigate three major questions concerning the equality of frame importance, the effectiveness of the importance map, and the connections between importance maps from different IL models. The results show that R2RISE successfully distinguishes important frames in the demonstrations.
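
The mask-retrain-score loop described above can be sketched as follows. This is not the R2RISE code: the normalization by how often each frame was kept, the parameter defaults, and the toy `toy_train_and_evaluate` stand-in for retraining an IL model and measuring its environment return are all assumptions chosen to keep the example self-contained.

```python
import random

def frame_importance(num_frames, train_and_evaluate,
                     num_masks=200, keep_prob=0.5, seed=0):
    """Return-weighted average of random binary masks over demonstration frames."""
    rng = random.Random(seed)
    weighted = [0.0] * num_frames  # sum of returns over masks that kept frame i
    kept = [0] * num_frames        # number of masks that kept frame i
    for _ in range(num_masks):
        mask = [1 if rng.random() < keep_prob else 0 for _ in range(num_frames)]
        # In the real setting: retrain the IL model on the masked
        # demonstrations, then roll it out to obtain the environment return.
        score = train_and_evaluate(mask)
        for i, m in enumerate(mask):
            weighted[i] += score * m
            kept[i] += m
    # Assumed normalization: average return among masks that kept the frame.
    return [w / k if k else 0.0 for w, k in zip(weighted, kept)]

def toy_train_and_evaluate(mask):
    """Hypothetical stand-in: the policy only succeeds if frame 2 was kept."""
    return 1.0 if mask[2] else 0.0

importance = frame_importance(5, toy_train_and_evaluate)
print(importance)
```

In this toy setup frame 2 receives the highest score, since every mask that keeps it yields the full return, while the other frames score near the base rate.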

Saliency-Aware Spatio-Temporal Artifact Detection for Compressed Video Quality Assessment

Liqun Lin , Yang Zheng , Weiling Chen , Chengdong Lan , Tiesong Zhao

Category: Computer Vision

2023-01-03

Compressed videos often exhibit visually annoying artifacts, known as Perceivable Encoding Artifacts (PEAs), which dramatically degrade video visual quality. Subjective and objective measures capable of identifying and quantifying various types of PEAs are critical in improving visual quality. In this paper, we investigate the influence of four spatial PEAs (i.e. blurring, blocking, bleeding, and ringing) and two temporal PEAs (i.e. flickering and floating) on video quality. For spatial artifacts, we propose a visual saliency model with a low computational cost and higher consistency with human visual perception. In terms of temporal artifacts, self-attention based TimeSFormer is improved to detect temporal artifacts. Based on the six types of PEAs, a quality metric called Saliency-Aware Spatio-Temporal Artifacts Measurement (SSTAM) is proposed. Experimental results demonstrate that the proposed method outperforms state-of-the-art metrics. We believe that SSTAM will be beneficial for optimizing video coding techniques.
