Smart Paper Notes

Shuffle Instances-based Vision Transformer for Pancreatic Cancer ROSE Image Classification

Tianyi Zhang , Youdan Feng , Yunlu Feng , Yu Zhao , Yanli Lei , Nan Ying , Zhiling Yan , Yufang He , Guanglei Zhang

Category: Computer Vision

2022-08-14

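The rapid on-site evaluation (ROSE) technique can significantly accelerate the diagnosis of pancreatic cancer by promptly analyzing fast-stained cytopathological images. Computer-aided diagnosis (CAD) can potentially address the shortage of pathologists in ROSE. However, cancerous patterns vary widely between samples, which makes the CAD task extremely challenging. In addition, ROSE images exhibit complex perturbations in color distribution, brightness, and contrast because of differing staining quality and the variety of acquisition devices. To address these challenges, we propose a shuffle instances-based Vision Transformer (SI-ViT) approach that reduces these perturbations and enhances the modeling among instances. Using regrouped bags of shuffled instances and their bag-level soft labels, the approach employs a regression head to focus the model on the cells rather than on the various perturbations. At the same time, combined with a classification head, the model can effectively identify the general distribution patterns across different instances. The results show a significant improvement in classification accuracy together with more accurate attention regions, indicating that the diverse patterns of ROSE images are effectively extracted and the complex perturbations are greatly reduced. This also suggests that SI-ViT has great potential for analyzing cytopathological images. The code and experimental results are available at https://github.com/sagizty/mil-si.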
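The regrouping step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes that every instance (for example, an image patch) inherits the 0/1 label of its source bag, that instances are exchanged between bags position by position, and that the bag-level soft label is the fraction of instances that came from a positive bag. The function name `shuffle_instances` is hypothetical.

```python
import random


def shuffle_instances(bags, labels, rng):
    """Regroup instances across bags, position by position.

    bags:   list of bags, each a list of instances (e.g. image patches)
    labels: one 0/1 label per bag, inherited by all of its instances
    rng:    a random.Random instance, so the shuffle is reproducible

    Returns the regrouped bags and one soft label per regrouped bag,
    defined here (an assumption) as the fraction of its instances that
    came from a positive bag.
    """
    n_bags, n_inst = len(bags), len(bags[0])
    new_bags = [[None] * n_inst for _ in range(n_bags)]
    positives = [0] * n_bags
    for pos in range(n_inst):
        order = list(range(n_bags))
        rng.shuffle(order)  # order[target] = source bag for this position
        for target, source in enumerate(order):
            new_bags[target][pos] = bags[source][pos]
            positives[target] += labels[source]
    soft_labels = [p / n_inst for p in positives]
    return new_bags, soft_labels


# Usage: two bags of four instances each, the first one positive.
bags = [["a0", "a1", "a2", "a3"], ["b0", "b1", "b2", "b3"]]
new_bags, soft_labels = shuffle_instances(bags, [1, 0], random.Random(0))
```

A regression head would then be trained against `soft_labels`, which depend only on where each instance came from and not on its color or brightness.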

Translated by Google Translate

MSHT: Multi-stage Hybrid Transformer for the ROSE Image Analysis of Pancreatic Cancer

Tianyi Zhang , Yunlu Feng , Yu Zhao , Guangda Fan , Aiming Yang , Shangqin Lyu , Peng Zhang , Fan Song , Chenbin Ma , Yangyang Sun

Category: Computer Vision | Machine Learning

2021-12-27

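Pancreatic cancer is one of the most malignant cancers in the world; it deteriorates rapidly and has a very high mortality rate. The rapid on-site evaluation (ROSE) technique innovates the workflow by having on-site pathologists immediately analyze fast-stained cytopathological images, which enables faster diagnosis in this time-pressured process. However, the wider adoption of ROSE diagnosis has been hindered by the shortage of experienced pathologists. To overcome this problem, we propose a hybrid high-performance deep learning model that enables an automated workflow and thereby frees up pathologists' valuable time. By introducing the Transformer block into this field with our specific multi-stage hybrid design, the spatial features generated by a convolutional neural network (CNN) significantly enhance the Transformer's global modeling. Using the multi-stage spatial features as global attention guidance, this design combines the robustness of the CNN's inductive bias with the Transformer's sophisticated global modeling capability. A dataset of 4240 ROSE images was collected to evaluate the method in this unexplored field. The proposed multi-stage hybrid Transformer (MSHT) achieves 95.68% classification accuracy, distinctly higher than state-of-the-art models. To meet the need for interpretability, MSHT outperforms its counterparts with more accurate attention regions. The results show that MSHT can distinguish cancer samples accurately at an unprecedented image scale, laying the foundation for deploying automatic decision systems and expanding ROSE in clinical practice. The code and records are available at: https://github.com/sagizty/multi-stage-ybrid-transformer.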


Specificity-Preserving Federated Learning for MR Image Reconstruction

Chun-Mei Feng , Yunlu Yan , Huazhu Fu , Yong Xu , Ling Shao

Category: Computer Vision

2021-12-09

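Federated learning (FL) can be used to improve data privacy and efficiency in magnetic resonance (MR) image reconstruction by enabling multiple institutions to collaborate without aggregating local data. However, the domain shift caused by different MR imaging protocols can substantially degrade the performance of FL models. Recent FL techniques tend to address this by enhancing the generalization of the global model, but they ignore the domain-specific features, which may contain important information about device properties and be useful for local reconstruction. In this paper, we propose a specificity-preserving FL algorithm for MR image reconstruction (FedMRI). The core idea is to divide the MR reconstruction model into two parts: a globally shared encoder that obtains a generalized representation at the global level, and a client-specific decoder that preserves the domain-specific properties of each client, which is important for collaborative reconstruction when clients have unique distributions. Furthermore, to improve the convergence of the globally shared encoder in the presence of domain shift, a weighted contrastive regularization is introduced to directly correct any deviation between the client and the server during optimization. Extensive experiments show that FedMRI's reconstruction results are the closest to the ground truth for multi-institutional data and that it outperforms state-of-the-art FL methods.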
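The split between a shared encoder and client-specific decoders can be sketched as one aggregation round. This is an illustration under stated assumptions, not the FedMRI code: each client model is represented as a plain dictionary with hypothetical keys `"encoder"` and `"decoder"` holding flat lists of weights, the server uses an unweighted mean, and the weighted contrastive regularization is omitted.

```python
def aggregate_shared_encoder(client_models):
    """One server round: average encoders, keep decoders local.

    client_models: list of dicts with keys "encoder" and "decoder",
    each mapping to a flat list of floats (a stand-in for real weights).
    Returns new client models that all share the averaged encoder while
    each keeps its own decoder unchanged.
    """
    n_clients = len(client_models)
    dim = len(client_models[0]["encoder"])
    # Unweighted mean of the encoder weights (an assumption; a real
    # system might weight clients by their number of samples).
    global_encoder = [
        sum(m["encoder"][i] for m in client_models) / n_clients
        for i in range(dim)
    ]
    return [
        {"encoder": list(global_encoder), "decoder": list(m["decoder"])}
        for m in client_models
    ]


# Usage: two clients with different encoders and decoders.
clients = [
    {"encoder": [1.0, 2.0], "decoder": [10.0]},
    {"encoder": [3.0, 4.0], "decoder": [20.0]},
]
updated = aggregate_shared_encoder(clients)
```

Only the encoder weights ever leave a client in this sketch, so whatever the decoder has learned about a site's protocol stays at that site.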


Exploring Separable Attention for Multi-Contrast MR Image Super-Resolution

Chun-Mei Feng , Yunlu Yan , Kai Yu , Yong Xu , Ling Shao , Huazhu Fu

Category: Computer Vision

2021-09-03

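Super-resolving a magnetic resonance (MR) image of a target contrast under the guidance of the corresponding auxiliary contrast, which provides additional anatomical information, is a new solution for fast MR imaging. However, current multi-contrast super-resolution (SR) methods tend to concatenate the different contrasts directly, ignoring their relationships in different clues, for example in the high-intensity and low-intensity regions. In this study, we propose a separable attention network (comprising high-intensity priority attention and low-intensity separation attention), named SANet. With the help of the auxiliary contrast, SANet can explore the high-intensity and low-intensity regions in the "forward" and "reverse" directions while learning clearer anatomical structure and edge information for the SR of the target-contrast MR image. SANet provides three appealing benefits: (1) It is the first model to explore a separable attention mechanism that uses the auxiliary contrast to predict the high-intensity and low-intensity regions, diverting more attention to refining any uncertain details between these regions and correcting the fine areas in the reconstructed results. (2) A multi-stage integration module is proposed to learn the response of multi-contrast fusion at multiple stages, capture the dependency between the fused representations, and improve their representation ability. (3) Extensive experiments against various state-of-the-art multi-contrast SR methods on fastMRI and clinical in vivo datasets demonstrate the superiority of our model.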


Task Transformer Network for Joint MRI Reconstruction and Super-Resolution

Chun-Mei Feng , Yunlu Yan , Huazhu Fu , Li Chen , Yong Xu

Category: Computer Vision

2021-06-12

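The core problem of magnetic resonance imaging (MRI) is the trade-off between acceleration and image quality. Image reconstruction and super-resolution are two crucial techniques in MRI. Current methods are designed to perform these tasks separately, ignoring the correlations between them. In this work, we propose an end-to-end task transformer network (T$^2$Net) for joint MRI reconstruction and super-resolution, which allows representations and feature transmission to be shared between the tasks to achieve higher-quality, super-resolved, and motion-artifact-free images from highly undersampled and degenerated MRI data. Our framework combines reconstruction and super-resolution, divided into two sub-branches, whose features are expressed as queries and keys. Specifically, we encourage joint feature learning between the two tasks, thereby transferring accurate task information. We first use two separate CNN branches to extract task-specific features. Then, a task transformer module is designed to embed and synthesize the relevance between the two tasks. Experimental results show that our multi-task model significantly outperforms advanced sequential methods, both quantitatively and qualitatively.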


Effective and Efficient Training for Sequential Recommendation Using Cumulative Cross-Entropy Loss

Fangyu Li , Shenbao Yu , Feng Zeng , Fang Yang

Category: Machine Learning

2023-01-03

Research interest in sequential recommender systems is growing, with the aim of modeling dynamic sequence representations precisely. However, the loss functions most commonly used in state-of-the-art sequential recommendation models have essential limitations. To name a few, Bayesian Personalized Ranking (BPR) loss suffers from the vanishing gradient problem caused by extensive negative sampling and prediction biases; Binary Cross-Entropy (BCE) loss depends on the number of negative samples, so it is likely to ignore valuable negative examples and reduce training efficiency; Cross-Entropy (CE) loss focuses only on the last timestamp of the training sequence, which leads to low utilization of sequence information and an inferior user sequence representation. To avoid these limitations, in this paper we propose to calculate a Cumulative Cross-Entropy (CCE) loss over the sequence. CCE is simple and direct, and it offers painless deployment, no negative sampling, and effective and efficient training. We conduct extensive experiments on five benchmark datasets to demonstrate the effectiveness and efficiency of CCE. The results show that employing CCE loss on three state-of-the-art models, GRU4Rec, SASRec, and S3-Rec, yields average improvements of 125.63%, 69.90%, and 33.24% in full-ranking NDCG@5, respectively. With CCE, the performance curve of the models on the test data rises rapidly with wall-clock time and is superior to that of other loss functions throughout almost the whole training process.
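The contrast between CE at the last timestamp and a cumulative loss over the sequence can be made concrete with a short sketch. It rests on assumptions the abstract does not spell out: that "cumulative" means summing (not averaging) the softmax cross-entropy of the next-item prediction at every position, and that the softmax runs over the full item set. The function names are hypothetical.

```python
import math


def softmax_cross_entropy(logits, target):
    """Cross-entropy of one softmax prediction over the full item set."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[target]


def last_step_ce(step_logits, targets):
    """CE computed only at the final position of the sequence."""
    return softmax_cross_entropy(step_logits[-1], targets[-1])


def cumulative_ce(step_logits, targets):
    """Sum of CE over every position, so each step adds training signal."""
    return sum(
        softmax_cross_entropy(logits, target)
        for logits, target in zip(step_logits, targets)
    )


# Usage: a sequence of three steps over a catalog of four items.
step_logits = [[2.0, 0.5, 0.1, 0.0], [0.0, 1.5, 0.2, 0.1], [0.3, 0.0, 2.2, 0.4]]
targets = [0, 1, 2]
loss_last = last_step_ce(step_logits, targets)
loss_cumulative = cumulative_ce(step_logits, targets)
```

Because every position contributes a term, no negative sampling is needed in this formulation and each training sequence yields as many supervised predictions as it has steps.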


Generalizable Black-Box Adversarial Attack with Meta Learning

Fei Yin , Yong Zhang , Baoyuan Wu , Yan Feng , Jingyi Zhang , Yanbo Fan , Yujiu Yang

Category: Machine Learning | Computer Vision

2023-01-01

In the scenario of black-box adversarial attack, the target model's parameters are unknown, and the attacker aims to find a successful adversarial perturbation based on query feedback under a query budget. Due to the limited feedback information, existing query-based black-box attack methods often require many queries to attack each benign example. To reduce query cost, we propose to utilize the feedback information across historical attacks, dubbed example-level adversarial transferability. Specifically, by treating the attack on each benign example as one task, we develop a meta-learning framework that trains a meta-generator to produce perturbations conditioned on benign examples. When attacking a new benign example, the meta-generator can be quickly fine-tuned based on the feedback information of the new task as well as a few historical attacks to produce effective perturbations. Moreover, since the meta-training procedure consumes many queries to learn a generalizable generator, we utilize model-level adversarial transferability to train the meta-generator on a white-box surrogate model, then transfer it to help the attack against the target model. The proposed framework with the two types of adversarial transferability can be naturally combined with any off-the-shelf query-based attack method to boost its performance, which is verified by extensive experiments.


Spatiotemporal implicit neural representation for unsupervised dynamic MRI reconstruction

Jie Feng , Ruimin Feng , Qing Wu , Zhiyong Zhang , Yuyao Zhang , Hongjiang Wei

Category: Computer Vision

2022-12-31

Supervised Deep-Learning (DL)-based reconstruction algorithms have shown state-of-the-art results for highly undersampled dynamic Magnetic Resonance Imaging (MRI) reconstruction. However, their need for large amounts of high-quality ground-truth data hinders their application because of the generalization problem. Recently, Implicit Neural Representation (INR) has emerged as a powerful DL-based tool for solving inverse problems by characterizing the attributes of a signal as a continuous function of the corresponding coordinates in an unsupervised manner. In this work, we propose an INR-based method to improve dynamic MRI reconstruction from highly undersampled k-space data, which takes only spatiotemporal coordinates as inputs. Specifically, the proposed INR represents the dynamic MRI images as an implicit function and encodes them into neural networks. The weights of the network are learned only from the sparsely acquired (k, t)-space data itself, without external training datasets or prior images. Benefiting from the strong implicit continuity regularization of INR together with explicit regularization for low-rankness and sparsity, our proposed method outperforms the compared scan-specific methods at various acceleration factors. For example, experiments on retrospective cardiac cine datasets show an improvement of 5.5 to 7.1 dB in PSNR for extremely high accelerations (up to 41.6-fold). The high quality and inner continuity of the images provided by INR have great potential to further improve the spatiotemporal resolution of dynamic MRI, without the need for any training data.


Memory Augmented Lookup Dictionary based Language Modeling for Automatic Speech Recognition

Yukun Feng , Ming Tu , Rui Xia , Chuanzeng Huang , Yuxuan Wang

Category: Natural Language Processing

2022-12-30

Recent studies have shown that using an external Language Model (LM) benefits end-to-end Automatic Speech Recognition (ASR). However, predicting tokens that appear less frequently in the training set is still quite challenging. Long-tail prediction problems have been widely studied in many applications, but have been addressed by only a few studies for ASR and LMs. In this paper, we propose a new memory-augmented, lookup-dictionary-based Transformer architecture for LMs. The newly introduced lookup dictionary incorporates rich contextual information from the training set, which is vital for correctly predicting long-tail tokens. In intensive experiments on Chinese and English data sets, our proposed method is shown to outperform the baseline Transformer LM by a large margin on both word/character error rate and tail-token error rate. This is achieved without any impact on decoding efficiency. Overall, we demonstrate the effectiveness of our proposed method in boosting ASR decoding performance, especially for long-tail tokens.


Learning Implicit Functions for Dense 3D Shape Correspondence of Generic Objects

Feng Liu , Xiaoming Liu

Category: Computer Vision

2022-12-29

The objective of this paper is to learn dense 3D shape correspondence for topology-varying generic objects in an unsupervised manner. Conventional implicit functions estimate the occupancy of a 3D point given a shape latent code. Instead, our novel implicit function produces a probabilistic embedding to represent each 3D point in a part embedding space. Assuming that corresponding points are similar in the embedding space, we implement dense correspondence through an inverse function that maps from the part embedding vector to the corresponding 3D point. Both functions are jointly learned with several effective and uncertainty-aware loss functions to realize our assumption, together with the encoder that generates the shape latent code. During inference, if a user selects an arbitrary point on the source shape, our algorithm can automatically generate a confidence score indicating whether there is a correspondence on the target shape, as well as the corresponding semantic point if there is one. Such a mechanism inherently benefits man-made objects with different part constitutions. The effectiveness of our approach is demonstrated through unsupervised 3D semantic correspondence and shape segmentation.
