腹部器官分割具有许多重要的临床应用,例如器官定量,手术计划和疾病诊断。但是,从CT扫描中手动注释器官是耗时且劳动密集型的。半监督的学习表明,通过从大量未标记的图像和有限的标签样本中学习来减轻这一挑战的潜力。在这项工作中,我们遵循自我训练策略,并使用CNN和Transformer使用混合体系结构(PHTRAN),以生成精确的伪标签。之后,我们将标签数据一起介绍给具有轻量级PHTRAN的两阶段分割框架,以提高模型的性能和概括能力,同时保持效率。 Flare2022验证集的实验表明,我们的方法可实现出色的分割性能以及快速和低资源模型的推断。平均DSC和HSD分别为0.8956和0.9316。在我们的开发环境下,平均推理时间为18.62 s,平均最大GPU存储器为1995.04 MB,GPU内存时间曲线下的面积和CPU利用时间曲线下的平均面积为23196.84和319.67。
translated by 谷歌翻译
变压器在计算机视觉中的成功吸引了医学成像社区越来越多的关注。特别是对于医学图像细分,已经介绍了许多基于卷积神经网络(CNN)和变压器的出色混合体系结构,并取得了令人印象深刻的性能。但是,将模块化变压器嵌入CNN中的大多数方法都难以发挥其全部潜力。在本文中,我们提出了一种新型的医学图像分割的混合体系结构,称为Phtrans,该架构可与主要构建基块中的变形金刚和CNN杂交,以产生来自全球和本地特征的层次结构表示,并适应性地汇总它们,旨在完全利用其优势以获得更好的优势。细分性能。具体而言,phtrans遵循U形编码器编码器设计,并在深层阶段引入平行的Hybird模块,其中卷积块和经过修改的3D SWIN变压器分别学习本地特征和全局依赖性,然后统一尺寸,统一尺寸输出以实现特征聚合。超出颅库和自动化心脏诊断挑战数据集以外的多ATLA标签的广泛实验结果证实了其有效性,始终超过了最先进的方法。该代码可在以下网址获得:https://github.com/lseventeen/phtrans。
translated by 谷歌翻译
Artificial intelligence methods including deep neural networks (DNN) can provide rapid molecular classification of tumors from routine histology with accuracy that matches or exceeds human pathologists. Discerning how neural networks make their predictions remains a significant challenge, but explainability tools help provide insights into what models have learned when corresponding histologic features are poorly defined. Here, we present a method for improving explainability of DNN models using synthetic histology generated by a conditional generative adversarial network (cGAN). We show that cGANs generate high-quality synthetic histology images that can be leveraged for explaining DNN models trained to classify molecularly-subtyped tumors, exposing histologic features associated with molecular state. Fine-tuning synthetic histology through class and layer blending illustrates nuanced morphologic differences between tumor subtypes. Finally, we demonstrate the use of synthetic histology for augmenting pathologist-in-training education, showing that these intuitive visualizations can reinforce and improve understanding of histologic manifestations of tumor biology.
translated by 谷歌翻译
我们在理论和实践之间的差距中介绍了我们的发现,即使用条件能量的模型(EBM)作为行为联系政策的隐式表示。我们还阐明了以前的工作中的几个微妙的,可能令人困惑的细节,以帮助未来的研究。我们指出无条件和有条件的EBM之间的关键差异,并警告说,盲目地将培训方法应用于另一个,可能会导致不良结果,这些结果并不能很好地概括。最后,我们强调了最大相互信息原理作为在条件EBM中获得良好概括作为回归任务模型的必要条件的重要性。
translated by 谷歌翻译
Dataset distillation has emerged as a prominent technique to improve data efficiency when training machine learning models. It encapsulates the knowledge from a large dataset into a smaller synthetic dataset. A model trained on this smaller distilled dataset can attain comparable performance to a model trained on the original training dataset. However, the existing dataset distillation techniques mainly aim at achieving the best trade-off between resource usage efficiency and model utility. The security risks stemming from them have not been explored. This study performs the first backdoor attack against the models trained on the data distilled by dataset distillation models in the image domain. Concretely, we inject triggers into the synthetic data during the distillation procedure rather than during the model training stage, where all previous attacks are performed. We propose two types of backdoor attacks, namely NAIVEATTACK and DOORPING. NAIVEATTACK simply adds triggers to the raw data at the initial distillation phase, while DOORPING iteratively updates the triggers during the entire distillation procedure. We conduct extensive evaluations on multiple datasets, architectures, and dataset distillation techniques. Empirical evaluation shows that NAIVEATTACK achieves decent attack success rate (ASR) scores in some cases, while DOORPING reaches higher ASR scores (close to 1.0) in all cases. Furthermore, we conduct a comprehensive ablation study to analyze the factors that may affect the attack performance. Finally, we evaluate multiple defense mechanisms against our backdoor attacks and show that our attacks can practically circumvent these defense mechanisms.
translated by 谷歌翻译
Blind image quality assessment (BIQA) remains challenging due to the diversity of distortion and image content variation, which complicate the distortion patterns crossing different scales and aggravate the difficulty of the regression problem for BIQA. However, existing BIQA methods often fail to consider multi-scale distortion patterns and image content, and little research has been done on learning strategies to make the regression model produce better performance. In this paper, we propose a simple yet effective Progressive Multi-Task Image Quality Assessment (PMT-IQA) model, which contains a multi-scale feature extraction module (MS) and a progressive multi-task learning module (PMT), to help the model learn complex distortion patterns and better optimize the regression issue to align with the law of human learning process from easy to hard. To verify the effectiveness of the proposed PMT-IQA model, we conduct experiments on four widely used public datasets, and the experimental results indicate that the performance of PMT-IQA is superior to the comparison approaches, and both MS and PMT modules improve the model's performance.
translated by 谷歌翻译
The development of social media user stance detection and bot detection methods rely heavily on large-scale and high-quality benchmarks. However, in addition to low annotation quality, existing benchmarks generally have incomplete user relationships, suppressing graph-based account detection research. To address these issues, we propose a Multi-Relational Graph-Based Twitter Account Detection Benchmark (MGTAB), the first standardized graph-based benchmark for account detection. To our knowledge, MGTAB was built based on the largest original data in the field, with over 1.55 million users and 130 million tweets. MGTAB contains 10,199 expert-annotated users and 7 types of relationships, ensuring high-quality annotation and diversified relations. In MGTAB, we extracted the 20 user property features with the greatest information gain and user tweet features as the user features. In addition, we performed a thorough evaluation of MGTAB and other public datasets. Our experiments found that graph-based approaches are generally more effective than feature-based approaches and perform better when introducing multiple relations. By analyzing experiment results, we identify effective approaches for account detection and provide potential future research directions in this field. Our benchmark and standardized evaluation procedures are freely available at: https://github.com/GraphDetec/MGTAB.
translated by 谷歌翻译
Given the increasingly intricate forms of partial differential equations (PDEs) in physics and related fields, computationally solving PDEs without analytic solutions inevitably suffers from the trade-off between accuracy and efficiency. Recent advances in neural operators, a kind of mesh-independent neural-network-based PDE solvers, have suggested the dawn of overcoming this challenge. In this emerging direction, Koopman neural operator (KNO) is a representative demonstration and outperforms other state-of-the-art alternatives in terms of accuracy and efficiency. Here we present KoopmanLab, a self-contained and user-friendly PyTorch module of the Koopman neural operator family for solving partial differential equations. Beyond the original version of KNO, we develop multiple new variants of KNO based on different neural network architectures to improve the general applicability of our module. These variants are validated by mesh-independent and long-term prediction experiments implemented on representative PDEs (e.g., the Navier-Stokes equation and the Bateman-Burgers equation) and ERA5 (i.e., one of the largest high-resolution data sets of global-scale climate fields). These demonstrations suggest the potential of KoopmanLab to be considered in diverse applications of partial differential equations.
translated by 谷歌翻译
A recent study has shown a phenomenon called neural collapse in that the within-class means of features and the classifier weight vectors converge to the vertices of a simplex equiangular tight frame at the terminal phase of training for classification. In this paper, we explore the corresponding structures of the last-layer feature centers and classifiers in semantic segmentation. Based on our empirical and theoretical analysis, we point out that semantic segmentation naturally brings contextual correlation and imbalanced distribution among classes, which breaks the equiangular and maximally separated structure of neural collapse for both feature centers and classifiers. However, such a symmetric structure is beneficial to discrimination for the minor classes. To preserve these advantages, we introduce a regularizer on feature centers to encourage the network to learn features closer to the appealing structure in imbalanced semantic segmentation. Experimental results show that our method can bring significant improvements on both 2D and 3D semantic segmentation benchmarks. Moreover, our method ranks 1st and sets a new record (+6.8% mIoU) on the ScanNet200 test leaderboard. Code will be available at https://github.com/dvlab-research/Imbalanced-Learning.
translated by 谷歌翻译
Benefiting from the intrinsic supervision information exploitation capability, contrastive learning has achieved promising performance in the field of deep graph clustering recently. However, we observe that two drawbacks of the positive and negative sample construction mechanisms limit the performance of existing algorithms from further improvement. 1) The quality of positive samples heavily depends on the carefully designed data augmentations, while inappropriate data augmentations would easily lead to the semantic drift and indiscriminative positive samples. 2) The constructed negative samples are not reliable for ignoring important clustering information. To solve these problems, we propose a Cluster-guided Contrastive deep Graph Clustering network (CCGC) by mining the intrinsic supervision information in the high-confidence clustering results. Specifically, instead of conducting complex node or edge perturbation, we construct two views of the graph by designing special Siamese encoders whose weights are not shared between the sibling sub-networks. Then, guided by the high-confidence clustering information, we carefully select and construct the positive samples from the same high-confidence cluster in two views. Moreover, to construct semantic meaningful negative sample pairs, we regard the centers of different high-confidence clusters as negative samples, thus improving the discriminative capability and reliability of the constructed sample pairs. Lastly, we design an objective function to pull close the samples from the same cluster while pushing away those from other clusters by maximizing and minimizing the cross-view cosine similarity between positive and negative samples. Extensive experimental results on six datasets demonstrate the effectiveness of CCGC compared with the existing state-of-the-art algorithms.
translated by 谷歌翻译