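We present Sauron, a filter pruning method that eliminates redundant feature maps by discarding the corresponding filters using automatically adjusted, layer-specific thresholds. Furthermore, Sauron minimizes a regularization term that, as we show with various metrics, promotes the formation of feature-map clusters. In contrast to most filter pruning methods, Sauron is single-phase, similar to typical neural network optimization, and requires fewer hyperparameters and design decisions. Additionally, unlike other cluster-based approaches, our method does not require pre-selecting the number of clusters, which is non-trivial to determine and varies across layers. We evaluated Sauron and three state-of-the-art filter pruning methods on three medical image segmentation tasks. This is an area where filter pruning has received little attention and where it can help build efficient models for medical-grade computers that cannot use cloud services due to privacy considerations. Sauron achieved models with higher performance and higher pruning rates than the competing pruning methods. Additionally, since Sauron removes filters during training, its optimization accelerated over time. Finally, we show that the feature maps of Sauron-pruned models were highly interpretable. The Sauron code is publicly available at https://github.com/jmlipman/sauronunet.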
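Filter pruning introduces structural sparsity by removing selected filters and is therefore particularly effective for reducing complexity. Previous works prune networks empirically, on the premise that filters with smaller norms contribute less to the final result. However, such criteria have been shown to be sensitive to the distribution of filters, and accuracy may be hard to recover because the capacity gap is fixed once the network is pruned. In this paper, we propose a novel filter pruning method called Asymptotic Soft Cluster Pruning (ASCP), which identifies network redundancy based on the similarity of filters. Each filter of the over-parameterized network is first distinguished by clustering and then reconstructed to manually introduce redundancy into it. Several clustering guidelines are proposed to better preserve feature extraction ability. After reconstruction, filters are allowed to be updated in order to eliminate the effect of mistaken selections. In addition, various decay strategies for the pruning rate are adopted to stabilize the pruning process and improve the final performance. By gradually generating more identical filters within each cluster, ASCP can remove them through a channel addition operation with almost no accuracy drop. Extensive experiments on the CIFAR-10 and ImageNet datasets show that our method achieves competitive results compared with many state-of-the-art algorithms.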
The success of CNNs in various applications is accompanied by a significant increase in the computation and parameter storage costs. Recent efforts toward reducing these overheads involve pruning and compressing the weights of various layers without hurting original accuracy. However, magnitude-based pruning of weights reduces a significant number of parameters from the fully connected layers and may not adequately reduce the computation costs in the convolutional layers due to irregular sparsity in the pruned networks. We present an acceleration method for CNNs, where we prune filters from CNNs that are identified as having a small effect on the output accuracy. By removing whole filters in the network together with their connecting feature maps, the computation costs are reduced significantly. In contrast to pruning weights, this approach does not result in sparse connectivity patterns. Hence, it does not need the support of sparse convolution libraries and can work with existing efficient BLAS libraries for dense matrix multiplications. We show that even simple filter pruning techniques can reduce inference costs for VGG-16 by up to 34% and ResNet-110 by up to 38% on CIFAR10 while regaining close to the original accuracy by retraining the networks.
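The filter-selection step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not name the importance measure, so the sum of absolute weights (L1 norm) is used here as an assumed proxy, and `l1_norms`, `filters_to_prune`, and `prune_ratio` are hypothetical names.

```python
def l1_norms(filters):
    """Sum of absolute weights for each filter (each filter is a flat list)."""
    return [sum(abs(w) for w in flat_filter) for flat_filter in filters]


def filters_to_prune(filters, prune_ratio):
    """Indices of the filters with the smallest L1 norm (assumed criterion)."""
    norms = l1_norms(filters)
    n_prune = int(len(filters) * prune_ratio)
    # Rank filter indices from smallest to largest norm.
    ranked = sorted(range(len(filters)), key=lambda i: norms[i])
    return sorted(ranked[:n_prune])


# Four toy filters with two weights each.
conv_filters = [[0.5, -0.5], [0.1, 0.0], [1.0, 2.0], [0.0, -0.05]]
pruned = filters_to_prune(conv_filters, prune_ratio=0.5)
print(pruned)  # prints [1, 3]
```

Removing a filter also removes its output feature map and the matching input channel of the next layer, which is why the resulting network stays dense.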
We propose a new formulation for pruning convolutional kernels in neural networks to enable efficient inference. We interleave greedy criteria-based pruning with fine-tuning by backpropagation, a computationally efficient procedure that maintains good generalization in the pruned network. We propose a new criterion based on Taylor expansion that approximates the change in the cost function induced by pruning network parameters. We focus on transfer learning, where large pretrained networks are adapted to specialized tasks. The proposed criterion demonstrates superior performance compared to other criteria, e.g. the norm of kernel weights or feature map activation, for pruning large CNNs after adaptation to fine-grained classification tasks (Birds-200 and Flowers-102), relying only on first-order gradient information. We also show that pruning can lead to more than 10× theoretical reduction in adapted 3D-convolutional filters with a small drop in accuracy in a recurrent gesture classifier. Finally, we show results for the large-scale ImageNet dataset to emphasize the flexibility of our approach.
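A first-order Taylor criterion of this kind can be sketched as below. This is a hedged illustration under stated assumptions: the score of a feature map is taken to be the absolute value of the mean product of its activations and the gradients of the cost with respect to them, and `taylor_scores` is a hypothetical helper, not code from the paper.

```python
def taylor_scores(activations, gradients):
    """Approximate |change in cost| if each feature map were removed.

    activations[c] and gradients[c] are flat lists for channel c
    (assumed layout); the score is |mean(activation * gradient)|.
    """
    scores = []
    for act, grad in zip(activations, gradients):
        mean_product = sum(a * g for a, g in zip(act, grad)) / len(act)
        scores.append(abs(mean_product))
    return scores


# Three toy channels with two spatial positions each.
acts = [[1.0, 2.0], [0.5, 0.5], [3.0, -1.0]]
grads = [[0.1, -0.1], [0.0, 0.0], [0.5, 0.5]]
scores = taylor_scores(acts, grads)
# One greedy pruning step removes the channel with the lowest score.
least_important = scores.index(min(scores))
```

In the interleaved scheme, one such greedy removal would be followed by a few fine-tuning updates before the scores are recomputed.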
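Deep learning models for medical imaging are often large and complex, requiring specialized hardware to train and evaluate. To address this issue, we propose the PocketNet paradigm, which reduces the size of deep learning models by throttling the growth of the number of channels in convolutional neural networks. We demonstrate that, for a range of segmentation and classification tasks, PocketNet architectures produce results comparable to those of conventional neural networks while reducing the number of parameters by multiple orders of magnitude, using up to 90% less GPU memory, and speeding up training by up to 40%, thereby allowing such models to be trained and deployed in resource-constrained settings.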
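Convolutional neural networks (CNNs) contain a certain amount of parameter redundancy; filter pruning aims to remove the redundant filters and makes it possible to apply CNNs on terminal devices. However, previous works pay more attention to designing evaluation criteria for filter importance and then prune the less important filters with a fixed pruning rate or a fixed number in order to reduce the redundancy of the network. They do not consider how many filters should be reserved for each layer. From this perspective, we propose a new filter pruning method based on searching the proper number of filters (SNF). SNF searches for the most reasonable number of reserved filters for each layer and then prunes filters with a specific criterion. It can tailor the most suitable network structure for different FLOPs budgets. Filter pruning with our method leads to state-of-the-art (SOTA) accuracy on CIFAR-10 and achieves competitive performance on ImageNet ILSVRC-2012. Based on the ResNet-56 network, SNF achieves a 0.14% increase in Top-1 accuracy at a 52.94% FLOPs reduction on CIFAR-10. Pruning ResNet-110 on CIFAR-10 also improves Top-1 accuracy by 0.03% while reducing FLOPs by 68.68%. For ImageNet, we set the pruning rate to 52.10% of FLOPs, and Top-1 accuracy drops by only 0.74%. The code is available at https://github.com/pk-l/snf.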
Deep convolutional neural networks (CNNs) have been widely used for medical image segmentation. In most studies, only the output layer is exploited to compute the final segmentation results, and the hidden representations of the deep learned features have not been well understood. In this paper, we propose a prototype segmentation (ProtoSeg) method to compute a binary segmentation map based on deep features. We measure the segmentation ability of the features by computing the Dice score between the feature segmentation map and the ground truth, which we call the segmentation ability score (SA score for short). The SA score quantifies the segmentation abilities of deep features in different layers and units, helping to understand deep neural networks for segmentation. In addition, our method can provide a mean SA score that gives a performance estimate of the output on test images without ground truth. Finally, we use the proposed ProtoSeg method to compute the segmentation map directly on input images to further understand the segmentation ability of each input image. Results are presented on segmenting tumors in brain MRI, lesions in skin images, COVID-related abnormality in CT images, prostate segmentation in abdominal MRI, and pancreatic mass segmentation in CT images. Our method can provide new insights for interpretable and explainable AI systems for medical image segmentation. Our code is available at https://github.com/shengfly/ProtoSeg.
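The Dice score underlying the SA score can be sketched as follows. This covers only the overlap measure; how ProtoSeg derives the binary feature segmentation map is not described in the abstract and is not reproduced here. The function name `dice` and the flat 0/1 mask layout are assumptions for illustration.

```python
def dice(pred_mask, true_mask):
    """Dice overlap between two binary masks given as flat lists of 0/1."""
    intersection = sum(p * t for p, t in zip(pred_mask, true_mask))
    total = sum(pred_mask) + sum(true_mask)
    # Convention assumed here: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


feature_map_mask = [1, 1, 0, 0]   # binarized map from one feature unit
ground_truth = [0, 1, 1, 0]
print(dice(feature_map_mask, ground_truth))  # prints 0.5
```

Computing this score for every unit in a layer gives a per-unit ranking of how well each feature already separates the target structure.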
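Network compression is crucial for making deep networks more efficient, faster, and generalizable to low-end hardware. Current network compression methods face two open problems: first, there is no theoretical framework for estimating the maximum compression rate; second, some layers may be over-pruned, resulting in a significant drop in network performance. To address these two problems, this study proposes a method based on gradient-matrix analysis to estimate the maximum network redundancy. Guided by this maximum rate, a novel and efficient hierarchical network pruning algorithm is developed to maximally condense the neural network structure without sacrificing network performance. Extensive experiments demonstrate the efficacy of the new method for pruning several advanced convolutional neural network (CNN) architectures. Compared with existing pruning methods, the proposed pruning algorithm achieves state-of-the-art performance. At the same or a similar compression ratio, the new method provides the highest network prediction accuracy among the compared methods.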
Previous works utilized "smaller-norm-less-important" criterion to prune filters with smaller norm values in a convolutional neural network. In this paper, we analyze this norm-based criterion and point out that its effectiveness depends on two requirements that are not always met: (1) the norm deviation of the filters should be large; (2) the minimum norm of the filters should be small. To solve this problem, we propose a novel filter pruning method, namely Filter Pruning via Geometric Median (FPGM), to compress the model regardless of those two requirements. Unlike previous methods, FPGM compresses CNN models by pruning filters with redundancy, rather than those with "relatively less" importance. When applied to two image classification benchmarks, our method validates its usefulness and strengths. Notably, on CIFAR-10, FPGM reduces more than 52% FLOPs on ResNet-110 with even 2.69% relative accuracy improvement. Moreover, on ILSVRC-2012, FPGM reduces more than 42% FLOPs on ResNet-101 without top-5 accuracy drop, which has advanced the state-of-the-art. Code is publicly available on GitHub: https://github.com/he-y/filter-pruning-geometric-median * Corresponding Author. Part of this work was done when Yi Yang was visiting Baidu Research during his Professional Experience Program.
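The idea of pruning redundant filters rather than small-norm ones can be sketched as below. This is a simplified illustration under an assumption: the filter whose summed Euclidean distance to all other filters in the layer is smallest is treated as the one closest to the geometric median and therefore the most replaceable. `distance_sums` and `most_redundant` are hypothetical names, not functions from the FPGM repository.

```python
import math


def distance_sums(filters):
    """For each filter, the sum of Euclidean distances to all filters."""
    return [sum(math.dist(f, g) for g in filters) for f in filters]


def most_redundant(filters, n_prune):
    """Indices of the n_prune filters nearest the layer's geometric median."""
    sums = distance_sums(filters)
    ranked = sorted(range(len(filters)), key=lambda i: sums[i])
    return sorted(ranked[:n_prune])


# Filter 2 sits between filters 0 and 1; filter 3 is an outlier.
layer_filters = [[0.0, 0.0], [1.0, 0.0], [0.5, 0.1], [10.0, 10.0]]
print(most_redundant(layer_filters, n_prune=1))  # prints [2]
```

Note that the outlier filter, which a norm-based rule would keep or drop purely on magnitude, is ranked as the least redundant here.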
In this paper, we introduce a new channel pruning method to accelerate very deep convolutional neural networks. Given a trained CNN model, we propose an iterative two-step algorithm to effectively prune each layer, using LASSO-regression-based channel selection and least-squares reconstruction. We further generalize this algorithm to multi-layer and multi-branch cases. Our method reduces the accumulated error and enhances compatibility with various architectures. Our pruned VGG-16 achieves state-of-the-art results with a 5× speed-up and only a 0.3% increase in error. More importantly, our method is able to accelerate modern networks such as ResNet and Xception, which suffer only 1.4% and 1.0% accuracy loss under 2× speed-up respectively, which is significant. Code has been made publicly available.
Convolutional Neural Networks (CNNs) with U-shaped architectures have dominated medical image segmentation, which is crucial for various clinical purposes. However, the inherent locality of convolution makes CNNs fail to fully exploit global context, essential for better recognition of some structures, e.g., brain lesions. Transformers have recently proven promising performance on vision tasks, including semantic segmentation, mainly due to their capability of modeling long-range dependencies. Nevertheless, the quadratic complexity of attention makes existing Transformer-based models use self-attention layers only after somehow reducing the image resolution, which limits the ability to capture global contexts present at higher resolutions. Therefore, this work introduces a family of models, dubbed Factorizer, which leverages the power of low-rank matrix factorization for constructing an end-to-end segmentation model. Specifically, we propose a linearly scalable approach to context modeling, formulating Nonnegative Matrix Factorization (NMF) as a differentiable layer integrated into a U-shaped architecture. The shifted window technique is also utilized in combination with NMF to effectively aggregate local information. Factorizers compete favorably with CNNs and Transformers in terms of accuracy, scalability, and interpretability, achieving state-of-the-art results on the BraTS dataset for brain tumor segmentation and ISLES'22 dataset for stroke lesion segmentation. Highly meaningful NMF components give an additional interpretability advantage to Factorizers over CNNs and Transformers. Moreover, our ablation studies reveal a distinctive feature of Factorizers that enables a significant speed-up in inference for a trained Factorizer without any extra steps and without sacrificing much accuracy. The code and models are publicly available at https://github.com/pashtari/factorizer.
The mainstream approach for filter pruning is usually either to force a hard-coded importance estimation upon a computation-heavy pretrained model to select "important" filters, or to impose a hyperparameter-sensitive sparse constraint on the loss objective to regularize the network training. In this paper, we present a novel filter pruning method, dubbed dynamic-coded filter fusion (DCFF), to derive compact CNNs in a computation-economical and regularization-free manner for efficient image classification. Each filter in our DCFF is firstly given an inter-similarity distribution with a temperature parameter as a filter proxy, on top of which, a fresh Kullback-Leibler divergence based dynamic-coded criterion is proposed to evaluate the filter importance. In contrast to simply keeping high-score filters in other methods, we propose the concept of filter fusion, i.e., the weighted averages using the assigned proxies, as our preserved filters. We obtain a one-hot inter-similarity distribution as the temperature parameter approaches infinity. Thus, the relative importance of each filter can vary along with the training of the compact CNN, leading to dynamically changeable fused filters without both the dependency on the pretrained model and the introduction of sparse constraints. Extensive experiments on classification benchmarks demonstrate the superiority of our DCFF over the compared counterparts. For example, our DCFF derives a compact VGGNet-16 with only 72.77M FLOPs and 1.06M parameters while reaching top-1 accuracy of 93.47% on CIFAR-10. A compact ResNet-50 is obtained with 63.8% FLOPs and 58.6% parameter reductions, retaining 75.60% top-1 accuracy on ILSVRC-2012. Our code, narrower models and training logs are available at https://github.com/lmbxmu/DCFF.
Neural network pruning offers a promising prospect to facilitate deploying deep neural networks on resource-limited devices. However, existing methods are still challenged by training inefficiency and the labor cost of pruning designs, due to missing theoretical guidance on non-salient network components. In this paper, we propose a novel filter pruning method by exploring the High Rank of feature maps (HRank). Our HRank is inspired by the discovery that the average rank of multiple feature maps generated by a single filter is always the same, regardless of the number of image batches CNNs receive. Based on HRank, we develop a method that is mathematically formulated to prune filters with low-rank feature maps. The principle behind our pruning is that low-rank feature maps contain less information, and thus pruned results can be easily reproduced. Besides, we experimentally show that weights with high-rank feature maps contain more important information, such that even when a portion is not updated, very little damage would be done to the model performance. Without introducing any additional constraints, HRank leads to significant improvements over the state of the art in terms of FLOPs and parameter reduction, with similar accuracies. For example, with ResNet-110, we achieve a 58.2% FLOPs reduction by removing 59.2% of the parameters, with only a small loss of 0.14% in top-1 accuracy on CIFAR-10. With Res-50, we achieve a 43.8% FLOPs reduction by removing 36.7% of the parameters, with only a loss of 1.17% in top-1 accuracy on ImageNet. The code is available at https://github.com/lmbxmu/HRank.
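The rank-based scoring can be sketched as follows. This is a minimal pure-Python illustration, not the HRank code: `matrix_rank` is a hypothetical helper that uses Gaussian elimination with partial pivoting and an assumed tolerance, and `average_rank` averages the rank of one filter's feature maps over a small batch.

```python
def matrix_rank(matrix, tol=1e-9):
    """Rank of a 2-D list via Gaussian elimination with partial pivoting."""
    rows = [[float(v) for v in row] for row in matrix]
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0
    rank = 0
    for col in range(n_cols):
        if rank == n_rows:
            break
        # Pick the remaining row with the largest entry in this column.
        pivot = max(range(rank, n_rows), key=lambda r: abs(rows[r][col]))
        if abs(rows[pivot][col]) <= tol:
            continue  # no usable pivot in this column
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(rank + 1, n_rows):
            factor = rows[r][col] / rows[rank][col]
            for c in range(col, n_cols):
                rows[r][c] -= factor * rows[rank][c]
        rank += 1
    return rank


def average_rank(feature_maps):
    """Mean rank of the feature maps one filter produces over a batch."""
    return sum(matrix_rank(fm) for fm in feature_maps) / len(feature_maps)


filter_a_maps = [[[1, 0], [0, 1]], [[1, 2], [3, 4]]]   # both full rank
filter_b_maps = [[[1, 2], [2, 4]], [[0, 0], [0, 0]]]   # rank 1 and rank 0
print(average_rank(filter_a_maps), average_rank(filter_b_maps))  # prints 2.0 0.5
```

Under the principle stated above, the filter with the lower average rank (`filter_b_maps` here) would be the pruning candidate.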
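The goal of filter pruning is to find unimportant filters to remove so that convolutional neural networks (CNNs) become efficient without sacrificing performance in the process. The challenge lies in finding information that helps determine how important or relevant each filter is with respect to the final output of the network. In this work, we share our observation that the batch normalization (BN) parameters of pre-trained CNNs can be used to estimate the feature distribution of activation outputs without processing any training data. Based on this observation, we propose a simple yet effective filter pruning method that evaluates the importance of each filter from the BN parameters of pre-trained CNNs. Experimental results on CIFAR-10 and ImageNet demonstrate that the proposed method achieves outstanding performance, with and without fine-tuning, in terms of the trade-off between the accuracy drop and the reduction in computational complexity and number of parameters of the pruned networks.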
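We propose an optimized U-Net architecture for the brain tumor segmentation task in the BraTS21 challenge. To find the optimal model architecture and learning schedule, we ran an extensive ablation study to test deep supervision loss, focal loss, decoder attention, drop block, and residual connections. Additionally, we searched for the optimal depth of the U-Net encoder, the number of convolutional channels, and the post-processing strategy. Our method won the validation phase and took third place in the test phase. We have open-sourced the code to reproduce our BraTS21 submission in the NVIDIA Deep Learning Examples GitHub repository.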
The deployment of deep convolutional neural networks (CNNs) in many real world applications is largely hindered by their high computational cost. In this paper, we propose a novel learning scheme for CNNs to simultaneously 1) reduce the model size; 2) decrease the run-time memory footprint; and 3) lower the number of computing operations, without compromising accuracy. This is achieved by enforcing channel-level sparsity in the network in a simple but effective way. Different from many existing approaches, the proposed method directly applies to modern CNN architectures, introduces minimum overhead to the training process, and requires no special software/hardware accelerators for the resulting models. We call our approach network slimming, which takes wide and large networks as input models, but during training insignificant channels are automatically identified and pruned afterwards, yielding thin and compact models with comparable accuracy. We empirically demonstrate the effectiveness of our approach with several state-of-the-art CNN models, including VGGNet, ResNet and DenseNet, on various image classification datasets. For VGGNet, a multi-pass version of network slimming gives a 20× reduction in model size and a 5× reduction in computing operations.
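Channel-level sparsity followed by pruning can be sketched as below. The abstract does not say where the sparsity is enforced, so this sketch assumes one scaling factor per channel (for example a batch-normalization scale), an L1 penalty on those factors during training, and a single global threshold afterwards. `l1_penalty`, `slimming_masks`, `strength`, and `prune_fraction` are hypothetical names.

```python
def l1_penalty(scales_per_layer, strength):
    """Sparsity term added to the training loss (assumed form)."""
    return strength * sum(abs(s) for layer in scales_per_layer for s in layer)


def slimming_masks(scales_per_layer, prune_fraction):
    """Keep/drop mask per channel using one global magnitude threshold."""
    magnitudes = sorted(abs(s) for layer in scales_per_layer for s in layer)
    n_prune = int(len(magnitudes) * prune_fraction)
    if n_prune == 0:
        return [[True] * len(layer) for layer in scales_per_layer]
    threshold = magnitudes[n_prune - 1]
    # Channels at or below the threshold are dropped; ties may drop extra.
    return [[abs(s) > threshold for s in layer] for layer in scales_per_layer]


channel_scales = [[0.9, 0.01, 0.5], [0.02, 0.7, 0.3]]
masks = slimming_masks(channel_scales, prune_fraction=0.5)
print(masks)  # prints [[True, False, True], [False, True, False]]
```

A global threshold lets the pruned width differ from layer to layer, which is one way the "thin and compact" architecture can emerge from training rather than being fixed in advance.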
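Clinical diagnosis of the pediatric musculoskeletal system relies on the analysis of medical imaging examinations. In the medical image processing pipeline, semantic segmentation using deep learning algorithms enables the automatic generation of patient-specific three-dimensional anatomical models, which are crucial for morphological evaluation. However, the scarcity of pediatric imaging resources may reduce the accuracy and generalization performance of individual deep segmentation models. In this study, we propose a novel multi-task, multi-domain learning framework in which a single segmentation network is optimized over multiple datasets arising from distinct parts of the anatomy. Unlike previous approaches, we simultaneously consider multiple intensity domains and segmentation tasks to overcome the inherent scarcity of pediatric data while leveraging shared features between imaging datasets. To further improve generalization, we employ a transfer learning scheme from natural image classification, along with a multi-scale contrastive regularization aimed at promoting domain-specific clusters in the shared representations, and multi-joint anatomical priors to enforce anatomically consistent predictions. We evaluate our contributions on bone segmentation using three scarce pediatric imaging datasets of the ankle, knee, and shoulder joints. Our results demonstrate that the proposed approach outperforms individual, transfer, and shared segmentation schemes in the Dice metric with statistically sufficient margins. The proposed model brings new perspectives toward the intelligent use of imaging resources and better management of pediatric musculoskeletal disorders.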
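Semantic image segmentation is an important prerequisite for context awareness and autonomous robotics in surgery. The state of the art has focused on conventional RGB video data acquired during minimally invasive surgery, but full-scene semantic segmentation based on spectral imaging data obtained during open surgery has received almost no attention to date. To address this gap in the literature, we investigate the following research questions based on hyperspectral imaging (HSI) data of pigs acquired in an open surgery setting: (1) What is an adequate representation of HSI data for neural-network-based fully automated organ segmentation, especially with respect to the spatial granularity of the data (pixels vs. superpixels vs. patches vs. full images)? (2) Is there a benefit to using HSI data compared to other modalities, namely RGB data and processed HSI data (e.g. tissue parameters such as oxygenation), when performing semantic organ segmentation? According to a comprehensive validation study based on 506 HSI images from 20 pigs, annotated with a total of 19 classes, deep-learning-based segmentation performance increases with the spatial context of the input data, consistently across modalities. Unprocessed HSI data offers an advantage over RGB data or processed data from the camera provider, with the advantage increasing as the size of the input to the neural network decreases. Maximum performance (HSI applied to whole images) yielded a mean Dice similarity coefficient (DSC) of 0.89 (standard deviation (SD) 0.04), which is in the range of the inter-rater variability (DSC of 0.89 (SD 0.07)). We conclude that HSI could become a powerful imaging modality for fully automatic surgical scene understanding, with many advantages over traditional imaging, including the ability to recover additional functional tissue information.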
State-of-the-art brain tumor segmentation is based on deep learning models applied to multi-modal MRIs. Currently, these models are trained on images after a preprocessing stage that involves registration, interpolation, brain extraction (BE, also known as skull-stripping) and manual correction by an expert. However, for clinical practice, this last step is tedious and time-consuming and, therefore, not always feasible, resulting in skull-stripping faults that can negatively impact the tumor segmentation quality. Still, the extent of this impact has never been measured for any of the many different BE methods available. In this work, we propose an automatic brain tumor segmentation pipeline and evaluate its performance with multiple BE methods. Our experiments show that the choice of a BE method can compromise up to 15.7% of the tumor segmentation performance. Moreover, we propose training and testing tumor segmentation models on non-skull-stripped images, effectively discarding the BE step from the pipeline. Our results show that this approach leads to a competitive performance at a fraction of the time. We conclude that, in contrast to the current paradigm, training tumor segmentation models on non-skull-stripped images can be the best option when high performance in clinical practice is desired.
Brain tumor imaging has been part of the clinical routine for many years to perform non-invasive detection and grading of tumors. Tumor segmentation is a crucial step for managing primary brain tumors because it allows a volumetric analysis to have a longitudinal follow-up of tumor growth or shrinkage to monitor disease progression and therapy response. In addition, it facilitates further quantitative analysis such as radiomics. Deep learning models, in particular CNNs, have been a methodology of choice in many applications of medical image analysis including brain tumor segmentation. In this study, we investigated the main design aspects of CNN models for the specific task of MRI-based brain tumor segmentation. Two commonly used CNN architectures (i.e. DeepMedic and U-Net) were used to evaluate the impact of the essential parameters such as learning rate, batch size, loss function, and optimizer. The performance of CNN models using different configurations was assessed with the BraTS 2018 dataset to determine the most performant model. Then, the generalization ability of the model was assessed using our in-house dataset. For all experiments, U-Net achieved a higher DSC compared to the DeepMedic. However, the difference was only statistically significant for whole tumor segmentation using FLAIR sequence data and tumor core segmentation using T1w sequence data. Adam and SGD both with the initial learning rate set to 0.001 provided the highest segmentation DSC when training the CNN model using U-Net and DeepMedic architectures, respectively. No significant difference was observed when using different normalization approaches. In terms of loss functions, a weighted combination of soft Dice and cross-entropy loss with the weighting term set to 0.5 resulted in an improved segmentation performance and training stability for both DeepMedic and U-Net models.
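The weighted loss mentioned at the end of the abstract can be sketched as follows for the binary case. The exact formulation used in the study is not given, so this assumes a convex combination `weight * soft_dice + (1 - weight) * cross_entropy` with `weight = 0.5`; the function names and the smoothing constant `eps` are illustrative choices.

```python
import math


def soft_dice_loss(probs, targets, eps=1e-7):
    """1 - soft Dice, computed on predicted probabilities (flat lists)."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    denominator = sum(probs) + sum(targets)
    return 1.0 - (2.0 * intersection + eps) / (denominator + eps)


def cross_entropy_loss(probs, targets, eps=1e-7):
    """Mean binary cross-entropy with probabilities clipped away from 0 and 1."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(probs)


def combined_loss(probs, targets, weight=0.5):
    """Assumed convex combination of the two losses."""
    dice_term = soft_dice_loss(probs, targets)
    ce_term = cross_entropy_loss(probs, targets)
    return weight * dice_term + (1.0 - weight) * ce_term


good = combined_loss([0.9, 0.1], [1, 0])
bad = combined_loss([0.1, 0.9], [1, 0])
```

The Dice term responds to region overlap, which matters when tumor voxels are rare, while the cross-entropy term gives a per-voxel gradient; combining them is one plausible reason for the improved training stability reported above.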