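Deep convolutional neural networks (CNNs) for image classification alternate convolutions with downsampling operations, such as pooling layers or strided convolutions, so feature resolution drops as the network gets deeper. These downsampling operations save computational resources and give the next layers some translation invariance as well as a larger receptive field. An inherent side effect, however, is that the high-level features produced at the deep end of the network are always captured in low-resolution feature maps. The inverse also holds: shallow layers always contain small-scale features. In biomedical image analysis, engineers are often tasked with classifying very small image patches that carry only a limited amount of information. By their nature, these patches may not even contain objects, and the classification instead depends on detecting subtle underlying patterns of unknown scale in the image texture. In these cases every bit of information is valuable, so it is important to extract as many informative features as possible. Driven by these considerations, we introduce a new CNN architecture that preserves multi-scale features from deep, intermediate, and shallow layers by using skip connections together with consecutive contractions and expansions of the feature maps. Using a dataset of very low-resolution patches from pancreatic ductal adenocarcinoma (PDAC) CT scans, we demonstrate that our network can outperform current state-of-the-art models.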
translated by 谷歌翻译
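Semantic segmentation is the problem of assigning a class label to every pixel in an image, and it is an important component of an autonomous vehicle's vision stack, facilitating scene understanding and object detection. However, many of the top-performing semantic segmentation models are extremely complex and cumbersome, and are therefore unsuited to deployment on board autonomous vehicle platforms, where computational resources are limited and low-latency operation is required. In this survey, we take a thorough look at works that aim to address this misalignment with more compact and efficient models, capable of deployment on low-memory embedded systems while meeting the constraint of real-time inference. We discuss the most prominent works in the field, place them in a taxonomy based on their major contributions, and finally evaluate the inference speed of the discussed models under consistent hardware and software setups that represent a typical research environment with a high-end GPU and a realistic deployment scenario using low-memory embedded GPU hardware. Our experimental results demonstrate that many works are capable of real-time performance on resource-constrained hardware, while illustrating a consistent trade-off between latency and accuracy.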
Semantic image segmentation is a basic street scene understanding task in autonomous driving, where each pixel in a high-resolution image is categorized into a set of semantic labels. Unlike other scenarios, objects in autonomous driving scenes exhibit very large scale changes, which poses great challenges for high-level feature representation in the sense that multi-scale information must be correctly encoded. To remedy this problem, atrous convolution [14] was introduced to generate features with larger receptive fields without sacrificing spatial resolution. Built upon atrous convolution, Atrous Spatial Pyramid Pooling (ASPP) [2] was proposed to concatenate multiple atrous-convolved features using different dilation rates into a final feature representation. Although ASPP is able to generate multi-scale features, we argue that the feature resolution along the scale axis is not dense enough for the autonomous driving scenario. To this end, we propose Densely connected Atrous Spatial Pyramid Pooling (DenseASPP), which connects a set of atrous convolutional layers in a dense way, such that it generates multi-scale features that not only cover a larger scale range, but also cover that scale range densely, without significantly increasing the model size. We evaluate DenseASPP on the street scene benchmark Cityscapes [4] and achieve state-of-the-art performance.
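The novel SARS-CoV-2 pandemic, also known as COVID-19, has been spreading worldwide, causing rampant loss of life. Medical imaging such as CT and X-ray plays a significant role in diagnosing patients by presenting a visual representation of how the organs are functioning. However, analyzing such scans is a tedious and time-consuming task for any radiologist. Emerging deep learning technologies have shown their strength in analyzing such scans, helping to speed up the diagnosis of diseases and viruses such as COVID-19. In this paper, an automated deep learning based model, the COVID-19 hierarchical segmentation network (CHS-Net), is proposed. It functions as a semantic hierarchical segmenter that identifies COVID-19 infected regions from lung contours in CT medical imaging using two cascaded residual attention inception U-Net (RAIU-Net) models. RAIU-Net comprises a residual inception U-Net model with a spectral spatial and depth attention network (SSD), built from contraction and expansion phases of depthwise separable convolutions and hybrid pooling (max and spectral pooling) to efficiently encode and decode semantic and varying-resolution information. CHS-Net is trained with a segmentation loss function that is the mean of the binary cross-entropy loss and the Dice loss, to penalize false negative and false positive predictions. The approach is compared with recently proposed methods and evaluated using standard metrics such as accuracy, precision, specificity, recall, Dice coefficient, and Jaccard similarity, along with visual interpretation of the model predictions using GradCAM++ and uncertainty maps. Across extensive trials, the proposed approach is observed to outperform recently proposed approaches and to segment the COVID-19 infected regions in the lungs effectively.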
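The combined loss described above (the mean of binary cross-entropy and Dice loss) can be sketched in a few lines. This is a minimal illustration on flattened toy masks, not the authors' implementation: the function names, the clamping constant `eps`, and the Dice smoothing term `smooth` are assumptions introduced here.

```python
import math

def bce_loss(probs, targets, eps=1e-7):
    """Mean binary cross-entropy over flattened pixels."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0); eps is an assumption
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(probs)

def dice_loss(probs, targets, smooth=1.0):
    """Soft Dice loss, 1 - Dice coefficient; `smooth` is an assumed stabiliser."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    denom = sum(probs) + sum(targets)
    return 1.0 - (2.0 * intersection + smooth) / (denom + smooth)

def segmentation_loss(probs, targets):
    """Mean of the BCE and Dice losses, as the abstract describes."""
    return 0.5 * (bce_loss(probs, targets) + dice_loss(probs, targets))

# Toy flattened mask: 1 = infected pixel, 0 = background (hypothetical data).
targets = [1, 1, 0, 0]
good = [0.9, 0.8, 0.1, 0.2]   # predictions close to the mask
bad = [0.2, 0.1, 0.9, 0.8]    # predictions that invert the mask

print(segmentation_loss(good, targets) < segmentation_loss(bad, targets))  # True
```

Cross-entropy scores each pixel independently, while the Dice term scores overlap over the whole mask, which is one common reason for averaging the two when the foreground is small.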
Australian Centre for Robotic Vision {guosheng.lin;anton.milan;chunhua.shen;
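Advances in deep learning have contributed enormously to biomedical image analysis applications. With breast cancer being the deadliest disease among women, early detection is the key means of improving survivability. Medical imaging such as ultrasound gives an excellent visual representation of how the organs are functioning; however, analyzing such scans is challenging and time-consuming for any radiologist, which delays the diagnosis process. Although various deep learning based approaches have been proposed, the present article introduces an efficient residual cross-spatial attention guided inception U-Net (RCA-IUnet) model with minimal training parameters for tumor segmentation in breast ultrasound imaging, to further improve segmentation performance across varying tumor sizes. The RCA-IUnet model follows the U-Net topology with residual inception depthwise separable convolution and hybrid pooling (max pooling and spectral pooling) layers. In addition, cross-spatial attention filters are added to suppress irrelevant features and focus on the target structure. The segmentation performance of the proposed model is validated on two publicly available datasets using standard segmentation evaluation metrics, where it outperforms other state-of-the-art segmentation models.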
With recent advances in CNNs, exceptional improvements have been made in semantic segmentation of high resolution images in terms of accuracy and latency. However, challenges still remain in detecting objects in crowded scenes, large scale variations, partial occlusion, and distortions, while still maintaining mobility and latency. We introduce a fast and efficient convolutional neural network, ASBU-Net, for semantic segmentation of high resolution images that addresses these problems and uses no novelty layers for ease of quantization and embedded hardware support. ASBU-Net is based on a new feature extraction module, atrous space bender layer (ASBL), which is efficient in terms of computation and memory. The ASB layers form a building block that is used to make ASBNet. Since this network does not use any special layers it can be easily implemented, quantized and deployed on FPGAs and other hardware with limited memory. We present experiments on resource and accuracy trade-offs and show strong performance compared to other popular models.
Semantic segmentation is a challenging task that addresses most of the perception needs of Intelligent Vehicles (IV) in a unified way. Deep Neural Networks excel at this task, as they can be trained end-to-end to accurately classify multiple object categories in an image at pixel level. However, a good trade-off between high quality and computational resources is not yet present in state-of-the-art semantic segmentation approaches, limiting their application in real vehicles. In this paper, we propose a deep architecture that is able to run in real-time while providing accurate semantic segmentation. The core of our architecture is a novel layer that uses residual connections and factorized convolutions in order to remain efficient while retaining remarkable accuracy. Our approach is able to run at over 83 FPS on a single Titan X, and 7 FPS on a Jetson TX1 (embedded GPU). A comprehensive set of experiments on the publicly available Cityscapes dataset demonstrates that our system achieves an accuracy that is similar to the state of the art, while being orders of magnitude faster to compute than other architectures that achieve top precision. The resulting trade-off makes our model an ideal approach for scene understanding in IV applications. The code is publicly available at: https://github.com/Eromera/erfnet
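To make the idea of a factorized convolution concrete, the sketch below assumes a 3x3 filter split into a 3x1 filter followed by a 1x3 filter; the abstract does not spell out the exact factorization, so this choice, the helper names, and the toy image are assumptions. It checks that, in the purely linear case with a rank-1 kernel, the two 1D passes reproduce the 2D result, and it compares weight counts for a hypothetical 64-channel layer. A real layer would typically place nonlinearities between the two passes, in which case the equality no longer holds and only the saving in weights carries over.

```python
def conv2d_valid(img, kernel):
    """Plain 'valid' 2D cross-correlation on nested lists."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(img) - kh + 1, len(img[0]) - kw + 1
    return [[sum(img[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]

col = [[1], [2], [1]]   # 3x1 vertical filter
row = [[1, 0, -1]]      # 1x3 horizontal filter
# The outer product of the two 1D filters is the equivalent (rank-1) 3x3 kernel.
full = [[c[0] * r for r in row[0]] for c in col]

# Small deterministic integer test image (hypothetical data).
img = [[(3 * i + 5 * j) % 7 for j in range(6)] for i in range(6)]

direct = conv2d_valid(img, full)
factorized = conv2d_valid(conv2d_valid(img, col), row)
print(direct == factorized)  # True

def conv_weights(kh, kw, channels):
    """Weights of a conv layer with `channels` inputs and outputs, bias ignored."""
    return kh * kw * channels * channels

c = 64  # assumed channel count, for illustration only
print(conv_weights(3, 3, c), conv_weights(3, 1, c) + conv_weights(1, 3, c))  # 36864 24576
```

With these assumptions the factorized pair uses two thirds of the weights of the full 3x3 layer.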
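Accurate segmentation of organs-at-risk (OARs) is a precursor to optimizing radiation therapy planning. Existing deep learning based multi-scale fusion architectures have demonstrated a tremendous capacity for 2D medical image segmentation. The key to their success is aggregating global context while maintaining high-resolution representations. However, when carried over to 3D segmentation problems, existing multi-scale fusion architectures may underperform because of their heavy computation overhead and substantial data requirements. To address this issue, we propose a new OAR segmentation framework, called OARFocalFuseNet, which fuses multi-scale features and employs focal modulation to capture global-local context across multiple scales. Each resolution stream is enriched with features from different resolution scales, and multi-scale information is aggregated to model diverse contextual ranges. As a result, feature representations are further boosted. Comprehensive comparisons in our experimental setup, covering both OAR segmentation and multi-organ segmentation, show that the proposed OARFocalFuseNet outperforms recent state-of-the-art methods on the publicly available OpenKBP dataset and the Synapse multi-organ segmentation dataset. Both proposed methods (3D-MSF and OARFocalFuseNet) perform well on standard evaluation metrics. Our best-performing method (OARFocalFuseNet) obtains a Dice coefficient of 0.7995 and a Hausdorff distance of 5.1435 on the OpenKBP dataset, and a Dice coefficient of 0.8137 on the Synapse multi-organ segmentation dataset.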
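For readers unfamiliar with the Hausdorff distance reported above, the following is a minimal sketch of the plain symmetric Hausdorff distance between two point sets, for example the surface voxels of a predicted and a reference organ mask. The abstract does not say which variant (maximum or a percentile such as HD95) or which units were used, so the plain maximum form, the function names, and the toy coordinates are assumptions.

```python
import math

def directed_hausdorff(a, b):
    """Largest distance from any point in `a` to its nearest point in `b`."""
    return max(min(math.dist(p, q) for q in b) for p in a)

def hausdorff(a, b):
    """Symmetric Hausdorff distance between two point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))

# Hypothetical 3D surface points of a predicted and a reference segmentation.
pred = [(0, 0, 0), (1, 0, 0), (1, 1, 0)]
truth = [(0, 0, 0), (1, 0, 0), (4, 1, 0)]

print(hausdorff(pred, truth))  # 3.0
```

Unlike the Dice coefficient, which measures volume overlap, this metric is driven by the single worst boundary error, so the two are usually reported together.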
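We propose layer saturation, a simple, online-computable method for analyzing information processing in neural networks. First, we show that a layer's output can be restricted to the eigenspace of its variance matrix without loss of performance. We propose a computationally lightweight method for approximating the variance matrix during training. From the dimension of its lossless eigenspace we derive layer saturation: the ratio between the eigenspace dimension and the layer width. We show that saturation appears to indicate which layers contribute to network performance. We demonstrate how to alter layer saturation in a neural network by varying network depth, filter sizes, and input resolution. Furthermore, we show that a well-chosen input resolution improves network performance by distributing the inference process more evenly across the network.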
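Written as a formula, the definition above reads as follows. The symbols are notation introduced here for illustration, not taken from the abstract: $E_\ell$ denotes the lossless eigenspace of the variance matrix of layer $\ell$'s output, and $w_\ell$ denotes the width of layer $\ell$.

```latex
s_\ell \;=\; \frac{\dim E_\ell}{w_\ell}, \qquad 0 \le s_\ell \le 1 .
```

The bound follows because $E_\ell$ is a subspace of the layer's $w_\ell$-dimensional output space. A value near 1 means the layer uses almost all of its available dimensions, while a value near 0 means its output occupies only a small subspace.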
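In recent years, increasingly complex approaches based on sophisticated convolutional neural network architectures have been slowly pushing up performance on well-established benchmark datasets. In this paper, we take a step back to examine whether such complexity is really needed. We present RC-Net, a fully convolutional network in which the number of filters per layer is optimized to reduce feature overlap and complexity. We also use skip connections to keep the loss of spatial information to a minimum, by keeping the number of pooling operations in the network to a minimum. Two publicly available retinal vessel segmentation datasets were used in our experiments. In these experiments, RC-Net is highly competitive, outperforming alternative vessel segmentation methods while using two or even three orders of magnitude fewer trainable parameters.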
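Deep learning architectures built on convolutional neural networks (CNNs) have achieved outstanding success in computer vision. U-Net, an encoder-decoder architecture built on CNNs, made a major breakthrough in biomedical image segmentation and has been applied in a wide range of practical scenarios. However, the identical design of every downsampling layer in the encoder and the simply stacked convolutions do not allow U-Net to extract sufficient feature information from different depths. The increasing complexity of medical images brings new challenges to existing methods. In this paper, we propose a deeper and more compact split-attention U-shaped network (DCSAU-Net), which efficiently uses low-level and high-level semantic information based on two novel frameworks: primary feature conservation and the compact split-attention block. We evaluate the proposed model on the CVC-ClinicDB, 2018 Data Science Bowl, ISIC-2018, and SegPC-2021 datasets. DCSAU-Net shows better performance than other state-of-the-art (SOTA) methods in terms of mean Intersection over Union (mIoU) and F1-score. More importantly, the proposed model demonstrates excellent segmentation performance on challenging images. The code for our work and more technical details are available at https://github.com/xq141839/dcsau-net.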
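The two metrics named above can be computed from per-class true positives, false positives, and false negatives. The sketch below is a minimal illustration on flattened toy label maps. Averaging IoU over classes is one common convention and is an assumption here, as are the function names; the abstract does not state how its mIoU is averaged.

```python
def confusion(pred, truth, cls):
    """True positives, false positives, false negatives for one class."""
    tp = sum(p == cls and t == cls for p, t in zip(pred, truth))
    fp = sum(p == cls and t != cls for p, t in zip(pred, truth))
    fn = sum(p != cls and t == cls for p, t in zip(pred, truth))
    return tp, fp, fn

def iou(pred, truth, cls):
    """Intersection over Union; assumes the class occurs in pred or truth."""
    tp, fp, fn = confusion(pred, truth, cls)
    return tp / (tp + fp + fn)

def f1(pred, truth, cls):
    """F1-score (equal to the Dice coefficient for binary masks)."""
    tp, fp, fn = confusion(pred, truth, cls)
    return 2 * tp / (2 * tp + fp + fn)

def mean_iou(pred, truth, classes):
    """IoU averaged over the listed classes."""
    return sum(iou(pred, truth, c) for c in classes) / len(classes)

# Hypothetical flattened label maps: 1 = foreground, 0 = background.
truth = [0, 0, 0, 1, 1, 1, 1, 0]
pred = [0, 0, 1, 1, 1, 1, 0, 0]

print(mean_iou(pred, truth, classes=[0, 1]))  # 0.6
print(f1(pred, truth, cls=1))                 # 0.75
```

For a single class the two are tied by F1 = 2·IoU / (1 + IoU), so F1 is never lower than IoU on the same prediction.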
Visual recognition requires rich representations that span levels from low to high, scales from small to large, and resolutions from fine to coarse. Even with the depth of features in a convolutional network, a layer in isolation is not enough: compounding and aggregating these representations improves inference of what and where. Architectural efforts are exploring many dimensions for network backbones, designing deeper or wider architectures, but how to best aggregate layers and blocks across a network deserves further attention. Although skip connections have been incorporated to combine layers, these connections have been "shallow" themselves, and only fuse by simple, one-step operations. We augment standard architectures with deeper aggregation to better fuse information across layers. Our deep layer aggregation structures iteratively and hierarchically merge the feature hierarchy to make networks with better accuracy and fewer parameters. Experiments across architectures and tasks show that deep layer aggregation improves recognition and resolution compared to existing branching and merging schemes.
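Semantic segmentation in cataract surgery has a wide range of applications that contribute to better surgical outcomes and reduced clinical risk. However, the differing difficulties involved in segmenting the various relevant structures in these surgeries make it very challenging to design a single network for the task. This paper proposes a semantic segmentation network, termed DeepPyramid, that addresses these challenges with three novelties: (1) a Pyramid View Fusion module, which provides a varying-angle global view of the region surrounding each pixel position in the input convolutional feature map; (2) a Deformable Pyramid Reception module, which enables a wide deformable receptive field that can adapt to geometric transformations of the object of interest; and (3) a dedicated Pyramid Loss that adaptively supervises multi-scale semantic feature maps. We show that, combined, these modules effectively boost semantic segmentation performance, especially for objects exhibiting transparency, deformability, scalability, and blunt edges. We demonstrate that our approach performs at a state-of-the-art level and outperforms a number of existing methods by a large margin (a 3.66% overall improvement in intersection over union compared to the best rival approach).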
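Downsampling is widely adopted to achieve a good trade-off between accuracy and latency in visual recognition. Unfortunately, the commonly used pooling layers are not learned and therefore cannot preserve important information. As another dimension reduction method, adaptive sampling weights and processes the regions that are relevant to the task, and is thus better able to preserve useful information. However, the use of adaptive sampling has so far been limited to certain layers. In this paper, we show that using adaptive sampling in the building blocks of a deep neural network can improve its efficiency. In particular, we propose SSBNet, which is built by repeatedly inserting sampling layers into existing networks such as ResNet. Experimental results show that the proposed SSBNet achieves competitive image classification and object detection performance on the ImageNet and COCO datasets. For example, SSB-ResNet-RS-200 achieves 82.6% accuracy on the ImageNet dataset, which is 0.6% higher than the baseline ResNet-RS-152 at similar complexity. Visualization shows the advantage of SSBNet in allowing different layers to focus on different positions, and ablation studies further validate the advantage of adaptive sampling over uniform methods.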
In this work we address the task of semantic image segmentation with Deep Learning and make three main contributions that are experimentally shown to have substantial practical merit. First, we highlight convolution with upsampled filters, or 'atrous convolution', as a powerful tool in dense prediction tasks. Atrous convolution allows us to explicitly control the resolution at which feature responses are computed within Deep Convolutional Neural Networks. It also allows us to effectively enlarge the field of view of filters to incorporate larger context without increasing the number of parameters or the amount of computation. Second, we propose atrous spatial pyramid pooling (ASPP) to robustly segment objects at multiple scales. ASPP probes an incoming convolutional feature layer with filters at multiple sampling rates and effective fields-of-views, thus capturing objects as well as image context at multiple scales. Third, we improve the localization of object boundaries by combining methods from DCNNs and probabilistic graphical models. The commonly deployed combination of max-pooling and downsampling in DCNNs achieves invariance but has a toll on localization accuracy. We overcome this by combining the responses at the final DCNN layer with a fully connected Conditional Random Field (CRF), which is shown both qualitatively and quantitatively to improve localization performance. Our proposed "DeepLab" system sets the new state-of-art at the PASCAL VOC-2012 semantic image segmentation task, reaching 79.7% mIOU in the test set, and advances the results on three other datasets: PASCAL-Context, PASCAL-Person-Part, and Cityscapes. All of our code is made publicly available online.
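The claim that atrous convolution enlarges the field of view without adding parameters can be illustrated in one dimension. The sketch below is a minimal, assumption-laden example: the function name, the three-tap kernel, and the rates are chosen here for illustration and do not come from the DeepLab code. A kernel with k taps and dilation rate r spans (k - 1)·r + 1 input samples while still using only k weights.

```python
def atrous_conv1d(signal, kernel, rate):
    """'Valid' 1D atrous (dilated) cross-correlation with dilation `rate`."""
    span = (len(kernel) - 1) * rate + 1  # effective field of view in samples
    return [sum(kernel[k] * signal[i + k * rate] for k in range(len(kernel)))
            for i in range(len(signal) - span + 1)]

signal = list(range(10))   # hypothetical input ramp 0..9
kernel = [1, 0, -1]        # always three weights, whatever the rate

for rate in (1, 2, 4):
    span = (len(kernel) - 1) * rate + 1
    out = atrous_conv1d(signal, kernel, rate)
    print(f"rate={rate} field_of_view={span} weights={len(kernel)} outputs={len(out)}")
```

Applying the same input to several rates in parallel and concatenating the results is, in spirit, what the ASPP module described above does on 2D feature maps.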
The lack of efficient segmentation methods and fully-labeled datasets limits the comprehensive assessment of optical coherence tomography angiography (OCTA) microstructures like retinal vessel network (RVN) and foveal avascular zone (FAZ), which are of great value in ophthalmic and systematic diseases evaluation. Here, we introduce an innovative OCTA microstructure segmentation network (OMSN) by combining an encoder-decoder-based architecture with multi-scale skip connections and the split-attention-based residual network ResNeSt, paying specific attention to OCTA microstructural features while facilitating better model convergence and feature representations. The proposed OMSN achieves excellent single/multi-task performances for RVN or/and FAZ segmentation. Especially, the evaluation metrics on multi-task models outperform single-task models on the same dataset. On this basis, a fully annotated retinal OCTA segmentation (FAROS) dataset is constructed semi-automatically, filling the vacancy of a pixel-level fully-labeled OCTA dataset. OMSN multi-task segmentation model retrained with FAROS further certifies its outstanding accuracy for simultaneous RVN and FAZ segmentation.
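We share our recent findings from an attempt to train a universal segmentation network for various cell types and imaging modalities. Our method is built on a generalized U-Net architecture, which allows each component to be evaluated individually. We modified the traditional binary training targets to include three classes for direct instance segmentation. Detailed experiments were performed on training schemes, training settings, network backbones, and individual modules. Our proposed training scheme draws minibatches in turn from each dataset, and the gradients are accumulated before an optimization step. We found that the key to training a universal network is all-time supervision on all datasets, and that it is necessary to sample each dataset in an unbiased way. Our experiments also suggest that there may be common features that define cell boundaries across cell types and imaging modalities, which could allow trained models to be applied to completely unseen datasets. A few training tricks can further boost segmentation performance, including class weights in the cross-entropy loss function, a well-designed learning rate scheduler, larger image crops for contextual information, and additional loss terms for unbalanced classes. We also found that segmentation performance can benefit from group normalization layers and the Atrous Spatial Pyramid Pooling module, thanks to their more reliable statistics estimation and improved semantic understanding, respectively. We participated in the 6th Cell Tracking Challenge (CTC) held at the IEEE International Symposium on Biomedical Imaging (ISBI) 2021. Our method was evaluated as the best runner-up during the initial submission for the primary track, and also secured third place in an additional round of competition in preparation for the summary publication.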
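The training scheme described above (one minibatch drawn in turn from each dataset, gradients accumulated, then a single optimizer step) can be sketched with a toy one-parameter model. Everything concrete here is an assumption for illustration: the model y = w·x, the two tiny datasets, the learning rate, and averaging the accumulated gradient over datasets before the update.

```python
from itertools import cycle

def grad_mse(w, batch):
    """Gradient of the mean squared error for the 1-parameter model y = w * x."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)

def train(datasets, steps=200, lr=0.05):
    """Round-robin over datasets, accumulate gradients, then take one step."""
    iters = [cycle(ds) for ds in datasets]  # endless minibatch stream per dataset
    w = 0.0
    for _ in range(steps):
        accumulated = 0.0
        for it in iters:                     # every dataset contributes each step
            accumulated += grad_mse(w, next(it))
        w -= lr * accumulated / len(iters)   # single optimization step
    return w

# Two hypothetical datasets (lists of minibatches) sharing the relation y = 3x.
dataset_a = [[(1.0, 3.0), (2.0, 6.0)], [(0.5, 1.5), (1.5, 4.5)]]
dataset_b = [[(-1.0, -3.0), (0.2, 0.6)], [(1.0, 3.0), (-0.5, -1.5)]]

w = train([dataset_a, dataset_b])
print(round(w, 6))  # 3.0
```

Because each update sees exactly one minibatch from every dataset, no dataset dominates a step regardless of its size, which matches the abstract's point about supervising on all datasets and sampling them without bias.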
Mitosis nuclei count is one of the important indicators for the pathological diagnosis of breast cancer. The manual annotation needs experienced pathologists, which is very time-consuming and inefficient. With the development of deep learning methods, some models with good performance have emerged, but the generalization ability should be further strengthened. In this paper, we propose a two-stage mitosis segmentation and classification method, named SCMitosis. Firstly, the segmentation performance with a high recall rate is achieved by the proposed depthwise separable convolution residual block and channel-spatial attention gate. Then, a classification network is cascaded to further improve the detection performance of mitosis nuclei. The proposed model is verified on the ICPR 2012 dataset, and the highest F-score value of 0.8687 is obtained compared with the current state-of-the-art algorithms. In addition, the model also achieves good performance on GZMH dataset, which is prepared by our group and will be firstly released with the publication of this paper. The code will be available at: https://github.com/antifen/mitosis-nuclei-segmentation.
Image segmentation is a key topic in image processing and computer vision with applications such as scene understanding, medical image analysis, robotic perception, video surveillance, augmented reality, and image compression, among many others. Various algorithms for image segmentation have been developed in the literature. Recently, due to the success of deep learning models in a wide range of vision applications, there has been a substantial amount of works aimed at developing image segmentation approaches using deep learning models. In this survey, we provide a comprehensive review of the literature at the time of this writing, covering a broad spectrum of pioneering works for semantic and instance-level segmentation, including fully convolutional pixel-labeling networks, encoder-decoder architectures, multi-scale and pyramid based approaches, recurrent networks, visual attention models, and generative models in adversarial settings. We investigate the similarity, strengths and challenges of these deep learning models, examine the most widely used datasets, report performances, and discuss promising future research directions in this area.