Background and Purpose: Colorectal cancer is a common fatal malignancy, the fourth most common cancer in men and the third most common cancer in women worldwide. Timely detection of the cancer in its early stages is essential for treating the disease. Currently, there is a lack of datasets for histopathological image segmentation of rectal cancer, which often hampers assessment accuracy when computer technology is used to aid diagnosis. Methods: This study provides a new publicly available Enteroscope Biopsy Histopathological Hematoxylin and Eosin Image Dataset for Image Segmentation Tasks (EBHI-Seg). To demonstrate the validity and extensiveness of EBHI-Seg, segmentation results on the dataset are evaluated using classical machine learning methods and deep learning methods. Results: The experiments show that deep learning methods achieve better image segmentation performance on EBHI-Seg. The maximum Dice score for the classical machine learning methods is 0.948, while the Dice score for the deep learning methods is 0.965. Conclusion: This publicly available dataset contains 5,170 images of six types of tumor differentiation stages and the corresponding ground truth images. The dataset can help researchers develop new segmentation algorithms for the medical diagnosis of colorectal cancer, which can be used in clinical settings to help doctors and patients.
Translated by Google Translate
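The Dice metric quoted in the abstract above compares a predicted segmentation mask with its ground truth as 2|A ∩ B| / (|A| + |B|). The following is a minimal sketch of that computation on binary masks stored as nested lists; the function name `dice_coefficient` and the convention for two empty masks are assumptions made for illustration, not code from EBHI-Seg.

```python
def dice_coefficient(pred, truth):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two binary masks of equal size."""
    # Flatten both 2-D masks into lists of booleans.
    pred_flat = [bool(p) for row in pred for p in row]
    truth_flat = [bool(t) for row in truth for t in row]
    if len(pred_flat) != len(truth_flat):
        raise ValueError("masks must contain the same number of pixels")
    # Pixels that are foreground in both masks.
    intersection = sum(p and t for p, t in zip(pred_flat, truth_flat))
    total = sum(pred_flat) + sum(truth_flat)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


if __name__ == "__main__":
    pred = [[1, 1, 0],
            [0, 1, 0]]
    truth = [[1, 0, 0],
             [0, 1, 1]]
    # Two overlapping pixels, three foreground pixels in each mask.
    print(dice_coefficient(pred, truth))
```

A score of 1.0 means the two masks coincide exactly, and 0.0 means they share no foreground pixels.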
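Nonnegative matrix factorization (NMF) has been widely used for dimensionality reduction in machine learning. However, traditional NMF does not handle outliers properly and is therefore sensitive to noise. To improve the robustness of NMF, this paper proposes an adaptive weighted NMF, which introduces weights to emphasize the different importance of each data point and thereby reduces the algorithm's sensitivity to noisy data. This differs substantially from existing robust NMF methods, which use slow-growing similarity measures. Specifically, two strategies are proposed to achieve this goal: a fuzzy weighting technique and an entropy weighting technique, both of which lead to iterative solutions with a simple form. Experimental results show that the new methods yield more robust feature representations than existing methods on several real datasets with noise.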
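The iteratively weighted shrinkage-thresholding algorithm (IWSTA), which treats attributes differently, has shown superiority over the classical unweighted iterative shrinkage-thresholding algorithm (ISTA) in solving linear inverse problems. This paper proposes a new entropy-regularized IWSTA (ERIWSTA) that adds an entropy regularizer to the cost function to measure the uncertainty of the weights and to stimulate attributes to participate in solving the problem. The weights are then solved with the Lagrange multiplier method to obtain a simple iterative update. The weights can be interpreted as the probability that an attribute contributes to the solution of the problem. Experimental results on CT image restoration show that the proposed method performs better than existing methods in terms of convergence speed and restoration accuracy.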
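For context, the classical unweighted ISTA that the abstract above uses as its baseline can be sketched as follows. This is not the proposed ERIWSTA: it is the standard ISTA iteration for min_x 0.5·||Ax − b||² + λ·||x||₁, written in plain Python. The function names `soft_threshold` and `ista`, the fixed iteration count, and the use of the squared Frobenius norm as a safe step-size bound are illustrative assumptions.

```python
def soft_threshold(value, threshold):
    """Shrink a scalar toward zero by `threshold` (proximal map of the L1 norm)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def ista(A, b, lam, n_iter=500):
    """Classical ISTA for min_x 0.5*||A x - b||^2 + lam*||x||_1."""
    m, n = len(A), len(A[0])
    # The squared Frobenius norm upper-bounds the largest eigenvalue of A^T A,
    # so 1/L is a valid (conservative) gradient step size.
    L = sum(a * a for row in A for a in row)
    x = [0.0] * n
    for _ in range(n_iter):
        # Residual r = A x - b.
        r = [sum(A[i][j] * x[j] for j in range(n)) - b[i] for i in range(m)]
        # Gradient of the smooth term: g = A^T r.
        g = [sum(A[i][j] * r[i] for i in range(m)) for j in range(n)]
        # Gradient step followed by element-wise shrinkage.
        x = [soft_threshold(x[j] - g[j] / L, lam / L) for j in range(n)]
    return x


if __name__ == "__main__":
    # With A = I the minimizer is the soft-thresholded b, i.e. close to [2.0, 0.0].
    A = [[1.0, 0.0], [0.0, 1.0]]
    b = [3.0, 0.5]
    print(ista(A, b, lam=1.0))
```

Every coordinate is shrunk by the same threshold λ/L here; a weighted variant such as IWSTA would instead apply a different threshold to each attribute.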
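Nonnegative matrix factorization (NMF) has been widely used to learn low-dimensional representations of data. However, NMF pays equal attention to all attributes of a data point, which inevitably leads to inaccurate representations. For example, in a human face dataset, if an image contains a hat on the head, the hat should be removed, or the importance of its corresponding attributes should be reduced during matrix factorization. This paper proposes a new type of NMF called entropy weighted NMF (EWNMF), which uses an optimizable weight for each attribute of each data point to emphasize its importance. This is achieved by adding an entropy regularizer to the cost function and then solving the problem with the Lagrange multiplier method. Experimental results on several datasets demonstrate the feasibility and effectiveness of the proposed method. We make our code available at https://github.com/poisson-em/entropy-weighted-nmf.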
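The combination of an entropy regularizer and a Lagrange multiplier described above typically leads to a closed-form, softmax-like weight update. As a generic sketch, not the paper's exact formulation: minimizing Σᵢ wᵢ·eᵢ + γ·Σᵢ wᵢ·ln wᵢ subject to Σᵢ wᵢ = 1 gives wᵢ ∝ exp(−eᵢ/γ). The function name `entropy_weights`, the symbol γ, and the choice of which weights share a sum-to-one constraint are assumptions made for illustration.

```python
import math


def entropy_weights(errors, gamma):
    """Weights minimizing sum(w*e) + gamma*sum(w*ln w) subject to sum(w) = 1.

    Setting the Lagrangian's derivative to zero gives w_i proportional to
    exp(-e_i / gamma): entries with larger errors receive smaller weights.
    """
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    # Subtracting the smallest error leaves the normalized weights unchanged
    # and keeps the largest exponent at exp(0) = 1.
    shift = min(errors)
    unnormalized = [math.exp(-(e - shift) / gamma) for e in errors]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


if __name__ == "__main__":
    # Hypothetical per-attribute reconstruction errors for one data point.
    errors = [0.1, 0.1, 4.0]
    print(entropy_weights(errors, gamma=1.0))
```

A small γ concentrates the weight on the best-fitting attributes, while a large γ pushes the weights toward uniform, which corresponds to the unweighted case.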
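Recognizing less salient features is the key to model compression. However, it has not been investigated in the revolutionary attention mechanisms. In this work, we propose a novel normalization-based attention module (NAM), which suppresses less salient weights. It applies a weight sparsity penalty to the attention modules, making them more computationally efficient while retaining similar performance. A comparison with three other attention mechanisms on both ResNet and MobileNet indicates that our method achieves higher accuracy. The code for this paper is publicly available at https://github.com/christian-lyc/nam.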
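Existing deep learning methods for the diagnosis of gastric cancer commonly use convolutional neural networks. Recently, the Visual Transformer has attracted great attention because of its performance and efficiency, but its applications are mostly in the field of computer vision. This paper proposes a multi-scale visual transformer model, referred to as GasHis-Transformer, for Gastric Histopathological Image Classification (GHIC), which automatically classifies microscopic gastric images into abnormal and normal cases. The GasHis-Transformer model consists of two key modules, a global information module and a local information module, which extract histopathological features effectively. In our experiments, a public hematoxylin and eosin (H&E) stained gastric histopathological dataset with 280 abnormal and normal images is divided into training, validation, and test sets at a ratio of 1:1:2. On the test set of this dataset, the precision, recall, F1-score, and accuracy are 98.0%, 100.0%, 96.0%, and 98.0%, respectively. Furthermore, a critical study is conducted to evaluate the robustness of GasHis-Transformer, in which ten different types of noise are added, including four adversarial attacks and six conventional image noises. In addition, a clinically meaningful study is carried out to test the gastrointestinal cancer identification performance of GasHis-Transformer on 620 abnormal images, achieving 96.8% accuracy. Finally, a comparative study is performed to test generalizability with both H&E and immunohistochemically stained images on a lymphoma image dataset and a breast cancer dataset, producing comparable F1-scores (85.6% and 82.8%) and accuracies (83.9% and 89.4%), respectively. In conclusion, GasHis-Transformer demonstrates high classification performance and shows significant potential in the GHIC task.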
We propose Hierarchical ProtoPNet: an interpretable network that explains its reasoning process by considering the hierarchical relationship between classes. Unlike previous methods, which explain their reasoning process by dissecting the input image and finding the prototypical parts responsible for the classification, we propose to explain the reasoning process for video action classification by dissecting the input video frames on multiple levels of the class hierarchy. The explanations leverage the hierarchy to deal with uncertainty, akin to human reasoning: when we observe water and human activity but no definitive action, the video can be recognized as belonging to the water sports parent class. Only after observing a person swimming can we definitively refine it to the swimming action. Experiments on ActivityNet and UCF-101 show performance improvements while providing multi-level explanations.
Sparse principal component analysis (SPCA) has been widely used for dimensionality reduction and feature extraction in high-dimensional data analysis. Although there have been many methodological and theoretical developments in the past two decades, the theoretical guarantees of the popular elastic-net-based SPCA algorithm proposed by Zou, Hastie & Tibshirani (2006) are still unknown. We aim to close this important theoretical gap in this paper. We first revisit the SPCA algorithm of Zou et al. (2006) and present our implementation. We also study a computationally more efficient variant of the SPCA algorithm of Zou et al. (2006) that can be considered the limiting case of SPCA. We provide guarantees of convergence to a stationary point for both algorithms. We prove that, under a sparse spiked covariance model, both algorithms can recover the principal subspace consistently under mild regularity conditions. We show that their estimation error bounds match the best available bounds of existing works or the minimax rates up to some logarithmic factors. Moreover, we demonstrate the numerical performance of both algorithms in simulation studies.
Long-term non-prehensile planar manipulation is a challenging task for robot planning and feedback control. It is characterized by underactuation, hybrid control, and contact uncertainty. One main difficulty is determining contact points and directions, which involves joint logical and geometrical reasoning over the modes of the dynamics model. To tackle this issue, we propose a demonstration-guided hierarchical optimization framework to achieve offline task and motion planning (TAMP). Our work extends the formulation of the dynamics model of the pusher-slider system to include a separation mode with face-switching cases, and solves a warm-started TAMP problem by exploiting human demonstrations. We show that our approach can cope well with the local minima problems currently present in state-of-the-art solvers and determine a valid solution to the task. We validate our results in simulation and demonstrate the applicability of our approach on a pusher-slider system with a real Franka Emika robot in the presence of external disturbances.
Human modeling and relighting are two fundamental problems in computer vision and graphics, where high-quality datasets can largely facilitate related research. However, most existing human datasets only provide multi-view human images captured under the same illumination. Although valuable for modeling tasks, they are not readily usable for relighting problems. To promote research in both fields, we present UltraStage, a new 3D human dataset that contains more than 2K high-quality human assets captured under both multi-view and multi-illumination settings. Specifically, for each example, we provide 32 surrounding views illuminated with one white light and two gradient illuminations. In addition to regular multi-view images, gradient illuminations help recover detailed surface normal and spatially-varying material maps, enabling various relighting applications. Inspired by recent advances in neural representation, we further convert each example into a neural human asset, which allows novel view synthesis under arbitrary lighting conditions. We show that our neural human assets can achieve extremely high capture performance and are capable of representing fine details such as facial wrinkles and cloth folds. We also validate UltraStage in single-image relighting tasks, training neural networks with virtually relit data from the neural assets and demonstrating realistic rendering improvements over prior art. UltraStage will be made publicly available to the community to stimulate significant future developments in various human modeling and rendering tasks.