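Artificial neural network-based (ANN-based) lossy compressors have recently obtained striking results on several sources. Their success may be ascribed to an ability to identify the structure of low-dimensional manifolds in high-dimensional ambient spaces. Indeed, prior work has shown that ANN-based compressors can achieve the optimal entropy-distortion curve for some such sources. In contrast, we determine the optimal entropy-distortion tradeoffs for two low-dimensional manifolds with circular structure and show that state-of-the-art ANN-based compressors fail to compress them optimally.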
translated by 谷歌翻译
The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants and only 50% of the participants performed ensembling based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
Segmentation of regions of interest (ROIs) for identifying abnormalities is a leading problem in medical imaging. Using Machine Learning (ML) for this problem generally requires manually annotated ground-truth segmentations, demanding extensive time and resources from radiologists. This work presents a novel weakly supervised approach that utilizes binary image-level labels, which are much simpler to acquire, to effectively segment anomalies in medical Magnetic Resonance (MR) images without ground truth annotations. We train a binary classifier using these labels and use it to derive seeds indicating regions likely and unlikely to contain tumors. These seeds are used to train a generative adversarial network (GAN) that converts cancerous images to healthy variants, which are then used in conjunction with the seeds to train a ML model that generates effective segmentations. This method produces segmentations that achieve Dice coefficients of 0.7903, 0.7868, and 0.7712 on the MICCAI Brain Tumor Segmentation (BraTS) 2020 dataset for the training, validation, and test cohorts respectively. We also propose a weakly supervised means of filtering the segmentations, removing a small subset of poorer segmentations to acquire a large subset of high quality segmentations. The proposed filtering further improves the Dice coefficients to up to 0.8374, 0.8232, and 0.8136 for training, validation, and test, respectively.
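The Dice coefficients reported above measure the overlap between a predicted segmentation and a reference segmentation. As a minimal sketch of how such a score is computed, assuming binary masks flattened to 0/1 sequences (the function name and the toy masks are illustrative and not taken from the paper):

```python
def dice_coefficient(pred, truth):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two binary masks given as flat 0/1 sequences."""
    if len(pred) != len(truth):
        raise ValueError("masks must contain the same number of voxels")
    # Voxels marked as foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    # Total foreground voxels across both masks.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Convention assumed here: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy masks for illustration only, not data from the paper.
predicted = [0, 1, 1, 1, 0, 0]
reference = [0, 0, 1, 1, 1, 0]
score = dice_coefficient(predicted, reference)  # 2 * 2 / (3 + 3)
print(f"Dice: {score:.4f}")
```

A score of 1 means the two masks coincide and 0 means they share no foreground voxels, so values near 0.8 indicate substantial but imperfect overlap.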
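The application of artificial intelligence (AI) techniques to medical imaging data has yielded promising results. As an important branch of AI pipelines in medical imaging, radiomics faces two major challenges: reproducibility and accessibility. In this work, we introduce open-radiomics, a set of radiomics datasets, and a comprehensive radiomics pipeline that investigates the effects of radiomics feature extraction settings, such as BINWIDTH and image normalization, on the reproducibility of radiomics results. To make radiomics research more accessible and reproducible, we provide guidelines for building machine learning (ML) models on radiomics data, introduce Open-radiomics, an evolving collection of open-source radiomics datasets, and publish baseline models for the datasets.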
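To illustrate why extraction settings such as bin width and image normalization can affect reproducibility, the sketch below discretizes the same toy intensities with two bin widths and applies a z-score normalization. It assumes the commonly used fixed-bin-width scheme (bin index = floor(x / width) - floor(min / width) + 1); the function names and values are hypothetical and are not the pipeline described in the abstract.

```python
import math


def discretize_fixed_bin_width(intensities, bin_width):
    """Map intensities to 1-based gray-level bins of a fixed width (assumed scheme)."""
    if bin_width <= 0:
        raise ValueError("bin_width must be positive")
    lowest_bin = math.floor(min(intensities) / bin_width)
    return [math.floor(x / bin_width) - lowest_bin + 1 for x in intensities]


def normalize_zscore(intensities):
    """Shift and scale intensities to zero mean and unit (population) standard deviation."""
    mean = sum(intensities) / len(intensities)
    std = math.sqrt(sum((x - mean) ** 2 for x in intensities) / len(intensities))
    if std == 0:
        raise ValueError("cannot normalize a constant image")
    return [(x - mean) / std for x in intensities]


# Toy intensities for illustration only.
toy_intensities = [12.0, 30.0, 47.5, 51.0, 99.0]
coarse = discretize_fixed_bin_width(toy_intensities, 25)  # [1, 2, 2, 3, 4]
fine = discretize_fixed_bin_width(toy_intensities, 10)    # [1, 3, 4, 5, 9]
normalized = normalize_zscore(toy_intensities)
print(coarse, fine)
```

The two bin widths yield different gray-level assignments for the same voxels, so any texture feature computed from the discretized image depends on this setting, which is the kind of sensitivity the pipeline is said to investigate.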
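Fluid flows are omnipresent in nature and in engineering disciplines. Reliable computation of fluid flows has been a long-standing challenge because of nonlinear interactions over multiple spatio-temporal scales. The compressible Navier-Stokes equations govern compressible flows and allow for complex phenomena such as turbulence and shocks. Despite tremendous progress in hardware and software, capturing the smallest length scales in fluid flows still incurs prohibitive computational cost for real-life applications. We are currently witnessing a paradigm shift towards machine learning-supported design of numerical schemes as a means of tackling this problem. While prior work has explored differentiable algorithms for one- or two-dimensional incompressible fluid flows, we present a fully differentiable three-dimensional framework for the computation of compressible fluid flows using high-order, state-of-the-art numerical methods. First, we demonstrate the efficiency of our solver by computing classical two- and three-dimensional test cases, including strong shocks and the transition to turbulence. Second, and more importantly, our framework allows for end-to-end optimization to improve existing numerical schemes inside computational fluid dynamics algorithms. In particular, we use neural networks to replace a conventional numerical flux function.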
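Brain tumor segmentation is a critical task for tumor volumetric analysis and AI algorithms. However, it is a time-consuming process that requires neuroradiology expertise. While there has been extensive research on optimizing brain tumor segmentation in the adult population, studies on AI-guided pediatric tumor segmentation are scarce. Furthermore, the MRI signal characteristics of pediatric and adult brain tumors differ, necessitating the development of segmentation algorithms designed specifically for pediatric brain tumors. We developed a segmentation model trained on magnetic resonance imaging (MRI) of pediatric low-grade gliomas (pLGGs) from a hospital in Toronto, Ontario, Canada. The proposed model utilizes deep multitask learning (DMTL) by adding a classifier of the tumor's genetic alteration as an auxiliary task to the main network, ultimately improving the accuracy of the segmentation results.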
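Background: Although convolutional neural networks (CNNs) achieve high diagnostic accuracy for detecting Alzheimer's disease (AD) dementia based on magnetic resonance imaging (MRI) scans, they are not yet applied in clinical routine. One important reason for this is a lack of model comprehensibility. Recently developed visualization methods for deriving CNN relevance maps may help to fill this gap. We investigated whether models with higher accuracy also rely more on discriminative brain regions predefined by prior knowledge. Methods: We trained a CNN for the detection of AD in N = 663 T1-weighted MRI scans of patients with dementia and amnestic mild cognitive impairment (MCI), and verified the accuracy of the models via cross-validation and in three independent samples including N = 1655 cases. We evaluated the association of relevance scores and hippocampus volume to validate the clinical utility of this approach. To improve model comprehensibility, we implemented an interactive visualization of 3D CNN relevance maps. Results: Across three independent datasets, group separation showed high accuracy for AD dementia versus controls (AUC $\geq$ 0.92) and moderate accuracy for MCI versus controls (AUC $\approx$ 0.75). Relevance maps indicated that hippocampal atrophy was considered the most informative factor for AD detection, with additional contributions from atrophy in other cortical and subcortical regions. Relevance scores within the hippocampus were highly correlated with hippocampus volumes (Pearson's r $\approx$ -0.86, p < 0.001). Conclusion: The relevance maps highlighted atrophy in regions that we had hypothesized a priori. This strengthens the comprehensibility of CNN models, which were trained in a purely data-driven manner based on the scans and diagnosis labels.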
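The association between relevance scores and hippocampus volume is summarized above with Pearson's r. As a minimal sketch of that statistic, assuming two equal-length lists of paired measurements (the function name and the toy values are illustrative and are not data from the study):

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation: covariance divided by the product of standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least two points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / (spread_x * spread_y)


# Toy values for illustration only: higher relevance paired with smaller volume.
relevance_scores = [0.9, 0.7, 0.6, 0.4, 0.2]
hippocampus_volumes = [2.1, 2.6, 2.9, 3.3, 3.8]
r = pearson_r(relevance_scores, hippocampus_volumes)
print(f"Pearson r: {r:.3f}")
```

A strongly negative r, as reported in the abstract, means that scans with smaller hippocampus volumes tend to receive higher relevance scores in that region.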
We introduce a new tool for stochastic convex optimization (SCO): a Reweighted Stochastic Query (ReSQue) estimator for the gradient of a function convolved with a (Gaussian) probability density. Combining ReSQue with recent advances in ball oracle acceleration [CJJJLST20, ACJJS21], we develop algorithms achieving state-of-the-art complexities for SCO in parallel and private settings. For a SCO objective constrained to the unit ball in $\mathbb{R}^d$, we obtain the following results (up to polylogarithmic factors). We give a parallel algorithm obtaining optimization error $\epsilon_{\text{opt}}$ with $d^{1/3}\epsilon_{\text{opt}}^{-2/3}$ gradient oracle query depth and $d^{1/3}\epsilon_{\text{opt}}^{-2/3} + \epsilon_{\text{opt}}^{-2}$ gradient queries in total, assuming access to a bounded-variance stochastic gradient estimator. For $\epsilon_{\text{opt}} \in [d^{-1}, d^{-1/4}]$, our algorithm matches the state-of-the-art oracle depth of [BJLLS19] while maintaining the optimal total work of stochastic gradient descent. We give an $(\epsilon_{\text{dp}}, \delta)$-differentially private algorithm which, given $n$ samples of Lipschitz loss functions, obtains near-optimal optimization error and makes $\min(n, n^2\epsilon_{\text{dp}}^2 d^{-1}) + \min(n^{4/3}\epsilon_{\text{dp}}^{1/3}, (nd)^{2/3}\epsilon_{\text{dp}}^{-1})$ queries to the gradients of these functions. In the regime $d \le n \epsilon_{\text{dp}}^{2}$, where privacy comes at no cost in terms of the optimal loss up to constants, our algorithm uses $n + (nd)^{2/3}\epsilon_{\text{dp}}^{-1}$ queries and improves recent advancements of [KLL21, AFKT21]. In the moderately low-dimensional setting $d \le \sqrt n \epsilon_{\text{dp}}^{3/2}$, our query complexity is near-linear.
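The abstract names the ReSQue estimator but does not spell out its form. As a hedged sketch, the standard importance-reweighting identity below shows how stochastic gradient queries drawn around a single reference point $\bar{x}$ can be reweighted into an unbiased estimate of the gradient of the Gaussian-smoothed objective at a nearby point $x$. The symbols $\gamma_\rho$, $\widehat{f}_\rho$, $\bar{x}$, $z$, and $g$ are notation assumed here, not taken from the abstract, and the paper's exact definition may differ.

```latex
% Assumed notation: \gamma_\rho is the density of N(0, \rho^2 I_d),
% and g(z) is a stochastic gradient with E[g(z) | z] = \nabla f(z).
\begin{align*}
\widehat{f}_\rho(x) &= (f * \gamma_\rho)(x) = \int f(z)\,\gamma_\rho(x - z)\,dz, \\
\nabla \widehat{f}_\rho(x) &= \int \nabla f(z)\,\gamma_\rho(x - z)\,dz \\
&= \int \frac{\gamma_\rho(x - z)}{\gamma_\rho(\bar{x} - z)}\,\nabla f(z)\,\gamma_\rho(\bar{x} - z)\,dz \\
&= \mathbb{E}_{z \sim \mathcal{N}(\bar{x},\,\rho^2 I_d)}
   \left[\frac{\gamma_\rho(x - z)}{\gamma_\rho(\bar{x} - z)}\,g(z)\right].
\end{align*}
```

Under this reading, one batch of samples $z$ drawn around $\bar{x}$ can be reused for many query points $x$, and the density ratio stays well behaved only while $x$ remains close to $\bar{x}$ relative to $\rho$.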
Machine learning models are typically evaluated by computing similarity with reference annotations and trained by maximizing similarity with such. Especially in the bio-medical domain, annotations are subjective and suffer from low inter- and intra-rater reliability. Since annotations only reflect the annotation entity's interpretation of the real world, this can lead to sub-optimal predictions even though the model achieves high similarity scores. Here, the theoretical concept of Peak Ground Truth (PGT) is introduced. PGT marks the point beyond which an increase in similarity with the reference annotation stops translating to better Real World Model Performance (RWMP). Additionally, a quantitative technique to approximate PGT by computing inter- and intra-rater reliability is proposed. Finally, three categories of PGT-aware strategies to evaluate and improve model performance are reviewed.
Three main points: 1. Data Science (DS) will be increasingly important to heliophysics; 2. Methods of heliophysics science discovery will continually evolve, requiring the use of learning technologies [e.g., machine learning (ML)] that are applied rigorously and that are capable of supporting discovery; and 3. To grow with the pace of data, technology, and workforce changes, heliophysics requires a new approach to the representation of knowledge.