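Large deformations of organs, caused by diverse shapes and nonlinear shape changes, pose a significant challenge for medical image registration. Traditional registration methods iteratively optimize an objective function under a specific deformation model and require meticulous parameter tuning, yet they have limited capability on images with large deformations. Deep learning-based methods can learn the complex mapping from input images to their deformation fields, but they are regression-based and prone to getting stuck in local minima, particularly when large deformations are involved. To this end, we present Stochastic Planner-Actor-Critic (SPAC), a novel reinforcement learning framework that performs step-wise registration. The key idea is to warp the moving image successively at each time step so that it finally aligns with the fixed image. Because handling high-dimensional continuous action and state spaces is challenging in the conventional reinforcement learning (RL) framework, we introduce a new concept, the "plan", into the standard actor-critic model; the plan is low-dimensional and helps the actor generate a tractable high-dimensional action. The entire framework is trained without supervision and operates end to end. We evaluate our method on several 2D and 3D medical image datasets, some of which contain large deformations. Our empirical results show that our method achieves consistent, significant gains and outperforms state-of-the-art methods.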
translated by Google Translate
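Training a model-free deep reinforcement learning model to solve image-to-image translation is difficult because it involves high-dimensional continuous state and action spaces. In this paper, we draw inspiration from the recent success of the maximum entropy reinforcement learning framework, designed for challenging continuous control problems, to develop stochastic policies over high-dimensional continuous spaces that include image representation, generation, and control. Central to this method is the Stochastic Actor-Executor-Critic (SAEC), an off-policy actor-critic model with an additional executor that generates realistic images. Specifically, the actor focuses on the high-level representation and control policy through a stochastic latent action, and explicitly directs the executor to generate low-level actions that manipulate the state. Experiments on several image-to-image translation tasks demonstrate the effectiveness and robustness of the proposed SAEC on high-dimensional continuous space problems.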
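Motivation: Medical image analysis involves tasks that assist physicians in the qualitative and quantitative analysis of lesions or anatomical structures, significantly improving the accuracy and reliability of diagnosis and prognosis. Traditionally, these tasks are performed by physicians or medical physicists, which leads to two major problems: (i) low efficiency; (ii) bias from personal experience. Over the past decade, many machine learning methods have been applied to accelerate and automate the image analysis process. Compared with the extensive deployment of supervised and unsupervised learning models, attempts to use reinforcement learning in medical image analysis are scarce. This review article could serve as a stepping stone for related research. Significance: From our observation, although reinforcement learning has gradually gained momentum in recent years, many researchers in the medical analysis field find it hard to understand and to deploy in clinics. One cause is the lack of well-organized review articles aimed at readers without a professional computer science background. Rather than providing a comprehensive list of all reinforcement learning models in medical image analysis, this paper may help readers learn how to formulate and solve their medical image analysis research as reinforcement learning problems. Methods and results: We selected published articles from Google Scholar and PubMed. Considering the scarcity of related articles, we also included some outstanding recent preprints. The papers were carefully reviewed and categorized according to the type of image analysis task. We first review the basic concepts and popular models of reinforcement learning. We then explore the applications of reinforcement learning models in landmark detection. Finally, we conclude by discussing the limitations of the reviewed reinforcement learning approaches and possible improvements.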
Model-free deep reinforcement learning (RL) algorithms have been demonstrated on a range of challenging decision making and control tasks. However, these methods typically suffer from two major challenges: very high sample complexity and brittle convergence properties, which necessitate meticulous hyperparameter tuning. Both of these challenges severely limit the applicability of such methods to complex, real-world domains. In this paper, we propose soft actor-critic, an off-policy actor-critic deep RL algorithm based on the maximum entropy reinforcement learning framework. In this framework, the actor aims to maximize expected reward while also maximizing entropy, that is, to succeed at the task while acting as randomly as possible. Prior deep RL methods based on this framework have been formulated as Q-learning methods. By combining off-policy updates with a stable stochastic actor-critic formulation, our method achieves state-of-the-art performance on a range of continuous control benchmark tasks, outperforming prior on-policy and off-policy methods. Furthermore, we demonstrate that, in contrast to other off-policy algorithms, our approach is very stable, achieving very similar performance across different random seeds.
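The trade-off between expected reward and entropy can be illustrated on a toy problem. The sketch below is not the soft actor-critic algorithm itself, which targets continuous actions with neural networks; it assumes a single state with three discrete actions, hypothetical Q-values, and a hypothetical temperature `alpha`. It shows that the policy proportional to exp(Q / alpha) attains a higher entropy-regularized value than either a greedy or a uniform policy.

```python
import math

def softmax_policy(q_values, alpha):
    """Policy proportional to exp(Q / alpha), computed in a numerically stable way."""
    m = max(q_values)
    weights = [math.exp((q - m) / alpha) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]

def entropy_regularized_value(q_values, policy, alpha):
    """Expected Q-value plus alpha times the entropy of the policy."""
    expected_q = sum(p * q for p, q in zip(policy, q_values))
    entropy = -sum(p * math.log(p) for p in policy if p > 0.0)
    return expected_q + alpha * entropy

# Hypothetical Q-values for one state and a hypothetical temperature.
q_values = [1.0, 2.0, 0.5]
alpha = 0.5

greedy = [0.0, 1.0, 0.0]          # always picks the best action, zero entropy
uniform = [1.0 / 3.0] * 3         # maximum entropy, ignores the Q-values
soft = softmax_policy(q_values, alpha)

v_greedy = entropy_regularized_value(q_values, greedy, alpha)
v_uniform = entropy_regularized_value(q_values, uniform, alpha)
v_soft = entropy_regularized_value(q_values, soft, alpha)

# Closed form of the optimum: alpha * log(sum(exp(Q / alpha))).
v_closed_form = alpha * math.log(sum(math.exp(q / alpha) for q in q_values))
```

Larger values of `alpha` push the soft policy toward uniform, and smaller values push it toward greedy.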
Deformable image registration, i.e., the task of aligning multiple images into one coordinate system by non-linear transformation, serves as an essential preprocessing step for neuroimaging data. Recent research on deformable image registration is mainly focused on improving the registration accuracy using multi-stage alignment methods, where the source image is repeatedly deformed in stages by the same neural network until it is well-aligned with the target image. Conventional methods for multi-stage registration can often blur the source image, as the pixel/voxel values are repeatedly interpolated from the image generated by the previous stage. However, maintaining image quality such as sharpness during image registration is crucial to medical data analysis. In this paper, we study the problem of anti-blur deformable image registration and propose a novel solution, called Anti-Blur Network (ABN), for multi-stage image registration. Specifically, we use a pair of short-term registration and long-term memory networks to learn the nonlinear deformations at each stage, where the short-term registration network learns how to improve the registration accuracy incrementally and the long-term memory network combines all the previous deformations so that interpolation can be performed directly on the raw image, preserving image sharpness. Extensive experiments on both natural and medical image datasets demonstrate that ABN can accurately register images while preserving their sharpness. Our code and data can be found at https://github.com/anonymous3214/ABN
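The blurring effect of repeated interpolation, and why combining deformations avoids it, can be seen in one dimension. The sketch below is not the ABN implementation, where the combination is learned by a network. It assumes a 1D signal, linear interpolation, and two hypothetical constant displacement fields, and all function names are illustrative. Warping twice interpolates an already interpolated image, while composing the two displacements first interpolates the raw image only once.

```python
import math

def sample(signal, x):
    """Linearly interpolate `signal` at real-valued position x, clamped to the borders."""
    n = len(signal)
    x = min(max(x, 0.0), n - 1.0)
    i0 = int(math.floor(x))
    i1 = min(i0 + 1, n - 1)
    t = x - i0
    return (1.0 - t) * signal[i0] + t * signal[i1]

def warp(image, disp):
    """Warped image: out[i] = image(i + disp[i])."""
    return [sample(image, i + disp[i]) for i in range(len(image))]

def compose(disp_first, disp_second):
    """Single displacement equivalent to warping by disp_first, then by disp_second."""
    return [disp_second[i] + sample(disp_first, i + disp_second[i])
            for i in range(len(disp_second))]

raw = [0.0] * 4 + [1.0] * 4        # a sharp step edge
u1 = [0.5] * 8                     # hypothetical stage-1 displacement
u2 = [0.5] * 8                     # hypothetical stage-2 displacement

two_stage = warp(warp(raw, u1), u2)     # interpolates twice
one_shot = warp(raw, compose(u1, u2))   # interpolates the raw image once

print(two_stage)  # [0.0, 0.0, 0.25, 0.75, 1.0, 1.0, 1.0, 1.0]
print(one_shot)   # [0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0]
```

The two-stage result smears the edge over intermediate values, whereas the composed warp keeps it sharp.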
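Soft Actor-Critic (SAC) is one of the state-of-the-art off-policy reinforcement learning (RL) algorithms and belongs to the maximum entropy RL framework. SAC has been shown to perform very well on a range of continuous control tasks, with good stability and robustness. SAC learns a stochastic Gaussian policy that maximizes a trade-off between the expected reward and the policy entropy. To update the policy, SAC minimizes the KL divergence between the current policy density and the soft value function density. The reparameterization trick is then used to obtain an approximate gradient of this divergence. In this paper, we propose Soft Actor-Critic with Cross-Entropy Policy Optimization (SAC-CEPO), which uses the Cross-Entropy Method (CEM) to optimize the policy network of SAC. The initial idea is to use CEM to iteratively sample the distribution closest to the soft value function density and to use the resulting distribution as the target for updating the policy network. To reduce the computational complexity, we also introduce a decoupled policy structure that splits the Gaussian policy into one policy that learns the mean and another that learns the deviation, so that only the mean policy is trained by CEM. We show that this decoupled policy structure does converge to an optimum, and we demonstrate experimentally that SAC-CEPO achieves performance competitive with the original SAC.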
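As a reminder of how the Cross-Entropy Method works in general, here is a minimal sketch on a one-dimensional toy objective. It is not SAC-CEPO. The quadratic `toy_objective` merely stands in for a quantity such as the soft value function, and the population size, elite count, and iteration count are arbitrary assumptions. Each iteration samples from a Gaussian, keeps the best samples, and refits the Gaussian to them.

```python
import random
import statistics

def cross_entropy_method(objective, mean, std, iterations=30,
                         population=50, n_elite=10, seed=0):
    """Generic CEM: sample, keep the elites, refit the Gaussian to the elites."""
    rng = random.Random(seed)
    for _ in range(iterations):
        samples = [rng.gauss(mean, std) for _ in range(population)]
        # Elites are the samples with the highest objective value.
        elites = sorted(samples, key=objective, reverse=True)[:n_elite]
        mean = statistics.mean(elites)
        # A small floor keeps the standard deviation strictly positive.
        std = statistics.pstdev(elites) + 1e-6
    return mean, std

def toy_objective(x):
    """Hypothetical objective with its maximum at x = 3."""
    return -(x - 3.0) ** 2

best_mean, best_std = cross_entropy_method(toy_objective, mean=0.0, std=3.0)
```

After the loop, `best_mean` should lie close to the maximizer and `best_std` should have shrunk well below its initial value.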
We present VoxelMorph, a fast learning-based framework for deformable, pairwise medical image registration. Traditional registration methods optimize an objective function for each pair of images, which can be time-consuming for large datasets or rich deformation models. In contrast to this approach, and building on recent learning-based methods, we formulate registration as a function that maps an input image pair to a deformation field that aligns these images. We parameterize the function via a convolutional neural network (CNN), and optimize the parameters of the neural network on a set of images. Given a new pair of scans, VoxelMorph rapidly computes a deformation field by directly evaluating the function. In this work, we explore two different training strategies. In the first (unsupervised) setting, we train the model to maximize standard image matching objective functions that are based on the image intensities. In the second setting, we leverage auxiliary segmentations available in the training data. We demonstrate that the unsupervised model's accuracy is comparable to state-of-the-art methods, while operating orders of magnitude faster. We also show that VoxelMorph trained with auxiliary data improves registration accuracy at test time, and evaluate the effect of training set size on registration. Our method promises to speed up medical image analysis and processing pipelines, while facilitating novel directions in learning-based registration and its applications. Our code is freely available at http://voxelmorph.csail.mit.edu.
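Adopting reasonable strategies is challenging but crucial for an intelligent agent with limited resources working in hazardous, unstructured, and dynamic environments, in order to improve system utility, decrease overall cost, and increase the probability of mission success. Deep reinforcement learning (DRL) helps organize agents' behaviors and actions based on their state and represents complex strategies (compositions of actions). This paper proposes a novel hierarchical strategy decomposition approach based on Bayesian chaining that separates an intricate policy into several simple sub-policies and organizes their relationships as a Bayesian strategy network (BSN). We integrate this approach into a state-of-the-art DRL method, soft actor-critic (SAC), and build the corresponding Bayesian soft actor-critic (BSAC) model by organizing several sub-policies as a joint policy. We compare the proposed BSAC method with SAC and other state-of-the-art approaches such as TD3, DDPG, and PPO on the standard continuous control benchmarks Hopper-v2, Walker2d-v2, and Humanoid-v2 in MuJoCo with the OpenAI Gym environment. The results demonstrate the promising potential of the BSAC method to significantly improve training efficiency. The open-source code for BSAC can be accessed at https://github.com/herolab-uga/bsac.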
How to learn an effective reinforcement learning-based model for control tasks from high-level visual observations is a practical and challenging problem. A key to solving this problem is to learn low-dimensional state representations from observations, from which an effective policy can be learned. In order to boost the learning of state encoding, recent works are focused on capturing behavioral similarities between state representations or applying data augmentation on visual observations. In this paper, we propose a novel meta-learner-based framework for representation learning regarding behavioral similarities for reinforcement learning. Specifically, our framework encodes the high-dimensional observations into two decomposed embeddings regarding reward and dynamics in a Markov Decision Process (MDP). A pair of meta-learners are developed, one of which quantifies the reward similarity and the other quantifies dynamics similarity over the correspondingly decomposed embeddings. The meta-learners are self-learned to update the state embeddings by approximating two disjoint terms in on-policy bisimulation metric. To incorporate the reward and dynamics terms, we further develop a strategy to adaptively balance their impacts based on different tasks or environments. We empirically demonstrate that our proposed framework outperforms state-of-the-art baselines on several benchmarks, including conventional DM Control Suite, Distracting DM Control Suite and a self-driving task CARLA.
Brain extraction and registration are important preprocessing steps in neuroimaging data analysis, where the goal is to extract the brain regions from MRI scans (i.e., extraction step) and align them with a target brain image (i.e., registration step). Conventional research mainly focuses on developing methods for the extraction and registration tasks separately under supervised settings. The performance of these methods depends heavily on the number of training samples and on visual inspections performed by experts for error correction. However, in many medical studies, collecting voxel-level labels and conducting manual quality control in high-dimensional neuroimages (e.g., 3D MRI) are very expensive and time-consuming. Moreover, brain extraction and registration are highly related tasks in neuroimaging data and should be solved collectively. In this paper, we study the problem of unsupervised collective extraction and registration in neuroimaging data. We propose a unified end-to-end framework, called ERNet (Extraction-Registration Network), to jointly optimize the extraction and registration tasks, allowing feedback between them. Specifically, we use a pair of multi-stage extraction and registration modules to learn the extraction mask and transformation, where the extraction network improves the extraction accuracy incrementally and the registration network successively warps the extracted image until it is well-aligned with the target image. Experimental results on real-world datasets show that our proposed method can effectively improve the performance on extraction and registration tasks in neuroimaging data. Our code and data can be found at https://github.com/ERNetERNet/ERNet
Reinforcement learning (RL) has gained considerable attention by creating decision-making agents that maximize rewards received from fully observable environments. However, many real-world problems are partially or noisily observable by nature, where agents do not receive the true and complete state of the environment. Such problems are formulated as partially observable Markov decision processes (POMDPs). Some studies applied RL to POMDPs by recalling previous decisions and observations or inferring the true state of the environment from received observations. Nevertheless, aggregating observations and decisions over time is impractical for environments with high-dimensional continuous state and action spaces. Moreover, so-called inference-based RL approaches require a large number of samples to perform well, since agents eschew uncertainty in the inferred state during decision-making. Active inference is a framework that is naturally formulated in POMDPs and directs agents to select decisions by minimising expected free energy (EFE). This complements the reward-maximising (exploitative) behaviour of RL with information-seeking (exploratory) behaviour. Despite this exploratory behaviour of active inference, its usage is limited to discrete state and action spaces due to the computational difficulty of the EFE. We propose a unified principle for joint information-seeking and reward maximization that clarifies a theoretical connection between active inference and RL, unifies active inference and RL, and overcomes their aforementioned limitations. Our findings are supported by strong theoretical analysis. The proposed framework's superior exploration property is also validated by experimental results on partially observable tasks with high-dimensional continuous state and action spaces. Moreover, the results show that our model solves reward-free problems, making task reward design optional.
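Image registration is widely used in medical image analysis to provide spatial correspondences between two images. Learning-based methods that use convolutional neural networks (CNNs) have recently been proposed to solve image registration problems. Learning-based methods tend to be much faster than traditional optimization-based methods, but the accuracy gains obtained from complex CNN-based methods are modest. Here we introduce a new deep neural network-based image registration framework, named MIRNF, which represents the correspondence mapping as a continuous function implemented via neural fields. Given a 3D coordinate as input, MIRNF outputs either a deformation vector or a velocity vector. To ensure that the mapping is diffeomorphic, the velocity vector output of MIRNF is integrated with a neural ODE solver to derive the correspondences between the two images. In addition, we propose a hybrid coordinate sampler together with a cascaded architecture to achieve high-similarity mapping performance and low-distortion deformation fields. We conduct experiments on two 3D MR brain scan datasets, which show that our proposed framework provides state-of-the-art registration performance while maintaining comparable optimization time.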
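Why integrating a velocity field helps keep a mapping invertible can be sketched in one dimension. The example below is not MIRNF. It assumes a fixed analytic velocity field in place of a neural field and plain Euler steps in place of a neural ODE solver, and the function names are hypothetical. With small steps the flow preserves the ordering of points, whereas a large displacement applied directly can fold the grid.

```python
import math

def velocity(x):
    """Hypothetical smooth, stationary 1D velocity field."""
    return 0.5 * math.sin(x)

def integrate_flow(x0, n_steps=100, t_end=1.0):
    """Follow dx/dt = velocity(x) from t = 0 to t_end with explicit Euler steps."""
    h = t_end / n_steps
    x = x0
    for _ in range(n_steps):
        x += h * velocity(x)
    return x

def is_strictly_increasing(values):
    return all(b > a for a, b in zip(values, values[1:]))

starts = [i * 0.1 for i in range(64)]

# Mapping obtained by integrating the velocity field: ordering is preserved.
flowed = [integrate_flow(x) for x in starts]

# Mapping obtained by adding a large displacement directly: ordering can break.
displaced = [x - 2.0 * math.sin(x) for x in starts]

print(is_strictly_increasing(flowed))     # True
print(is_strictly_increasing(displaced))  # False
```

In 1D, a strictly increasing mapping has no folds; the higher-dimensional analogue is a positive Jacobian determinant everywhere.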
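Deformable image registration is fundamental to many medical image analyses. A key obstacle to accurate image registration lies in image appearance variations, such as variations in texture, intensity, and noise. These variations are readily apparent in medical images, especially in brain images, where registration is frequently used. Recently, deep learning-based registration methods (DLRs), which use deep neural networks, have shown computational efficiency several orders of magnitude faster than traditional optimization-based registration methods (ORs). DLRs rely on a globally optimized network that is trained with a set of training samples to achieve faster registration. However, DLRs tend to disregard the target-pair-specific optimization inherent in ORs and therefore adapt less well to variations in test samples. This limitation is severe when registering medical images with large appearance variations, especially since few existing DLRs explicitly take appearance variations into account. In this study, we propose an Appearance Adjustment Network (AAN) to enhance the adaptability of DLRs to appearance variations. When integrated into a DLR, our AAN provides appearance transformations that reduce appearance variations during registration. In addition, we propose an anatomy-constrained loss function through which our AAN generates anatomy-preserving transformations. Our AAN is purposely designed to be readily inserted into a wide range of DLRs and can be trained cooperatively in an unsupervised, end-to-end manner. We evaluated our AAN with three state-of-the-art DLRs on three public datasets of 3D brain magnetic resonance imaging (MRI). The results show that our AAN consistently improves existing DLRs and outperforms state-of-the-art ORs in registration accuracy, while adding only a fractional computational load to existing DLRs.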
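Deformable image registration can achieve fast and accurate alignment between a pair of images and thus plays an important role in many medical image studies. Current deep learning (DL)-based image registration approaches use a convolutional neural network to learn the spatial transformation from one image to another directly, requiring either ground truth or a similarity metric. However, these methods use only a global similarity energy function to evaluate the similarity of a pair of images, which ignores the similarity of regions of interest (ROIs) within the images. Moreover, DL-based methods usually estimate the global spatial transformation of an image directly and never attend to the regional spatial transformations of ROIs within the images. In this paper, we present a novel dual-stream transformation network with a region consistency constraint, which maximizes the similarity of ROIs within a pair of images and estimates global and regional spatial transformations simultaneously. Experiments on four public 3D MRI datasets show that the proposed method achieves the best registration performance in accuracy and generalization compared with other state-of-the-art methods.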
Hierarchical Reinforcement Learning (HRL) algorithms have been demonstrated to perform well on high-dimensional decision making and robotic control tasks. However, because they solely optimize for rewards, the agent tends to search the same space redundantly. This problem reduces the speed of learning and the reward achieved. In this work, we present an off-policy HRL algorithm that maximizes entropy for efficient exploration. The algorithm learns a temporally abstracted low-level policy and is able to explore broadly through the addition of entropy to the high level. The novelty of this work is the theoretical motivation of adding entropy to the RL objective in the HRL setting. We empirically show that the entropy can be added to both levels if the Kullback-Leibler (KL) divergence between consecutive updates of the low-level policy is sufficiently small. We performed an ablation study to analyze the effects of entropy on the hierarchy, in which adding entropy to the high level emerged as the most desirable configuration. Furthermore, a higher temperature in the low level leads to Q-value overestimation and increases the stochasticity of the environment that the high level operates on, making learning more challenging. Our method, SHIRO, surpasses state-of-the-art performance on a range of simulated robotic control benchmark tasks and requires minimal tuning.
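Longitudinal image registration is challenging and has not yet benefited from major performance improvements through deep learning. Inspired by Deep Image Prior, this paper introduces a different use of deep architectures, as regularizers, to tackle the image registration problem. We propose a subject-specific deformable registration method called MIRRBA, which relies on a deep pyramidal architecture as the prior parametric model constraining the deformation field. MIRRBA does not require a learning database; it needs only the pair of images to be registered in order to optimize the network parameters and provide a deformation field. We demonstrate the regularizing power of deep architectures and present new elements for understanding the role of the architecture in deep learning methods for registration. To study the impact of the network parameters, we ran our method with different architectural configurations on a private dataset of 110 metastatic breast cancer whole-body PET images with manual segmentations of the brain, the bladder, and metastatic lesions. We compared it against conventional iterative registration approaches and supervised deep learning-based models. Global and local registration accuracy were evaluated using the detection rate and the Dice score, while registration realism was evaluated using the Jacobian determinant. In addition, we measured the ability of the different methods to shrink vanishing lesions using the disappearing rate. MIRRBA significantly improves on the organ and lesion Dice scores of the supervised models. Regarding the disappearing rate, MIRRBA exceeds the score of the best-performing conventional method, SyNCC, by a multiple. Our work therefore proposes an alternative way to bridge the performance gap between conventional and deep learning-based methods and demonstrates the regularizing power of deep architectures.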
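The Jacobian determinant used to judge registration realism can be computed from a displacement field with finite differences. The sketch below is a generic 2D illustration under stated assumptions, not the evaluation code behind the abstract. It uses forward differences on a unit-spaced grid, stores the field as nested lists indexed `[y][x]`, and uses hypothetical function names. A non-positive determinant marks a location where the mapping folds.

```python
def jacobian_determinants(ux, uy):
    """Forward-difference Jacobian determinant of (x, y) -> (x + ux, y + uy)."""
    height, width = len(ux), len(ux[0])
    dets = []
    for y in range(height - 1):
        for x in range(width - 1):
            dux_dx = ux[y][x + 1] - ux[y][x]
            dux_dy = ux[y + 1][x] - ux[y][x]
            duy_dx = uy[y][x + 1] - uy[y][x]
            duy_dy = uy[y + 1][x] - uy[y][x]
            dets.append((1.0 + dux_dx) * (1.0 + duy_dy) - dux_dy * duy_dx)
    return dets

def fold_fraction(ux, uy):
    """Fraction of grid locations whose Jacobian determinant is not positive."""
    dets = jacobian_determinants(ux, uy)
    return sum(1 for d in dets if d <= 0.0) / len(dets)

size = 4
zeros = [[0.0] * size for _ in range(size)]

# Identity mapping: zero displacement, determinant 1 everywhere.
identity_fraction = fold_fraction(zeros, zeros)

# Mirror mapping x -> -x: displacement -2x, determinant -1 everywhere.
mirror_ux = [[-2.0 * x for x in range(size)] for _ in range(size)]
mirror_fraction = fold_fraction(mirror_ux, zeros)

print(identity_fraction)  # 0.0
print(mirror_fraction)    # 1.0
```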
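Quantifying the uncertainty of deep neural network (DNN)-based image registration algorithms plays a critical role in deploying image registration algorithms, including in research-oriented processing pipelines. Currently available uncertainty estimation methods for DNN-based image registration algorithms may lead to sub-optimal clinical decisions, because potentially inaccurate estimates of registration uncertainty stem from the assumed parametric distribution of the registration latent space. We introduce NPBDREG, a fully non-parametric Bayesian framework for uncertainty estimation in DNN-based deformable image registration, which combines the Adam optimizer with stochastic gradient Langevin dynamics (SGLD) to characterize the underlying posterior distribution through posterior sampling. It therefore has the potential to provide uncertainty estimates that are highly correlated with the presence of out-of-distribution data. We demonstrate the added value of NPBDREG, compared with the baseline probabilistic VoxelMorph model (PrVXM), on brain MRI image registration using $390$ image pairs from four publicly available databases: MGH10, CMUC12, ISBR18, and LPBA40. NPBDREG shows a better correlation of the predicted uncertainty with out-of-distribution data ($r > 0.95$ vs. $r < 0.5$), a 7.3% improvement in registration accuracy (Dice score, $0.74$ vs. $0.69$, $p \ll 0.01$), and an 18% improvement in registration smoothness (percentage of folds in the deformation field, $0.014$ vs. $0.017$, $p \ll 0.01$). Finally, compared with the baseline PrVXM approach, NPBDREG demonstrates better generalization to data corrupted by mixed-structure noise (Dice score, $0.73$ vs. $0.69$, $p \ll 0.01$).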
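For readers unfamiliar with the metrics quoted above, the sketch below shows how a Dice score is computed on binary masks and how a relative improvement follows from two reported values. It is an illustration with hypothetical masks and function names, not the evaluation code of the paper. Computed from the rounded figures in the abstract, the relative changes come out at about 7.2% and 17.6%, which agrees with the reported 7.3% and 18% up to rounding of the inputs.

```python
def dice_score(mask_a, mask_b):
    """Dice overlap of two binary masks given as flat sequences of 0/1."""
    intersection = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    size_a = sum(1 for a in mask_a if a)
    size_b = sum(1 for b in mask_b if b)
    if size_a + size_b == 0:
        return 1.0  # two empty masks are treated as a perfect match
    return 2.0 * intersection / (size_a + size_b)

def relative_change(new, baseline):
    """Change of `new` relative to `baseline`, as a fraction."""
    return (new - baseline) / baseline

# Hypothetical masks: one overlapping element, two foreground elements each.
example_dice = dice_score([1, 1, 0, 0], [1, 0, 1, 0])

# Relative changes recomputed from the rounded values quoted in the abstract.
dice_gain = relative_change(0.74, 0.69)         # about 0.072
fold_reduction = -relative_change(0.014, 0.017)  # about 0.176

print(example_dice)  # 0.5
```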