Advances in reinforcement learning (RL) often rely on massive compute resources and remain notoriously sample inefficient. In contrast, the human brain is able to efficiently learn effective control strategies using limited resources. This raises the question of whether insights from neuroscience can be used to improve current RL methods. Predictive processing is a popular theoretical framework which maintains that the human brain is actively seeking to minimize surprise. We show that recurrent neural networks which predict their own sensory states can be leveraged to minimize surprise, yielding substantial gains in cumulative reward. Specifically, we present the Predictive Processing Proximal Policy Optimization (P4O) agent, an actor-critic reinforcement learning agent that applies predictive processing to a recurrent variant of the PPO algorithm by integrating a world model in its hidden state. P4O significantly outperforms a baseline recurrent variant of the PPO algorithm on multiple Atari games using a single GPU. It also outperforms other state-of-the-art agents given the same wall-clock time and exceeds human gamer performance on multiple games including Seaquest, which is a particularly challenging environment in the Atari domain. Altogether, our work underscores how insights from the field of neuroscience may support the development of more capable and efficient artificial agents.
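It is doubtful that animals have perfect inverse models of their limbs (e.g., which muscle contraction must be applied at every joint to reach a particular location in space). In robot control, however, moving an arm's end-effector to a target position or along a target trajectory requires accurate forward and inverse models. Here we show that by learning the transition (forward) model from interaction, we can use it to drive the learning of an amortized policy. We therefore revisit policy optimization in relation to the deep active inference framework and describe a modular neural network architecture that simultaneously learns the system dynamics from prediction errors and a stochastic policy that generates suitable continuous control commands to reach a desired reference position. We evaluate the model by comparing it against a linear quadratic regulator baseline, and conclude with additional steps to take toward human-like motor control.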
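Quasi-experimental research designs, such as regression discontinuity and interrupted time series, allow for causal inference in the absence of a randomized controlled trial, at the cost of additional assumptions. In this paper, we provide a framework for discontinuity-based designs using Bayesian model comparison and Gaussian process regression, which we refer to as "Bayesian nonparametric discontinuity design", or BNDD for short. BNDD addresses the two major shortcomings in most implementations of such designs: overconfidence due to implicit conditioning on the alleged effect, and model misspecification due to reliance on overly simplistic regression models. With an appropriate Gaussian process covariance function, our approach can detect discontinuities of any order, as well as discontinuities in spectral features. We demonstrate the use of BNDD in simulations, and apply the framework to determine the effect of running for political office on longevity, the effect of an alleged historical phantom border in the Netherlands on Dutch voting behaviour, and the effect of Kundalini Yoga meditation on heart rate.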
The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% of challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants (32%) stated that they did not have enough time for method development. 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants, and only 50% of the participants performed ensembling, based on either multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
Front-door adjustment is a classic technique to estimate causal effects from a specified directed acyclic graph (DAG) and observed data. The advantage of this approach is that it uses observed mediators to identify causal effects, which is possible even in the presence of unobserved confounding. While the statistical properties of front-door estimation are quite well understood, its algorithmic aspects remained unexplored for a long time. Recently, Jeong, Tian, and Bareinboim [NeurIPS 2022] presented the first polynomial-time algorithm for finding sets satisfying the front-door criterion in a given DAG, with an $O(n^3(n+m))$ run time, where $n$ denotes the number of variables and $m$ the number of edges of the graph. In our work, we give the first linear-time, i.e. $O(n+m)$, algorithm for this task, which thus reaches the asymptotically optimal time complexity, as the size of the input is $\Omega(n+m)$. We also provide an algorithm to enumerate all front-door adjustment sets in a given DAG with delay $O(n(n + m))$. These results improve the algorithms by Jeong et al. [2022] for the two tasks by a factor of $n^3$, respectively.
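For reference, the estimand that such a set feeds into can be written out. The block below is the standard front-door adjustment formula for discrete variables, stated under the assumption that a set $Z$ satisfies the front-door criterion relative to a treatment $X$ and an outcome $Y$; the symbols $X$, $Y$, $Z$ and the summation variable $x'$ are notation chosen here and do not appear in the abstract.

```latex
P(y \mid \mathrm{do}(x)) \;=\; \sum_{z} P(z \mid x) \sum_{x'} P(y \mid x', z)\, P(x')
```

Every term on the right-hand side is an observational quantity, which is why the effect can be identified even when $X$ and $Y$ share an unobserved confounder.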
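Unknown nonlinear dynamics often limit the tracking performance of feedforward control. The aim of this paper is to develop a feedforward control framework that can compensate for these unknown nonlinear dynamics using universal function approximators. The feedforward controller is parametrized as a parallel combination of a physics-based model and a neural network, where both share the same linear autoregressive (AR) dynamics. This parametrization allows for efficient output-error optimization through Sanathanan-Koerner (SK) iterations. Within each SK iteration, the output of the neural network is penalized in the subspace of the physics-based model through orthogonal projection-based regularization, such that the neural network captures only the unmodelled dynamics, resulting in interpretable models.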
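We present GAAF, a Generalised Automatic Anatomy Finder, for the identification of generic anatomical locations in 3D CT scans. GAAF is an end-to-end pipeline with dedicated modules for data pre-processing, model training, and inference. At its core, GAAF uses a custom convolutional neural network (CNN). The CNN model is small and lightweight, and can be adjusted to suit the particular application. So far, the GAAF framework has been tested in the head and neck, and is able to find anatomical locations such as the centre of mass of the brainstem. GAAF was evaluated on an open-access dataset and is capable of accurate and robust localisation performance. All our code is open source and available at https://github.com/rrr-uom-projects/gaaf.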
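Automatic segmentation of organs-at-risk (OARs) in CT scans using convolutional neural networks (CNNs) is being introduced into the radiotherapy workflow. However, these segmentations still require manual editing and approval by clinicians prior to clinical use, which can be time consuming. The aim of this work was to develop a tool to automatically identify errors in 3D OAR segmentations without a ground truth. Our tool uses a novel architecture combining a CNN and a graph neural network (GNN) to leverage the segmentation's appearance and shape. The proposed model is trained using a synthetically generated dataset of parotid gland segmentations with realistic contouring errors. The effectiveness of our model is assessed with ablation tests, which evaluate the efficacy of different portions of the architecture as well as the use of transfer learning from an unsupervised pretext task. Our best performing model predicted errors on the parotid gland with a precision of 85.0% and 89.7% for internal and external errors respectively, and a recall of 66.5% and 68.6%. This offline QA tool could be used in the clinical pathway, potentially decreasing the time clinicians spend correcting contours by detecting regions which require their attention. All our code is publicly available at https://github.com/rrr-uom-projects/contour_auto_qatool.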
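Deep learning models have shown great effectiveness in recognizing findings in medical images. However, they cannot handle the ever-changing clinical environment, which brings newly annotated medical data from different sources. To exploit the incoming streams of data, these models would benefit largely from sequentially learning from new samples without forgetting previously obtained knowledge. In this paper, we introduce a benchmark for continual disease classification on the MedMNIST collection by applying existing state-of-the-art continual learning methods. In particular, we consider three continual learning scenarios, namely task and class incremental learning and the newly defined cross-domain incremental learning. Task and class incremental learning of diseases address the issue of classifying new samples without retraining the models from scratch, while cross-domain incremental learning addresses the issue of dealing with datasets originating from different institutions while retaining the previously obtained knowledge. We perform a thorough analysis of the performance and examine how the well-known challenges of continual learning, such as catastrophic forgetting, exhibit themselves in this setting. The encouraging results demonstrate that continual learning has major potential to advance disease classification and to produce a more robust and efficient learning framework for clinical settings. The code repository, data partitions, and baseline results for the complete benchmark will be made publicly available.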
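Machine learning algorithms must be able to cope efficiently with massive data sets. Therefore, they have to scale well on any modern system and be able to exploit the computing power of accelerators independent of their vendor. In the field of supervised learning, Support Vector Machines (SVMs) are widely used. However, even modern and optimized implementations such as LIBSVM or ThunderSVM do not scale well for large, non-trivial, dense data sets on cutting-edge hardware: most SVM implementations are based on Sequential Minimal Optimization, an optimized but inherently sequential algorithm. Hence, they are not well suited for highly parallel GPUs. Furthermore, we are not aware of a performance-portable implementation that supports CPUs and GPUs from different vendors. We have developed the PLSSVM library to solve both issues. First, we resort to the formulation of the SVM as a least squares problem. Training an SVM then boils down to solving a system of linear equations, for which highly parallel algorithms are known. Second, we provide a hardware-independent yet efficient implementation: PLSSVM uses different interchangeable backends (OpenMP, CUDA, OpenCL, SYCL), supporting modern hardware from various vendors such as NVIDIA, AMD, or Intel on multiple GPUs. PLSSVM can be used as a drop-in replacement for LIBSVM. We observe speedups of up to 10 on CPUs and GPUs compared to LIBSVM and ThunderSVM. Our implementation scales on many-core CPUs, with a parallel speedup of 74.7 on up to 256 CPU threads, and on multiple GPUs, with a parallel speedup of 3.71 on four GPUs. The code, utility scripts, and documentation are all available on GitHub: https://github.com/sc-sgs/plssvm.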
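To illustrate how training a least squares SVM reduces to a single linear system, the following is a minimal NumPy sketch of the textbook LS-SVM formulation. It is not PLSSVM's implementation or API: the function names (`rbf_kernel`, `lssvm_fit`, `lssvm_predict`), the RBF kernel, the parameter values, and the toy data are all assumptions made for illustration, and a dense direct solve stands in for whatever parallel solver a real library would use.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    # Pairwise squared Euclidean distances between rows of A and rows of B.
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * sq)

def lssvm_fit(X, y, C=10.0, gamma=1.0):
    """Hypothetical helper: fit an LS-SVM by solving one linear system.

    The system (textbook LS-SVM form for labels in {-1, +1}) is
        [ 0   1^T     ] [ b     ]   [ 0 ]
        [ 1   K + I/C ] [ alpha ] = [ y ]
    """
    n = X.shape[0]
    K = rbf_kernel(X, X, gamma)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = K + np.eye(n) / C
    rhs = np.concatenate(([0.0], y))
    sol = np.linalg.solve(A, rhs)  # dense direct solve, for illustration only
    return sol[0], sol[1:]         # bias b, dual weights alpha

def lssvm_predict(X_train, b, alpha, X_new, gamma=1.0):
    # Decision function: sign(sum_i alpha_i * k(x, x_i) + b).
    return np.sign(rbf_kernel(X_new, X_train, gamma) @ alpha + b)

# Toy data (assumed): two well-separated Gaussian clusters.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2.0, 0.5, (20, 2)), rng.normal(2.0, 0.5, (20, 2))])
y = np.concatenate([-np.ones(20), np.ones(20)])

b, alpha = lssvm_fit(X, y)
pred = lssvm_predict(X, b, alpha, X)
print("training accuracy:", np.mean(pred == y))
```

Unlike Sequential Minimal Optimization, which updates a few dual variables at a time, everything here is dense linear algebra, which is the kind of workload that maps well onto GPUs.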