Partially observable Markov decision processes (POMDPs) provide a flexible representation for real-world decision and control problems. However, POMDPs are notoriously difficult to solve, especially when the state and observation spaces are continuous or hybrid, which is often the case for physical systems. While recent online sampling-based POMDP algorithms that plan with observation likelihood weighting have shown practical effectiveness, a general theory characterizing the approximation error of the particle filtering techniques that these algorithms use has not previously been proposed. Our main contribution is bounding the error between any POMDP and its corresponding finite sample particle belief MDP (PB-MDP) approximation. This fundamental bridge between PB-MDPs and POMDPs allows us to adapt any sampling-based MDP algorithm to a POMDP by solving the corresponding particle belief MDP, thereby extending the convergence guarantees of the MDP algorithm to the POMDP. Practically, this is implemented by using the particle filter belief transition model as the generative model for the MDP solver. While this requires access to the observation density model from the POMDP, it only increases the transition sampling complexity of the MDP solver by a factor of $\mathcal{O}(C)$, where $C$ is the number of particles. Thus, when combined with sparse sampling MDP algorithms, this approach can yield algorithms for POMDPs that have no direct theoretical dependence on the size of the state and observation spaces. In addition to our theoretical contribution, we perform five numerical experiments on benchmark POMDPs to demonstrate that a simple MDP algorithm adapted using PB-MDP approximation, Sparse-PFT, achieves performance competitive with other leading continuous observation POMDP solvers.
translated by 谷歌翻译
Science tests competing theories or models by evaluating the similarity of their predictions against observational experience. Thus, how we measure similarity fundamentally determines what we learn. In machine learning and scientific modeling, similarity metrics are used as objective functions. A classic example being mean squared error, which is the optimal measure of similarity when errors are normally distributed and independent and identically distributed (iid). In many cases, however, the error distribution is neither normal nor iid, so it is left to the scientist to determine an appropriate objective. Here, we review how information theory can guide that selection, then demonstrate the approach with a simple hydrologic model.
translated by 谷歌翻译
As Artificial and Robotic Systems are increasingly deployed and relied upon for real-world applications, it is important that they exhibit the ability to continually learn and adapt in dynamically-changing environments, becoming Lifelong Learning Machines. Continual/lifelong learning (LL) involves minimizing catastrophic forgetting of old tasks while maximizing a model's capability to learn new tasks. This paper addresses the challenging lifelong reinforcement learning (L2RL) setting. Pushing the state-of-the-art forward in L2RL and making L2RL useful for practical applications requires more than developing individual L2RL algorithms; it requires making progress at the systems-level, especially research into the non-trivial problem of how to integrate multiple L2RL algorithms into a common framework. In this paper, we introduce the Lifelong Reinforcement Learning Components Framework (L2RLCF), which standardizes L2RL systems and assimilates different continual learning components (each addressing different aspects of the lifelong learning problem) into a unified system. As an instantiation of L2RLCF, we develop a standard API allowing easy integration of novel lifelong learning components. We describe a case study that demonstrates how multiple independently-developed LL components can be integrated into a single realized system. We also introduce an evaluation environment in order to measure the effect of combining various system components. Our evaluation environment employs different LL scenarios (sequences of tasks) consisting of Starcraft-2 minigames and allows for the fair, comprehensive, and quantitative comparison of different combinations of components within a challenging common evaluation environment.
translated by 谷歌翻译
The SNMMI Artificial Intelligence (SNMMI-AI) Summit, organized by the SNMMI AI Task Force, took place in Bethesda, MD on March 21-22, 2022. It brought together various community members and stakeholders from academia, healthcare, industry, patient representatives, and government (NIH, FDA), and considered various key themes to envision and facilitate a bright future for routine, trustworthy use of AI in nuclear medicine. In what follows, essential issues, challenges, controversies and findings emphasized in the meeting are summarized.
translated by 谷歌翻译
ICECUBE是一种用于检测1 GEV和1 PEV之间大气和天体中微子的光学传感器的立方公斤阵列,该阵列已部署1.45 km至2.45 km的南极的冰盖表面以下1.45 km至2.45 km。来自ICE探测器的事件的分类和重建在ICeCube数据分析中起着核心作用。重建和分类事件是一个挑战,这是由于探测器的几何形状,不均匀的散射和冰中光的吸收,并且低于100 GEV的光,每个事件产生的信号光子数量相对较少。为了应对这一挑战,可以将ICECUBE事件表示为点云图形,并将图形神经网络(GNN)作为分类和重建方法。 GNN能够将中微子事件与宇宙射线背景区分开,对不同的中微子事件类型进行分类,并重建沉积的能量,方向和相互作用顶点。基于仿真,我们提供了1-100 GEV能量范围的比较与当前ICECUBE分析中使用的当前最新最大似然技术,包括已知系统不确定性的影响。对于中微子事件分类,与当前的IceCube方法相比,GNN以固定的假阳性速率(FPR)提高了信号效率的18%。另外,GNN在固定信号效率下将FPR的降低超过8(低于半百分比)。对于能源,方向和相互作用顶点的重建,与当前最大似然技术相比,分辨率平均提高了13%-20%。当在GPU上运行时,GNN能够以几乎是2.7 kHz的中位数ICECUBE触发速率的速率处理ICECUBE事件,这打开了在在线搜索瞬态事件中使用低能量中微子的可能性。
translated by 谷歌翻译
自我监督的学习(SSL)方法正在实现越来越多的深度学习模型,可以在难以获得标签的域中的图像数据集上进行培训。但是,这些方法难以扩展到医学成像数据集的高分辨率,在这些数据集中,它们对于在标签 - 筛选医学图像数据集上良好的概括至关重要。在这项工作中,我们提出了组织病理学数据集体(HDGAN)框架,该框架是图像生成和分割的数据集团半监督框架的扩展,可很好地扩展到大分辨率的组织病理学图像。我们从原始框架中进行了几个改编,包括更新生成骨干,从发电机中选择性提取潜在功能以及切换到内存映射数组。这些变化减少了框架的记忆消耗,改善了其对医学成像域的适用性。我们在血栓形成微型病变高分辨率瓷砖数据集上评估HDGAN,这表明高分辨率的图像通量生成任务的性能很强。我们希望这项工作能够在医学成像域中更多地探索对医学成像域中的自我监管框架的更多探索,从而使更多深度学习模型在医学数据集中进行更多应用。
translated by 谷歌翻译
肺癌是全球癌症死亡的主要原因,肺腺癌是最普遍的肺癌形式。 EGFR阳性肺腺癌已被证明对TKI治疗的反应率很高,这是肺癌分子测试的基本性质。尽管目前的指南考虑必要测试,但很大一部分患者并未常规化,导致数百万的人未接受最佳治疗肺癌。测序是EGFR突变分子测试的黄金标准,但是结果可能需要数周的时间才能回来,这在时间限制的情况下并不理想。能够快速,便宜地检测EGFR突变的替代筛查工具的开发,同时保存组织以进行测序可以帮助减少受比较治疗的患者的数量。我们提出了一种多模式方法,该方法将病理图像和临床变量整合在一起,以预测EGFR突变状态,迄今为止最大的临床队列中的AUC为84%。这样的计算模型可以以很少的额外成本进行大部分部署。它的临床应用可以减少中国接受亚最佳治疗的患者数量53.1%,在美国将高达96.6%的患者减少96.6%。
translated by 谷歌翻译
语言模型既展示了定量的改进,又展示了新的定性功能,随着规模的增加。尽管它们具有潜在的变革性影响,但这些新能力的特征却很差。为了为未来的研究提供信息,为破坏性的新模型能力做准备,并改善社会有害的效果,至关重要的是,我们必须了解目前和近乎未来的能力和语言模型的局限性。为了应对这一挑战,我们介绍了超越模仿游戏基准(Big Bench)。 Big Bench目前由204个任务组成,由132家机构的442位作者贡献。任务主题是多样的,从语言学,儿童发展,数学,常识性推理,生物学,物理学,社会偏见,软件开发等等。 Big-Bench专注于被认为超出当前语言模型的功能的任务。我们评估了OpenAI的GPT型号,Google内部密集变压器体系结构和大型基础上的开关稀疏变压器的行为,跨越了数百万到数十亿个参数。此外,一个人类专家评估者团队执行了所有任务,以提供强大的基准。研究结果包括:模型性能和校准都随规模改善,但绝对的术语(以及与评估者的性能相比);在模型类中的性能非常相似,尽管带有稀疏性。逐渐和预测的任务通常涉及大量知识或记忆成分,而在临界规模上表现出“突破性”行为的任务通常涉及多个步骤或组成部分或脆性指标;社交偏见通常会随着含糊不清的环境而随着规模而增加,但这可以通过提示来改善。
translated by 谷歌翻译
偏差对假设形成的影响特征在于自动化数据驱动投影追求神经网络,以提取和选择数据流的二进制分类的特征。此智能探索流程将一个完整的向量状态空间分区为不相交的子空间,以创建通过在两组标记数据流之间观察到的相似之处和差异来创建工作假设。数据流通常是时间测序,并且可以表现出复杂的时空模式。例如,给定的来自分子动力学仿真的原子轨迹,机器的任务是通过比较蛋白质突变体,一些已知的功能来量化促进功能的动态机制,而其他功能是非功能。利用模仿功能性和非功能突蛋白的动态的合成二维分子,在机器学习模型和选择的培训数据下识别并控制偏差,并在不同的环境下进行偏置。基于相关的视角,工作假设的改进融合到统计上稳健的多变量对数据的感知。在数据探索期间包括不同的视角,增强了多元性表征相似性和差异的可解释性。
translated by 谷歌翻译
Quantum神经网络(QNN)围绕有效分析量子数据产生兴奋。但是,对于许多QNN架构,这种兴奋是通过指数消失的梯度的存在,被称为贫瘠高原景观。最近,已经提出了量子卷积神经网络(QCNNS),涉及一系列卷积和汇集层,其减少Qubits的数量,同时保留有关相关数据特征的信息。在这项工作中,我们严格地分析了QCNN架构中参数的渐变缩放。我们发现梯度的方差不会比多项式更快地消失,这意味着QCNN不会表现出贫瘠的强力。这为随机初始化QCNN的培训提供了一种分析保证,该初始化QCNNS突出显示QCNNS在随机初始化下是与许多其他QNN架构的可训练。为了获得我们的结果,我们介绍了一种基于图形的基于图形的方法,以分析哈尔分布式统一的预期值,这可能在其他情况下很有用。最后,我们执行数值模拟以验证我们的分析结果。
translated by 谷歌翻译