Partially observable Markov decision processes (POMDPs) provide a flexible representation for real-world decision and control problems. However, POMDPs are notoriously difficult to solve, especially when the state and observation spaces are continuous or hybrid, which is often the case for physical systems. While recent online sampling-based POMDP algorithms that plan with observation likelihood weighting have shown practical effectiveness, a general theory characterizing the approximation error of the particle filtering techniques that these algorithms use has not previously been proposed. Our main contribution is bounding the error between any POMDP and its corresponding finite sample particle belief MDP (PB-MDP) approximation. This fundamental bridge between PB-MDPs and POMDPs allows us to adapt any sampling-based MDP algorithm to a POMDP by solving the corresponding particle belief MDP, thereby extending the convergence guarantees of the MDP algorithm to the POMDP. Practically, this is implemented by using the particle filter belief transition model as the generative model for the MDP solver. While this requires access to the observation density model from the POMDP, it only increases the transition sampling complexity of the MDP solver by a factor of $\mathcal{O}(C)$, where $C$ is the number of particles. Thus, when combined with sparse sampling MDP algorithms, this approach can yield algorithms for POMDPs that have no direct theoretical dependence on the size of the state and observation spaces. In addition to our theoretical contribution, we perform five numerical experiments on benchmark POMDPs to demonstrate that a simple MDP algorithm adapted using PB-MDP approximation, Sparse-PFT, achieves performance competitive with other leading continuous observation POMDP solvers.
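The particle filter belief transition mentioned above can be illustrated with a minimal sketch. This is not the authors' implementation: the 1D toy dynamics, the Gaussian observation model, the reward, and every function name here (`transition`, `observe`, `obs_density`, `reward`, `pf_belief_step`) are assumptions made purely for illustration. It shows one generative step over a weighted particle belief, which needs roughly $C$ state transition samples plus $C$ observation density evaluations.

```python
import math
import random


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution N(mean, std^2) at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


# Hypothetical toy POMDP: the state is a position on a line.
def transition(s, a, rng):
    """Sample a next state: move by action a with small process noise."""
    return s + a + rng.gauss(0.0, 0.1)


def observe(s_next, rng):
    """Sample a noisy observation of the next state."""
    return s_next + rng.gauss(0.0, 0.5)


def obs_density(o, s_next):
    """Observation density Z(o | s'), assumed to be available in closed form."""
    return gaussian_pdf(o, s_next, 0.5)


def reward(s, a):
    """Assumed state reward: stay close to the origin."""
    return -abs(s)


def pf_belief_step(particles, weights, a, rng):
    """One generative step on a weighted particle belief.

    Returns the next particles, their unnormalized weights, and the
    belief reward (weighted average of the state rewards).
    """
    total = sum(weights)
    r = sum(w * reward(s, a) for s, w in zip(particles, weights)) / total

    # Generate one observation from a state drawn according to the weights.
    s_src = rng.choices(particles, weights=weights, k=1)[0]
    o = observe(transition(s_src, a, rng), rng)

    # Propagate every particle and reweight it by the observation likelihood.
    next_particles = [transition(s, a, rng) for s in particles]
    next_weights = [
        w * obs_density(o, sp) for w, sp in zip(weights, next_particles)
    ]
    return next_particles, next_weights, r


rng = random.Random(0)
C = 100  # number of particles
particles = [rng.gauss(0.0, 1.0) for _ in range(C)]
weights = [1.0 / C] * C
next_particles, next_weights, r = pf_belief_step(particles, weights, 0.5, rng)
```

A sampling-based MDP solver could, under these assumptions, treat the pair `(particles, weights)` as its state and call `pf_belief_step` wherever it would normally call its generative model.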
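This paper presents an algorithmic framework for control synthesis of continuous dynamical systems subject to signal temporal logic (STL) specifications. We propose a novel algorithm to obtain a time-partitioned finite automaton from an STL specification, and introduce a multi-layered framework that uses this automaton to guide a sampling-based search tree both spatially and temporally. Our approach is able to synthesize controllers for nonlinear dynamics and polynomial predicate functions. We prove the correctness and probabilistic completeness of our algorithm, and illustrate the efficiency and efficacy of our framework on several case studies. Our results show a speedup over the state of the art.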
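In this paper, we address the problem of sampling-based motion planning under measurement uncertainty with probabilistic guarantees. We generalize traditional sampling-based, tree-based motion planning algorithms for deterministic systems and propose Belief-$\mathcal{A}$, a framework that extends any kinodynamic tree-based planner to the belief space for linear (or linearizable) systems. We introduce appropriate sampling techniques and distance metrics for the belief space that preserve the probabilistic completeness and asymptotic optimality properties of the underlying planner. We demonstrate in simulation the efficacy of our approach for finding safe, low-cost paths efficiently and asymptotically optimally for both holonomic and non-holonomic systems.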
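Partially observable Markov decision processes (POMDPs) are a powerful framework for capturing decision-making problems that involve state and transition uncertainty. However, most current POMDP planners cannot effectively handle the very high-dimensional observations they often encounter in the real world (e.g., image observations in robotic domains). In this work, we propose Visual Tree Search (VTS), a learning and planning procedure that combines generative models learned offline with online model-based POMDP planning. VTS bridges offline model training and online planning by using a set of deep generative observation models to predict and evaluate the likelihood of image observations in a Monte Carlo tree search planner. We show that VTS is robust to different observation noises and, because it uses online, model-based planning, can adapt to different reward structures without the need to re-train. This new approach outperforms a state-of-the-art baseline policy planning algorithm while using significantly less offline training time.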
Proteins play a central role in biology from immune recognition to brain activity. While major advances in machine learning have improved our ability to predict protein structure from sequence, determining protein function from structure remains a major challenge. Here, we introduce Holographic Convolutional Neural Network (H-CNN) for proteins, which is a physically motivated machine learning approach to model amino acid preferences in protein structures. H-CNN reflects physical interactions in a protein structure and recapitulates the functional information stored in evolutionary data. H-CNN accurately predicts the impact of mutations on protein function, including stability and binding of protein complexes. Our interpretable computational model for protein structure-function maps could guide design of novel proteins with desired function.
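This paper presents a hybrid online partially observable Markov decision process (POMDP) planning system that addresses the problem of autonomous navigation in the presence of multi-modal uncertainty introduced by other agents in the environment. As a particular example, we consider the problem of autonomous navigation among dense crowds of pedestrians and obstacles. Popular approaches to this problem first generate a path using a complete planner (e.g., Hybrid A*) with ad-hoc assumptions about uncertainty, then use an online tree-based POMDP solver to reason about uncertainty while controlling only a limited aspect of the problem (i.e., speed along the path). We present a more capable and responsive real-time approach that enables the POMDP planner to control more degrees of freedom (e.g., both speed and heading) to achieve more flexible and efficient solutions. This modification greatly extends the region of the state space that the POMDP planner must reason over, significantly increasing the importance of finding effective rollout policies within the limited computational budget that real-time control affords. Our key insight is to use multi-query motion planning techniques (e.g., Probabilistic Roadmaps or the Fast Marching Method) as priors for rapidly generating efficient rollout policies for every state that the POMDP planning tree might reach during its limited-horizon search. Our proposed approach generates trajectories that are safer and more efficient than the previous approach, even in densely crowded dynamic environments with long planning horizons.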
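Tristructural isotropic (TRISO)-coated particle fuel is a robust nuclear fuel, and determining its reliability is critical for the success of advanced nuclear technologies. However, TRISO failure probabilities are small and the associated computational models are expensive. We use coupled active learning, multifidelity modeling, and subset simulation to estimate the failure probabilities of TRISO fuels using several 1D and 2D models. With multifidelity modeling, we replace expensive high-fidelity (HF) model evaluations with information fusion from two low-fidelity (LF) models. For the 1D TRISO models, we consider three multifidelity modeling strategies: Kriging only, Kriging LF prediction plus Kriging correction, and deep neural network (DNN) LF prediction plus Kriging correction. While the results across these multifidelity modeling strategies compare satisfactorily, the strategies that use information fusion from two LF models call the HF model least often. Next, for the 2D TRISO model, we consider two multifidelity modeling strategies: DNN LF prediction plus Kriging correction (data-driven) and 1D TRISO LF prediction plus Kriging correction (physics-based). As expected, the physics-based strategy consistently requires the fewest calls to the HF model. However, the data-driven strategy has a lower overall simulation time, because the DNN predictions are instantaneous whereas the 1D TRISO model requires a non-negligible simulation time.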
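This work examines the hypothesis that partially observable Markov decision process (POMDP) planning with human driver states can significantly improve both the safety and efficiency of autonomous freeway driving. We evaluate this hypothesis in a simulated scenario in which an autonomous car must safely perform three lane changes in rapid succession. Approximate POMDP solutions are obtained with the partially observable Monte Carlo planning with observation widening (POMCPOW) algorithm. This approach outperforms over-confident and conservative MDP baselines and matches or outperforms QMDP. Relative to the MDP baselines, POMCPOW typically cuts the rate of unsafe situations in half or increases the success rate by 50%.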
While the brain connectivity network can inform the understanding and diagnosis of developmental dyslexia, its cause-effect relationships have not yet been sufficiently examined. Employing electroencephalography signals and a band-limited white noise stimulus at 4.8 Hz (prosodic-syllabic frequency), we measure the phase Granger causalities among channels to identify differences between dyslexic learners and controls, thereby proposing a method to calculate directional connectivity. As causal relationships run in both directions, we explore three scenarios, namely channels' activity as sources, as sinks, and in total. Our proposed method can be used for both classification and exploratory analysis. In all scenarios, we find confirmation of the established right-lateralized Theta sampling network anomaly, in line with the temporal sampling framework's assumption of oscillatory differences in the Theta and Gamma bands. Further, we show that this anomaly primarily occurs in the causal relationships of channels acting as sinks, where it is significantly more pronounced than when only total activity is observed. In the sink scenario, our classifier obtains 0.84 and 0.88 accuracy and 0.87 and 0.93 AUC for the Theta and Gamma bands, respectively.
Differentiable Architecture Search (DARTS) has attracted considerable attention as a gradient-based Neural Architecture Search (NAS) method. Since the introduction of DARTS, there has been little work on adapting the action space based on state-of-the-art architecture design principles for CNNs. In this work, we aim to address this gap by incrementally augmenting the DARTS search space with micro-design changes inspired by ConvNeXt and studying the trade-off between accuracy, evaluation layer count, and computational cost. To this end, we introduce the Pseudo-Inverted Bottleneck conv block, which is intended to reduce the computational footprint of the inverted bottleneck block proposed in ConvNeXt. Our proposed architecture is much less sensitive to evaluation layer count and significantly outperforms a DARTS network of similar size at layer counts as small as 2. Furthermore, with fewer layers, it not only achieves higher accuracy with lower GMACs and parameter count, but GradCAM comparisons also show that our network is better able to detect distinctive features of target objects than DARTS.