The cooperation of a human pilot with an autonomous agent during flight control realizes parallel autonomy. A parallel-autonomous system acts as a guardian that significantly enhances the robustness and safety of flight operations in challenging circumstances. Here, we propose an air-guardian concept that facilitates cooperation between an artificial pilot agent and a parallel end-to-end neural control system. Our vision-based air-guardian system combines a causal continuous-depth neural network model with a cooperation layer to enable parallel autonomy between a pilot agent and a control system based on perceived differences in their attention profiles. The attention profiles are obtained by computing the networks' saliency maps (feature importance) through the VisualBackProp algorithm. The guardian agent is trained via reinforcement learning in a simulated fixed-wing aircraft environment. When the attention profiles of the pilot and guardian agents align, the pilot makes control decisions. When the attention maps of the pilot and the guardian do not align, the air-guardian intervenes and takes over control of the aircraft. We show that our attention-based air-guardian system can balance the trade-off between its level of involvement in the flight and the pilot's expertise and attention. We demonstrate the effectiveness of our method in simulated flight scenarios with a fixed-wing aircraft and on a real drone platform.
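The handoff rule described above can be illustrated with a minimal sketch. The abstract does not say how attention profiles are compared or when control switches, so the cosine similarity, the `threshold` value of 0.8, and the names `cosine_similarity` and `select_control` are all assumptions made for illustration, not the paper's cooperation layer.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two flattened saliency maps (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an all-zero map gives nothing to compare against
    return dot / (norm_a * norm_b)


def select_control(pilot_cmd, guardian_cmd, pilot_saliency, guardian_saliency,
                   threshold=0.8):
    """Return (command, source): the pilot flies while attention maps align.

    `threshold` is a hypothetical tuning parameter, not a value from the paper.
    """
    similarity = cosine_similarity(pilot_saliency, guardian_saliency)
    if similarity >= threshold:
        return pilot_cmd, "pilot"
    return guardian_cmd, "guardian"
```

A hard switch is the simplest possible choice here; a blended command weighted by the similarity score would be an equally plausible reading of "cooperation layer".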
Translated by Google Translate
We study the problem of training and certifying adversarially robust quantized neural networks (QNNs). Quantization is a technique for making neural networks more efficient by running them using low-bit integer arithmetic and is therefore commonly adopted in industry. Recent work has shown that floating-point neural networks that have been verified to be robust can become vulnerable to adversarial attacks after quantization, and that certification of the quantized representation is necessary to guarantee robustness. In this work, we present quantization-aware interval bound propagation (QA-IBP), a novel method for training robust QNNs. Inspired by advances in robust learning of non-quantized networks, our training algorithm computes the gradient of an abstract representation of the actual network. Unlike existing approaches, our method can handle the discrete semantics of QNNs. Based on QA-IBP, we also develop a complete verification procedure for verifying the adversarial robustness of QNNs, which is guaranteed to terminate and produce a correct answer. Compared to existing approaches, the key advantage of our verification procedure is that it runs entirely on GPU or other accelerator devices. We demonstrate experimentally that our approach significantly outperforms existing methods and establishes a new state of the art for training and certifying the robustness of QNNs.
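To make the idea of propagating interval bounds through a quantized layer concrete, here is a minimal sketch. It is plain interval bound propagation followed by a simulated quantization step, not the QA-IBP algorithm itself, whose details the abstract does not give. The 4-bit unsigned range, the `scale` of 0.25, and all function names are assumptions. The sketch relies on one fact: rounding and clamping are monotone non-decreasing, so applying them to the lower and upper pre-activation bounds yields sound output bounds.

```python
def affine_bounds(lo, hi, weights, bias):
    """Propagate the box [lo, hi] through y = W x + b using interval arithmetic."""
    out_lo, out_hi = [], []
    for row, b in zip(weights, bias):
        lower = upper = b
        for w, x_lo, x_hi in zip(row, lo, hi):
            if w >= 0:
                lower += w * x_lo
                upper += w * x_hi
            else:  # a negative weight swaps which end of the interval matters
                lower += w * x_hi
                upper += w * x_lo
        out_lo.append(lower)
        out_hi.append(upper)
    return out_lo, out_hi


def quantize(x, scale=0.25, qmin=0, qmax=15):
    """Simulated quantization: round to the nearest level, then clamp.

    qmin=0 also acts as a fused ReLU. All three parameters are assumed values.
    """
    level = round(x / scale)
    return min(max(level, qmin), qmax) * scale


def quantized_layer_bounds(lo, hi, weights, bias):
    """Sound output bounds of a quantized layer for every input in [lo, hi]."""
    pre_lo, pre_hi = affine_bounds(lo, hi, weights, bias)
    return [quantize(v) for v in pre_lo], [quantize(v) for v in pre_hi]


def quantized_layer(x, weights, bias):
    """Concrete forward pass, for comparison against the bounds."""
    pre, _ = affine_bounds(x, x, weights, bias)
    return [quantize(v) for v in pre]
```

For example, with `weights = [[1.0, -1.0]]`, `bias = [0.1]`, and an input box of radius 0.1 around `[1.0, 0.5]`, every concrete output of `quantized_layer` for a point in the box lies between the two vectors returned by `quantized_layer_bounds`.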
We study the problem of learning controllers for discrete-time non-linear stochastic dynamical systems with formal reach-avoid guarantees. This work presents the first method for providing formal reach-avoid guarantees, which combine and generalize stability and safety guarantees, with a tolerable probability threshold $p\in[0,1]$ over the infinite time horizon. Our method leverages advances in machine learning literature and it represents formal certificates as neural networks. In particular, we learn a certificate in the form of a reach-avoid supermartingale (RASM), a novel notion that we introduce in this work. Our RASMs provide reachability and avoidance guarantees by imposing constraints on what can be viewed as a stochastic extension of level sets of Lyapunov functions for deterministic systems. Our approach solves several important problems -- it can be used to learn a control policy from scratch, to verify a reach-avoid specification for a fixed control policy, or to fine-tune a pre-trained policy if it does not satisfy the reach-avoid specification. We validate our approach on $3$ stochastic non-linear reinforcement learning tasks.
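A proper parametrization of the state transition matrices of linear state-space models (SSMs), followed by standard nonlinearities, enables them to efficiently learn representations from sequential data. In this paper, we show that we can improve further when a structured SSM such as S4 is given by a linear liquid time-constant (LTC) state-space model. LTC neural networks are causal continuous-time neural networks with an input-dependent state transition module, which lets them learn to adapt to incoming inputs at inference time. We show that by using the diagonal plus low-rank decomposition of the state transition matrix introduced in S4, together with a few simplifications, the LTC-based structured state-space model, called Liquid-S4, achieves new state-of-the-art generalization across sequence modeling tasks with long-term dependencies (such as image, text, audio, and medical time series), with an average performance of 87.32% on the Long-Range Arena benchmark. On the full raw Speech Commands recognition dataset, Liquid-S4 reaches 96.78% accuracy with a 30% reduction in parameter count compared to S4. The additional gain in performance is a direct result of the kernel structure of Liquid-S4, which takes into account the similarities between input sequence samples during training and inference.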
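We consider the problem of formally verifying almost-sure (a.s.) asymptotic stability in discrete-time nonlinear stochastic control systems. While verifying stability in deterministic control systems is extensively studied in the literature, verifying stability in stochastic control systems is an open problem. The few existing works on this topic either consider only specialized forms of stochasticity or make restrictive assumptions on the system, which makes them inapplicable to learning algorithms with neural network policies. In this work, we present an approach for general nonlinear stochastic control problems with two novel aspects: (a) instead of classical stochastic extensions of Lyapunov functions, we use ranking supermartingales (RSMs) to certify a.s. asymptotic stability, and (b) we present a method for learning neural network RSMs. We prove that our approach guarantees a.s. asymptotic stability of the system and provides the first method to obtain bounds on the stabilization time, which stochastic Lyapunov functions do not. Finally, we validate our approach experimentally on a set of nonlinear stochastic reinforcement learning environments with neural network policies.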
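The intuition behind a ranking supermartingale is that a candidate function must decrease in expectation by some margin at every state outside the target region. The sketch below estimates that expected decrease by sampling on a toy one-dimensional system. Everything in it is an assumption chosen for illustration: the dynamics `0.5 * x + noise`, the uniform noise, and the candidate `abs(x)` do not come from the paper. A sampling estimate like this is only a sanity check, not the formal verification the abstract describes.

```python
import random


def step(x, noise):
    """Toy closed-loop dynamics (assumed): a contracting map plus bounded noise."""
    return 0.5 * x + noise


def candidate(x):
    """Hypothetical certificate candidate V(x) = |x|."""
    return abs(x)


def expected_decrease(x, rng, samples=2000, noise_bound=0.1):
    """Monte Carlo estimate of V(x) - E[V(next state)] at state x."""
    total = 0.0
    for _ in range(samples):
        noise = rng.uniform(-noise_bound, noise_bound)
        total += candidate(step(x, noise))
    return candidate(x) - total / samples
```

On this toy system, for any state with `abs(x) >= 0.3` the true expected decrease is `0.5 * abs(x)`, so the estimate should stay clearly positive outside a small neighbourhood of the origin.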
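Bayesian neural networks (BNNs) place distributions over the weights of a neural network to model uncertainty in the data and in the network's predictions. We consider the problem of verifying safety when running a Bayesian neural network policy in a feedback loop with infinite time horizon systems. In contrast to existing sampling-based approaches, which are inapplicable to the infinite time horizon setting, we train a separate deterministic neural network that serves as an infinite time horizon safety certificate. In particular, we show that the certificate network guarantees the safety of the system over a subset of the BNN weight posterior's support. Our method first computes a safe weight set and then alters the BNN's weight posterior to reject samples outside this set. Moreover, we show how to extend our approach to a safe-exploration reinforcement learning setting, in order to avoid unsafe trajectories during the training of the policy. We evaluate our approach on a series of reinforcement learning benchmarks, including non-Lyapunovian safety specifications.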
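The final step, rejecting posterior samples that fall outside a safe weight set, can be sketched with simple rejection sampling. The abstract does not give the shape of the posterior or of the safe set, so the independent Gaussian posterior, the box-shaped safe set, and the name `sample_safe_weights` are assumptions made for illustration.

```python
import random


def sample_safe_weights(means, stds, safe_lo, safe_hi, rng, max_tries=10_000):
    """Draw one weight vector from a Gaussian posterior restricted to a box.

    means, stds: per-weight posterior parameters (assumed independent Gaussians).
    safe_lo, safe_hi: per-weight bounds of the safe set (assumed to be a box).
    """
    for _ in range(max_tries):
        weights = [rng.gauss(m, s) for m, s in zip(means, stds)]
        inside = all(lo <= w <= hi
                     for w, lo, hi in zip(weights, safe_lo, safe_hi))
        if inside:
            return weights  # accepted: the sample lies in the safe set
    raise RuntimeError("no sample fell inside the safe set within max_tries")
```

Rejection sampling keeps the relative likelihood of the accepted weights unchanged, but it becomes slow when the safe set covers only a small fraction of the posterior mass.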
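We introduce a new stochastic verification algorithm that formally quantifies the behavioral robustness of any time-continuous process formulated as a continuous-depth model. Our algorithm solves a set of global optimization (Go) problems over a given time horizon to construct a tight enclosure (Tube) of the set of all process executions starting from a ball of initial states. We call our algorithm GoTube. By construction, GoTube ensures that the bounding tube is conservative up to a desired probability and up to a desired tightness. GoTube is implemented in JAX and optimized to scale to complex continuous-depth neural network models. Compared to advanced reachability analysis tools for time-continuous neural networks, GoTube does not accumulate over-approximation errors between time steps and avoids the infamous wrapping effect inherent in symbolic techniques. We show that GoTube substantially outperforms state-of-the-art verification tools in terms of the size of the initial ball, speed, time horizon, task completion, and scalability on a large set of experiments. GoTube is stable and sets the state of the art in terms of its ability to scale to time horizons well beyond what was previously possible.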
We present a novel hybrid learning method, HyLEAR, for solving the collision-free navigation problem for self-driving cars in POMDPs. HyLEAR leverages interposed learning to embed knowledge of a hybrid planner into a deep reinforcement learner so that it determines safe and comfortable driving policies faster. In particular, the hybrid planner combines pedestrian path prediction and risk-aware path planning with driving-behavior rule-based reasoning, such that the driving policies also take into account, whenever possible, the ride comfort and a given set of driving-behavior rules. Our experimental performance analysis over the CARLA-CTS1 benchmark of critical traffic scenarios revealed that HyLEAR can significantly outperform the selected baselines in terms of safety and ride comfort.
Purpose: Tracking the 3D motion of the surgical tool and the patient anatomy is a fundamental requirement for computer-assisted skull-base surgery. The estimated motion can be used both for intra-operative guidance and for downstream skill analysis. Recovering such motion solely from surgical videos is desirable, as it is compliant with current clinical workflows and instrumentation. Methods: We present Tracker of Anatomy and Tool (TAToo). TAToo jointly tracks the rigid 3D motion of the patient skull and the surgical drill from stereo microscopic videos. TAToo estimates motion via an iterative optimization process in an end-to-end differentiable form. For robust tracking performance, TAToo adopts a probabilistic formulation and enforces geometric constraints on the object level. Results: We validate TAToo both on simulation data, where ground truth motion is available, and on anthropomorphic phantom data, where optical tracking provides a strong baseline. We report sub-millimeter and millimeter inter-frame tracking accuracy for skull and drill, respectively, with rotation errors below 1°. We further illustrate how TAToo may be used in a surgical navigation setting. Conclusion: We present TAToo, which simultaneously tracks the surgical tool and the patient anatomy in skull-base surgery. TAToo directly predicts the motion from surgical videos, without the need for any markers. Our results show that the performance of TAToo compares favorably to competing approaches. Future work will include fine-tuning of our depth network to reach the 1 mm clinical accuracy goal desired for surgical applications in the skull base.
Machine learning (ML) models can leak information about users, and differential privacy (DP) provides a rigorous way to bound that leakage under a given budget. This DP budget can be regarded as a new type of compute resource in workloads where multiple ML models train on user data. Once it is used, the DP budget is consumed forever. Therefore, it is crucial to allocate it as efficiently as possible so that as many models as possible can be trained. This paper presents a scheduler for privacy that optimizes for efficiency. We formulate privacy scheduling as a new type of multidimensional knapsack problem, called privacy knapsack, which maximizes DP budget efficiency. We show that privacy knapsack is NP-hard, hence practical algorithms are necessarily approximate. We develop an approximation algorithm for privacy knapsack, DPK, and evaluate it on microbenchmarks and on a new, synthetic private-ML workload we developed from the Alibaba ML cluster trace. We show that DPK: (1) often approaches the efficiency-optimal schedule, (2) consistently schedules more tasks than a state-of-the-art privacy scheduling algorithm that focuses on fairness (1.3-1.7x in Alibaba, 1.0-2.6x in microbenchmarks), but (3) sacrifices some level of fairness for efficiency. Therefore, using DPK, DP ML operators should be able to train more models on the same amount of user data while offering the same privacy guarantee to their users.
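To illustrate what scheduling against a non-replenishable, multidimensional privacy budget looks like, here is a minimal greedy sketch. It is a generic knapsack heuristic, not DPK, whose details the abstract does not give. The task format, the per-block epsilon capacities, the additive (basic composition) accounting, and the name `greedy_privacy_schedule` are all assumptions.

```python
def greedy_privacy_schedule(tasks, capacity):
    """Greedily pick tasks by weight per unit of normalized budget demand.

    tasks: list of (name, weight, demand), where demand maps a data-block id
           to the epsilon the task would consume from that block.
    capacity: dict mapping each data-block id to its total epsilon budget.
    Returns (scheduled task names, remaining budget per block).
    """
    remaining = dict(capacity)

    def efficiency(task):
        _, weight, demand = task
        # Normalize each demand by the block's capacity so blocks are comparable.
        cost = sum(eps / capacity[block] for block, eps in demand.items())
        return weight / cost if cost > 0 else float("inf")

    scheduled = []
    for name, _, demand in sorted(tasks, key=efficiency, reverse=True):
        fits = all(remaining[block] >= eps for block, eps in demand.items())
        if fits:
            for block, eps in demand.items():
                remaining[block] -= eps  # consumed budget is never returned
            scheduled.append(name)
    return scheduled, remaining
```

With two blocks of budget 1.0 each and three unit-weight tasks demanding `{"b0": 0.5}`, `{"b0": 0.5, "b1": 0.5}`, and `{"b0": 0.75}`, the heuristic schedules the first two and skips the third, because block `b0` no longer has enough budget left for it.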