As industry develops, drones are appearing in more and more fields. Deep reinforcement learning has recently achieved impressive results in games, and we aim to bring deep reinforcement learning algorithms from game scenarios to real-world robotics applications. Inspired by LunarLander in OpenAI Gym, we decided to make a bold attempt at using reinforcement learning to control a drone. At present, work on applying reinforcement learning to robot control is still scarce: the physics simulation platforms used for robot control are suited to validating classical algorithms and are not convenient to connect to reinforcement learning training. In this paper we address this problem by bridging the gap between physics simulation platforms and intelligent agents, connecting agents to a physics simulator so that they can learn and complete drone flight tasks in an environment that approximates the real world. We propose a reinforcement learning framework (ROS-RL) built on the Gazebo physics simulation platform and apply three continuous-action-space reinforcement learning algorithms within this framework to the problem of autonomous drone landing. Experiments demonstrate the effectiveness of the algorithms: the reinforcement-learning-based autonomous drone landing task was completed with full success.
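The abstract above describes wrapping a drone-landing task in a reinforcement-learning training loop. As a rough illustration of the task structure, here is a minimal gym-style environment sketch. This is NOT the authors' ROS-RL/Gazebo framework: the dynamics are a toy 1-D vertical descent, and all names and constants are illustrative assumptions.

```python
# Toy gym-style landing environment: 1-D descent under gravity with a
# continuous thrust action. A sketch of the task interface only.
import random

class ToyLandingEnv:
    """Drone starts at a random altitude; a continuous action in [-1, 1]
    scales thrust; the episode ends on touchdown."""
    def __init__(self, dt=0.1, g=9.8):
        self.dt, self.g = dt, g
        self.reset()

    def reset(self):
        self.z = random.uniform(5.0, 10.0)   # altitude (m)
        self.vz = 0.0                        # vertical velocity (m/s)
        return (self.z, self.vz)

    def step(self, action):
        thrust = max(-1.0, min(1.0, action)) * 15.0   # clip to actuator limits
        self.vz += (thrust - self.g) * self.dt
        self.z = max(0.0, self.z + self.vz * self.dt)
        done = self.z == 0.0
        # reward soft touchdowns, penalize hard impacts and wasted time
        reward = -abs(self.vz) if done else -0.01
        return (self.z, self.vz), reward, done

env = ToyLandingEnv()
state = env.reset()
done = False
while not done:
    state, reward, done = env.step(0.6)   # constant-thrust policy for illustration
print(round(state[0], 2))  # altitude at touchdown is 0.0
```

A continuous-action algorithm such as DDPG, TD3, or SAC would replace the constant-thrust policy in the demo loop.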
Information retrieval from the brain's responses to auditory and visual stimuli has shown success through classification of the song titles and image classes presented to participants while their EEG signals were recorded. Information retrieval in the form of reconstructing the auditory stimuli has also seen some success, but here we improve on previous methods by reconstructing music stimuli well enough for them to be recognized and identified independently. Furthermore, deep learning models were trained on time-aligned music stimulus spectra for each corresponding one-second window of EEG recording, which greatly reduces the feature-extraction steps required compared with prior studies. The NMED-Tempo and NMED-Hindi datasets of participants passively listening to full-length songs were used to train and validate convolutional neural network (CNN) regressors. The efficacy of raw voltage versus power-spectrum inputs, and of linear versus mel spectrogram outputs, was tested, with all inputs and outputs converted into 2D images. The quality of the reconstructed spectrograms was assessed by training classifiers, which reached 81% accuracy for mel spectrograms and 72% for linear spectrograms (10% chance accuracy). Finally, in a two-alternative match-to-sample task, listeners discriminated the reconstructed auditory music stimuli with an 85% success rate (50% chance).
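The central pipeline detail above is pairing each one-second EEG window with the time-aligned spectrogram slice of the music the participant was hearing. The sketch below shows that pairing step only; the sampling rates, channel count, and spectrogram resolution are made-up assumptions, not the datasets' actual parameters.

```python
# Slice continuous EEG into one-second windows and align each with the
# corresponding spectrogram frame range, yielding (input, target) pairs
# for a CNN regressor. Shapes are illustrative.
import numpy as np

fs_eeg = 125                 # assumed EEG sampling rate (Hz)
n_channels = 32              # assumed electrode count
spec_frames_per_sec = 43     # assumed spectrogram frames per second

eeg = np.random.randn(n_channels, fs_eeg * 60)          # one minute of EEG
spec = np.random.rand(128, spec_frames_per_sec * 60)    # aligned mel spectrogram

def make_pairs(eeg, spec, seconds=60):
    """Yield (1-s EEG window, time-aligned spectrogram slice) training pairs."""
    pairs = []
    for t in range(seconds):
        x = eeg[:, t * fs_eeg:(t + 1) * fs_eeg]
        y = spec[:, t * spec_frames_per_sec:(t + 1) * spec_frames_per_sec]
        pairs.append((x, y))
    return pairs

pairs = make_pairs(eeg, spec)
print(len(pairs), pairs[0][0].shape, pairs[0][1].shape)
# 60 pairs of (32, 125) EEG windows and (128, 43) spectrogram targets
```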
Previous work generally holds that improving the spatial invariance of convolutional networks is the key to object counting. However, after verifying several mainstream counting networks, we surprisingly found that overly strict pixel-level spatial invariance causes excessive noise in density-map generation. In this paper, we try replacing the original convolution filters with locally connected Gaussian kernels to estimate spatial positions in the density map. The aim is to allow the feature-extraction process to potentially stimulate the density-map generation process so as to overcome annotation noise. Inspired by prior work, we propose a low-rank approximation accompanied by translation invariance to efficiently approximate the massive number of Gaussian convolutions. Our work points subsequent research in a new direction: studying how to properly relax the overly strict pixel-level spatial invariance for object counting. We evaluate our method on four mainstream object-counting networks (i.e., MCNN, CSRNet, SANet, and ResNet-50). Extensive experiments were conducted on seven popular benchmarks spanning three applications (i.e., crowd, vehicle, and plant counting). Experimental results show that our method significantly outperforms other state-of-the-art methods and achieves promising learning of objects' spatial positions.
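For context on the density maps discussed above: in counting tasks they are conventionally built by placing a normalized Gaussian kernel at each annotated object location, so the map integrates to the object count. This NumPy sketch shows that standard construction; it is background for the paper's idea, not the proposed locally connected or low-rank Gaussian layers themselves.

```python
# Build a ground-truth density map by stamping a normalized Gaussian at
# each annotated (y, x) point; the map's sum equals the object count.
import numpy as np

def gaussian_kernel(size=15, sigma=4.0):
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
    return k / k.sum()          # normalize so each object contributes count 1

def density_map(points, shape, size=15, sigma=4.0):
    dm = np.zeros(shape)
    k, r = gaussian_kernel(size, sigma), size // 2
    for (y, x) in points:
        y0, y1 = max(0, y - r), min(shape[0], y + r + 1)
        x0, x1 = max(0, x - r), min(shape[1], x + r + 1)
        ky0, kx0 = y0 - (y - r), x0 - (x - r)   # crop kernel at image borders
        dm[y0:y1, x0:x1] += k[ky0:ky0 + (y1 - y0), kx0:kx0 + (x1 - x0)]
    return dm

dm = density_map([(30, 40), (60, 80)], (100, 120))
print(round(dm.sum(), 4))   # 2.0: the map integrates to the object count
```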
Federated learning (FL) is a machine learning setting where many clients (e.g. mobile devices or whole organizations) collaboratively train a model under the orchestration of a central server (e.g. service provider), while keeping the training data decentralized. FL embodies the principles of focused data collection and minimization, and can mitigate many of the systemic privacy risks and costs resulting from traditional, centralized machine learning and data science approaches. Motivated by the explosive growth in FL research, this paper discusses recent advances and presents an extensive collection of open problems and challenges.
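The orchestration pattern described above can be sketched in a few lines: clients compute updates on data that never leaves them, and the server aggregates the results weighted by client dataset size (the federated averaging idea). This pure-NumPy toy is a sketch of the principle only; production FL adds client sampling, secure aggregation, compression, and more.

```python
# Minimal federated averaging on a least-squares problem: local SGD on
# each client's private data, then size-weighted server aggregation.
import numpy as np

def local_sgd(w, X, y, lr=0.1, epochs=5):
    """A few epochs of local least-squares gradient descent on one client."""
    w = w.copy()
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

def fedavg_round(w_global, clients):
    """One round: each client trains locally; server averages by data size."""
    n_total = sum(len(y) for _, y in clients)
    w_new = np.zeros_like(w_global)
    for X, y in clients:
        w_new += (len(y) / n_total) * local_sgd(w_global, X, y)
    return w_new

rng = np.random.default_rng(0)
w_true = np.array([2.0, -1.0])
clients = []
for _ in range(3):                      # three clients; raw data stays local
    X = rng.normal(size=(50, 2))
    clients.append((X, X @ w_true))

w = np.zeros(2)
for _ in range(20):
    w = fedavg_round(w, clients)
print(np.round(w, 2))                   # converges toward [ 2. -1.]
```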
While the capabilities of autonomous systems have been steadily improving in recent years, these systems still struggle to rapidly explore previously unknown environments without the aid of GPS-assisted navigation. The DARPA Subterranean (SubT) Challenge aimed to fast-track the development of autonomous exploration systems by evaluating their performance in real-world underground search-and-rescue scenarios. Subterranean environments present a plethora of challenges for robotic systems, such as limited communications, complex topology, visually degraded sensing, and harsh terrain. The presented solution enables long-term autonomy with minimal human supervision by combining a powerful, independent single-agent autonomy stack with higher-level mission management operating over a flexible mesh network. The autonomy suite deployed on quadruped and wheeled robots was fully independent, freeing the human supervisor to loosely oversee the mission and make high-impact strategic decisions. We also discuss lessons learned from fielding our system at the SubT Final Event, relating to vehicle versatility, system adaptability, and reconfigurable communications.
Attention mechanisms form a core component of several successful deep learning architectures, and are based on one key idea: ''The output depends only on a small (but unknown) segment of the input.'' In several practical applications like image captioning and language translation, this is mostly true. In trained models with an attention mechanism, the outputs of an intermediate module that encodes the segment of input responsible for the output are often used as a way to peek into the `reasoning` of the network. We make such a notion more precise for a variant of the classification problem that we term selective dependence classification (SDC) when used with attention model architectures. Under such a setting, we demonstrate various error modes where an attention model can be accurate but fail to be interpretable, and show that such models do occur as a result of training. We illustrate various situations that can accentuate and mitigate this behaviour. Finally, we use our objective definition of interpretability for SDC tasks to evaluate a few attention model learning algorithms designed to encourage sparsity, and demonstrate that these algorithms help improve interpretability.
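The computation behind the mechanisms discussed above can be made concrete with scaled dot-product attention. The interpretability question the abstract raises is whether the weights `a` below genuinely identify the input segment the output depends on. The example data is illustrative, not from the paper.

```python
# Scaled dot-product attention: the query scores each key, softmax turns
# scores into weights, and the output is the weighted sum of values.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attention(q, K, V):
    """q: (d,), K: (n, d), V: (n, d_v) -> (weighted summary of V, weights)."""
    scores = K @ q / np.sqrt(K.shape[1])   # similarity of query to each key
    a = softmax(scores)                    # attention weights over n segments
    return a @ V, a

K = np.eye(4)                  # four orthogonal keys (input segments)
V = np.arange(8.0).reshape(4, 2)
q = 10.0 * K[2]                # query strongly matching key 2
out, a = attention(q, K, V)
print(a.argmax())              # weight mass concentrates on segment 2
```

Reading `a` as an explanation is exactly the practice whose reliability the paper's SDC analysis probes.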
Recent advances in deep learning have enabled us to address the curse of dimensionality (COD) by solving problems in higher dimensions. A subset of such approaches to addressing the COD has led us to solving high-dimensional PDEs. This has resulted in opening doors to solving a variety of real-world problems ranging from mathematical finance to stochastic control for industrial applications. Although feasible, these deep learning methods are still constrained by training time and memory. Tackling these shortcomings, Tensor Neural Networks (TNN) demonstrate that they can provide significant parameter savings while attaining the same accuracy as the classical Dense Neural Network (DNN). In addition, we also show how TNN can be trained faster than DNN for the same accuracy. Besides TNN, we also introduce Tensor Network Initializer (TNN Init), a weight initialization scheme that leads to faster convergence with smaller variance for an equivalent parameter count as compared to a DNN. We benchmark TNN and TNN Init by applying them to solve the parabolic PDE associated with the Heston model, which is widely used in financial pricing theory.
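The parameter-saving idea behind tensor-network layers can be seen in its simplest form by replacing a dense weight matrix with a low-rank (matrix-product-like) factorization. The toy comparison below illustrates that principle only; it is an assumption-laden simplification, not the paper's TNN architecture, and the sizes and rank are made up.

```python
# Compare the parameter count of a dense layer with a rank-r factorized
# layer W ~= A @ B that produces the same-shaped output.
import numpy as np

m, n, r = 256, 256, 8
dense_params = m * n                   # classical dense layer: 65,536 weights
tnn_params = m * r + r * n             # factorized layer: 4,096 weights

A = np.random.randn(m, r) / np.sqrt(r)
B = np.random.randn(r, n)
x = np.random.randn(n)
y = A @ (B @ x)                        # apply the factorized layer

print(dense_params, tnn_params, y.shape)
```

The memory and training-time savings the abstract reports come from stacking such structured factorizations throughout the network.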
Artificial neural networks can learn complex, salient data features to achieve a given task. On the opposite end of the spectrum, mathematically grounded methods such as topological data analysis allow users to design analysis pipelines fully aware of data constraints and symmetries. We introduce a class of persistence-based neural network layers. Persistence-based layers allow the users to easily inject knowledge about symmetries (equivariance) respected by the data, are equipped with learnable weights, and can be composed with state-of-the-art neural architectures.
KL-regularized reinforcement learning from expert demonstrations has proved successful in improving the sample efficiency of deep reinforcement learning algorithms, allowing them to be applied to challenging physical real-world tasks. However, we show that KL-regularized reinforcement learning with behavioral reference policies derived from expert demonstrations can suffer from pathological training dynamics that can lead to slow, unstable, and suboptimal online learning. We show empirically that the pathology occurs for commonly chosen behavioral policy classes and demonstrate its impact on sample efficiency and online policy performance. Finally, we show that the pathology can be remedied by non-parametric behavioral reference policies and that this allows KL-regularized reinforcement learning to significantly outperform state-of-the-art approaches on a variety of challenging locomotion and dexterous hand manipulation tasks.
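The objective discussed above augments the expected reward with a penalty for deviating from a behavioral reference policy derived from demonstrations, roughly J = E[reward] - alpha * KL(pi || pi_b). This toy discrete-action illustration shows the objective and hints at the pathology: the numbers and alpha are made up, and real implementations operate on continuous policies over trajectories.

```python
# KL-regularized objective for a discrete 3-action policy, plus a hint at
# the pathology: a reference policy that puts tiny mass on useful actions
# makes the KL penalty grow sharply and distort learning.
import numpy as np

def kl(p, q):
    """KL divergence between discrete distributions p and q."""
    return float(np.sum(p * np.log(p / q)))

pi   = np.array([0.7, 0.2, 0.1])    # current policy over 3 actions
pi_b = np.array([0.6, 0.3, 0.1])    # behavioral reference from demonstrations
rewards = np.array([1.0, 0.5, 0.0]) # expected reward per action
alpha = 0.5

objective = float(pi @ rewards) - alpha * kl(pi, pi_b)
print(round(objective, 3))

# When pi_b assigns near-zero probability to actions pi needs, the penalty
# grows sharply, pulling the objective down even for a good policy:
pi_b_bad = np.array([0.98, 0.01, 0.01])
print(round(kl(pi, pi_b_bad), 3))
```

The paper's remedy, non-parametric reference policies, targets exactly this sensitivity of the KL term to a poorly calibrated pi_b.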
Three main points: 1. Data Science (DS) will be increasingly important to heliophysics; 2. Methods of heliophysics science discovery will continually evolve, requiring the use of learning technologies [e.g., machine learning (ML)] that are applied rigorously and that are capable of supporting discovery; and 3. To grow with the pace of data, technology, and workforce changes, heliophysics requires a new approach to the representation of knowledge.