Intelligent Paper Notes

An Application of Online Learning to Spacecraft Memory Dump Optimization

Tommaso Cesari , Jonathan Pergoli , Michele Maestrini , Pierluigi Di Lizia

Category: Machine Learning

2022-02-14

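In this paper, we present a real-world application of online learning with expert advice to the field of space operations, tested on real-life data from the Copernicus Sentinel-6 satellite. We show that, in spacecraft memory dump optimization, a lightweight Follow-The-Leader algorithm increases performance by over 60% compared with traditional techniques.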
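
The Follow-The-Leader rule named in this abstract can be sketched in a few lines. This is a generic illustration over a finite set of experts, not the paper's memory dump formulation; the function name `follow_the_leader` and the `losses[t][i]` layout are assumptions made for the example.

```python
from typing import List, Sequence


def follow_the_leader(losses: Sequence[Sequence[float]]) -> List[int]:
    """Return the expert chosen at each round by Follow-The-Leader.

    losses[t][i] is the loss of expert i at round t (hypothetical layout).
    At each round the learner picks the expert with the smallest cumulative
    loss over all previous rounds; ties go to the lowest index.
    """
    if not losses:
        return []
    n_experts = len(losses[0])
    cumulative = [0.0] * n_experts
    choices = []
    for round_losses in losses:
        # Pick the current leader before this round's losses are revealed.
        leader = min(range(n_experts), key=lambda i: cumulative[i])
        choices.append(leader)
        # Reveal the losses and update every expert's running total.
        for i, loss in enumerate(round_losses):
            cumulative[i] += loss
    return choices


if __name__ == "__main__":
    toy_losses = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
    print(follow_the_leader(toy_losses))  # [0, 1, 1]
```

The rule is lightweight because it only keeps one running total per expert and needs no tuning parameters.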

Reinforcement Learning: A Survey

L. P. Kaelbling , M. L. Littman , A. W. Moore

Category:

1996-05-01

This paper surveys the field of reinforcement learning from a computer-science perspective. It is written to be accessible to researchers familiar with machine learning. Both the historical basis of the field and a broad selection of current work are summarized. Reinforcement learning is the problem faced by an agent that learns behavior through trial-and-error interactions with a dynamic environment. The work described here has a resemblance to work in psychology, but differs considerably in the details and in the use of the word "reinforcement." The paper discusses central issues of reinforcement learning, including trading off exploration and exploitation, establishing the foundations of the field via Markov decision theory, learning from delayed reinforcement, constructing empirical models to accelerate learning, making use of generalization and hierarchy, and coping with hidden state. It concludes with a survey of some implemented systems and an assessment of the practical utility of current methods for reinforcement learning.

Reinforcement Learning for Feedback-Enabled Cyber Resilience

Yunhan Huang , Linan Huang , Quanyan Zhu

Category: Machine Learning

2021-07-02

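Digitization and remote connectivity have enlarged the attack surface and made cyber systems more vulnerable. As attackers become increasingly sophisticated and resourceful, relying only on traditional cyber protection, such as intrusion detection, firewalls, and encryption, is insufficient to secure cyber systems. Cyber resilience provides a new security paradigm that complements inadequate protection with resilience mechanisms. A Cyber-Resilient Mechanism (CRM) adapts to known or zero-day threats and uncertainties in real time and responds to them strategically, so that the critical functions of the cyber system are maintained when an attack succeeds. Feedback architectures play a pivotal role in enabling the online sensing, reasoning, and actuation of the CRM. Reinforcement learning (RL) is an important tool that embodies the feedback architecture for cyber resilience. It allows the CRM to provide sequential responses to attacks with limited or no prior knowledge of the attacker. In this work, we review the literature on RL for cyber resilience and discuss cyber-resilient defenses against three major types of vulnerabilities: posture-related, information-related, and human-related vulnerabilities. We introduce three application domains of CRMs: moving target defense, defensive cyber deception, and assistive human security technologies. RL algorithms also have vulnerabilities of their own. We explain three vulnerabilities of RL and present attack models in which the attacker targets the information exchanged between the environment and the agent: the rewards, the state observations, and the action commands. We show that an attacker can trick the RL agent into learning a nefarious policy with minimal attack effort. Finally, we discuss future challenges of RL for cyber security and resilience, and emerging applications of RL-based CRMs.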

Behavior Trees in Robotics and AI: An Introduction

Michele Colledanchise , Petter Ögren

Category: Robotics | Artificial Intelligence

2017-08-31

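A Behavior Tree (BT) is a way to structure the switching between different tasks in an autonomous agent, such as a robot or a virtual entity in a computer game. BTs are a very efficient way of creating complex systems that are both modular and reactive. These properties are crucial in many applications, which has led to the spread of BTs from computer game programming to many branches of AI and robotics. In this book, we first give an introduction to BTs, then describe how BTs relate to, and in many cases generalize, earlier switching structures. These ideas are then used as the foundation for a set of efficient and easy-to-use design principles. Properties such as safety, robustness, and efficiency are important for an autonomous system, and we describe a set of tools for formally analyzing them using a state-space description of BTs. With the new analysis tools, we can formalize the descriptions of how BTs generalize earlier approaches. We also show the use of BTs in automated planning and machine learning. Finally, we describe an extended set of tools that capture the behavior of stochastic BTs, where the outcomes of actions are described by probabilities. These tools enable the computation of both success probabilities and time to completion.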
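
To make the switching structure concrete, here is a minimal sketch of the two classic composite nodes, Sequence and Fallback, ticked from the root. The class names, the `make_tree` helper, and the battery scenario are illustrative assumptions, not code from the book.

```python
from enum import Enum
from typing import Callable, List


class Status(Enum):
    SUCCESS = "success"
    FAILURE = "failure"
    RUNNING = "running"


class Leaf:
    """Wraps a callable that returns a Status when ticked."""

    def __init__(self, name: str, fn: Callable[[], Status]) -> None:
        self.name = name
        self.fn = fn

    def tick(self) -> Status:
        return self.fn()


class Sequence:
    """Ticks children in order; stops at the first child that is not SUCCESS."""

    def __init__(self, children: List) -> None:
        self.children = children

    def tick(self) -> Status:
        for child in self.children:
            status = child.tick()
            if status is not Status.SUCCESS:
                return status
        return Status.SUCCESS


class Fallback:
    """Ticks children in order; stops at the first child that is not FAILURE."""

    def __init__(self, children: List) -> None:
        self.children = children

    def tick(self) -> Status:
        for child in self.children:
            status = child.tick()
            if status is not Status.FAILURE:
                return status
        return Status.FAILURE


def make_tree(battery_ok: bool) -> Fallback:
    # Hypothetical robot task: recharge if the battery is low, otherwise patrol.
    recharge = Sequence([
        Leaf("battery_low", lambda: Status.FAILURE if battery_ok else Status.SUCCESS),
        Leaf("go_charge", lambda: Status.RUNNING),
    ])
    patrol = Leaf("patrol", lambda: Status.SUCCESS)
    return Fallback([recharge, patrol])


if __name__ == "__main__":
    print(make_tree(battery_ok=True).tick())   # Status.SUCCESS (patrol runs)
    print(make_tree(battery_ok=False).tick())  # Status.RUNNING (charging)
```

Because every tick restarts from the root, a change in the battery condition switches the active task on the next tick, which is the reactive behavior the abstract refers to. Each subtree is self-contained, which is the modular part.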

Monte Carlo Tree Search: A Review of Recent Modifications and Applications

Maciej Świechowski , Konrad Godlewski , Bartosz Sawicki , Jacek Mańdziuk

Category: Artificial Intelligence | Machine Learning

2021-03-08

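Monte Carlo Tree Search (MCTS) is a powerful approach to designing game-playing bots or solving sequential decision problems. The method relies on an intelligent tree search that balances exploration and exploitation. MCTS performs random sampling in the form of simulations and stores statistics of actions to make more educated choices in each subsequent iteration. The method has become the state of the art for combinatorial games. However, in more complex games (e.g., those with a high branching factor or real-time ones), as well as in various practical domains (e.g., transportation, scheduling, or security), an efficient MCTS application often requires problem-dependent modification or integration with other techniques. Such domain-specific modifications and hybrid approaches are the main focus of this survey. The last major MCTS survey was published in 2012. Contributions that have appeared since its release are of particular interest for this review.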
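
One common way to balance exploration and exploitation from stored action statistics is the UCB1 rule used in the UCT variant of MCTS. The abstract does not name a specific selection rule, so treat the choice of UCB1, the function name `ucb1_select`, and the default constant `c` as assumptions for this sketch.

```python
import math
from typing import Dict, Hashable


def ucb1_select(
    visits: Dict[Hashable, int],
    total_reward: Dict[Hashable, float],
    c: float = math.sqrt(2),
) -> Hashable:
    """Pick the child action with the highest UCB1 score.

    Unvisited actions are tried first. Otherwise the score is the mean reward
    (exploitation) plus a bonus that shrinks as the action is visited more
    often (exploration).
    """
    for action, n in visits.items():
        if n == 0:
            return action
    parent_visits = sum(visits.values())

    def score(action: Hashable) -> float:
        n = visits[action]
        mean = total_reward[action] / n
        return mean + c * math.sqrt(math.log(parent_visits) / n)

    return max(visits, key=score)


if __name__ == "__main__":
    visits = {"a": 10, "b": 1}
    rewards = {"a": 6.0, "b": 0.0}
    print(ucb1_select(visits, rewards))         # "b": rarely tried, large bonus
    print(ucb1_select(visits, rewards, c=0.0))  # "a": pure exploitation
```

Setting `c` to zero removes the exploration bonus, so the rule degenerates into always picking the action with the best mean so far.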

Operations for Autonomous Spacecraft

Rebecca Castano , Tiago Vaquero , Federico Rossi , Vandi Verma , Ellen Van Wyk , Dan Allard , Bennett Huffmann , Erin M. Murphy , Nihal Dhamani , Robert A. Hewitt

Category: Robotics | Artificial Intelligence

2021-11-22

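Onboard autonomy technologies, such as planning and scheduling, identification of scientific targets, and content-based data summarization, will lead to exciting new space science missions. However, the challenge of operating missions with such onboard autonomous capabilities has not been studied at a level of detail sufficient for consideration in mission concepts. These autonomy capabilities will require changes to current operations processes, practices, and tools. We have developed a case study to assess the changes needed to enable operators and scientists to operate an autonomous spacecraft by facilitating a common model between the ground personnel and the onboard algorithms. We assess the new operations tools and workflows needed for operators and scientists to convey their intent to the spacecraft, and to reconstruct and explain the decisions made onboard and the state of the spacecraft. Mock-ups of these tools were used in a user study to understand the effectiveness of the processes and tools in enabling a shared framework of understanding, and the ability of operators and scientists to effectively achieve mission science objectives.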

Team CERBERUS Wins the DARPA Subterranean Challenge: Technical Overview and Lessons Learned

Marco Tranzatto , Mihir Dharmadhikari , Lukas Bernreiter , Marco Camurri , Shehryar Khattak , Frank Mascarich , Patrick Pfreundschuh , David Wisth , Samuel Zimmermann , Mihir Kulkarni

Category: Robotics

2022-07-11

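This article presents the CERBERUS robotic system-of-systems, which won the DARPA Subterranean Challenge Final Event, a competition centered on robotic autonomy. Due to their geometric complexity, degraded perceptual conditions combined with a lack of GPS support, austere navigation conditions, and denied communications, subterranean settings make autonomous operation particularly demanding. In response to this challenge, we developed the CERBERUS system, which exploits the synergy of legged and flying robots, coupled with robust control, especially for overcoming perilous terrain; multi-modal and multi-robot perception for localization and mapping under sensor degradation; and resilient autonomy through unified exploration path planning and local motion planning that reflects robot-specific limitations. Based on its ability to explore diverse underground environments and its high-level command and control, CERBERUS demonstrated efficient exploration, reliable detection of objects of interest, and accurate mapping. In this article, we report results from both the preliminary runs and the final Prize Round of the DARPA Subterranean Challenge, and discuss the highlights and challenges faced, along with lessons learned for the benefit of the community.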

Flexible Supervised Autonomy for Exploration in Subterranean Environments

Harel Biggie , Eugene R. Rush , Danny G. Riley , Shakeeb Ahmad , Michael T. Ohradzansky , Kyle Harlow , Michael J. Miles , Daniel Torres , Steve McGuire , Eric W. Frew

Category: Robotics

2023-01-02

While the capabilities of autonomous systems have been steadily improving in recent years, these systems still struggle to rapidly explore previously unknown environments without the aid of GPS-assisted navigation. The DARPA Subterranean (SubT) Challenge aimed to fast track the development of autonomous exploration systems by evaluating their performance in real-world underground search-and-rescue scenarios. Subterranean environments present a plethora of challenges for robotic systems, such as limited communications, complex topology, visually-degraded sensing, and harsh terrain. The presented solution enables long-term autonomy with minimal human supervision by combining a powerful and independent single-agent autonomy stack, with higher level mission management operating over a flexible mesh network. The autonomy suite deployed on quadruped and wheeled robots was fully independent, freeing the human supervision to loosely supervise the mission and make high-impact strategic decisions. We also discuss lessons learned from fielding our system at the SubT Final Event, relating to vehicle versatility, system adaptability, and re-configurable communications.

Active Sensing for Search and Tracking: A Review

Luca Varotto , Angelo Cenedese , Andrea Cavallaro

Category: Robotics

2021-12-04

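Active Position Estimation (APE) is the task of localizing one or more targets using one or more sensing platforms. APE is a key task for search and rescue missions, wildlife monitoring, source term estimation, and collaborative mobile robotics. Success in APE depends on the level of cooperation among the sensing platforms, their number, their degrees of freedom, and the quality of the information gathered. APE control laws enable active sensing by satisfying either purely exploitative or purely explorative criteria. The former minimizes the uncertainty of the position estimate; the latter drives the platform closer to completing its task. In this paper, we define the main elements of APE in order to systematically classify and critically discuss the state of the art in this domain. We also propose a reference framework as a formalism for classifying APE-related solutions. Overall, this survey explores the principal challenges and envisages the main research directions in the field of autonomous perception systems for localization tasks. It is also beneficial for promoting the development of robust active sensing methods for search and tracking applications.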

System Resilience through Health Monitoring and Reconfiguration

Ion Matei , Wiktor Piotrowski , Alexandre Perez , Johan de Kleer , Jorge Tierno , Wendy Mungovan , Vance Turnewitsch

Category: Artificial Intelligence

2022-08-30

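We demonstrate an end-to-end framework to improve the resilience of man-made systems to unforeseen events. The framework is based on a physics-based digital twin model and three modules responsible for real-time fault diagnosis, prognostics, and reconfiguration. The fault diagnosis module uses model-based diagnosis algorithms to detect and isolate faults, and generates interventions in the system to disambiguate uncertain diagnosis solutions. We scale the fault diagnosis algorithm to the required real-time performance by using parallelization and surrogate models of the physics-based digital twin. The prognostics module tracks fault progression and trains online degradation models to compute the remaining useful life of system components. In addition, we use the degradation models to assess the impact of fault progression on operational requirements. The reconfiguration module uses PDDL-based planning with semantic attachments to adjust the system controls so that the impact of faults on system operation is minimized. We define a resilience metric and use the example of a fuel system model to demonstrate how the metric improves with our framework.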

RLOps: Development Life-cycle of Reinforcement Learning Aided Open RAN

Peizheng Li , Jonathan Thomas , Xiaoyang Wang , Ahmed Khalil , Abdelrahim Ahmad , Rui Inacio , Shipra Kapoor , Arjun Parekh , Angela Doufexi , Arman Shojaeifard

Category: Machine Learning

2021-11-12

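Radio access network (RAN) technologies continue to witness massive growth, with Open RAN gaining the most recent momentum. In the O-RAN specifications, the RAN intelligent controller (RIC) serves as an automation host. This article introduces principles of machine learning (ML), in particular reinforcement learning (RL), relevant to the O-RAN stack. Furthermore, we review state-of-the-art research in wireless networks and cast it onto the RAN framework and the hierarchy of the O-RAN architecture. We provide a taxonomy of the challenges faced by ML/RL models throughout the development life-cycle: from system specification to production deployment (data acquisition, model design, testing and management, etc.). To address these challenges, we integrate a set of existing MLOps principles with the unique characteristics that arise when RL agents are considered. This paper discusses a systematic life-cycle model development, testing, and validation pipeline, termed RLOps. We discuss all fundamental parts of RLOps, including model specification, development and distillation, production environment serving, operations monitoring, safety/security, and the data engineering platform. Based on these principles, we propose best practices for achieving an automated and reproducible model development process.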

Machine Learning for Combinatorial Optimization: a Methodological Tour d'Horizon

Yoshua Bengio , Andrea Lodi , Antoine Prouvost

Category:

2018-11-15

This paper surveys the recent attempts, both from the machine learning and operations research communities, at leveraging machine learning to solve combinatorial optimization problems. Given the hard nature of these problems, state-of-the-art algorithms rely on handcrafted heuristics for making decisions that are otherwise too expensive to compute or mathematically not well defined. Thus, machine learning looks like a natural candidate to make such decisions in a more principled and optimized way. We advocate for pushing further the integration of machine learning and combinatorial optimization and detail a methodology to do so. A main point of the paper is seeing generic optimization problems as data points and inquiring what is the relevant distribution of problems to use for learning on a given task.

How to Certify Machine Learning Based Safety-critical Systems? A Systematic Literature Review

Florian Tambon , Gabriel Laberge , Le An , Amin Nikanjam , Paulina Stevia Nouwou Mindom , Yann Pequignot , Foutse Khomh , Giulio Antoniol , Ettore Merlo , François Laviolette

Category: Machine Learning

2021-07-26

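Context: Machine Learning (ML) has been at the heart of many innovations over the past years. However, including it in so-called "safety-critical" systems, such as automotive or aeronautic systems, has proven to be very challenging, since the paradigm shift that ML brings completely changes traditional certification approaches. Objective: This paper aims to elucidate the challenges related to the certification of ML-based safety-critical systems, as well as the solutions proposed in the literature to tackle them, answering the question "How to Certify Machine Learning Based Safety-critical Systems?" Method: We conduct a Systematic Literature Review (SLR) of research papers published between 2015 and 2020, covering topics related to the certification of ML systems. In total, we identified 217 papers covering topics considered to be the main pillars of ML certification: robustness, uncertainty, explainability, verification, safe reinforcement learning, and direct certification. We analyzed the main trends and problems of each sub-field and provided summaries of the extracted papers. Results: The SLR results highlight the enthusiasm of the community for this topic, as well as the lack of diversity in terms of datasets and model types. They also emphasize the need to further develop connections between academia and industry to deepen the study of the domain. Finally, they illustrate the necessity of building connections between the main pillars mentioned above, which are for now mainly studied separately. Conclusion: We highlight the current efforts deployed to enable the certification of ML-based software systems, and discuss some future research directions.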

Deep Learning-Driven Edge Video Analytics: A Survey

Renjie Xu , Saiedeh Razavi , Rong Zheng

Category: Computer Vision | Machine Learning

2022-11-28

Video, as a key driver in the global explosion of digital information, can create tremendous benefits for human society. Governments and enterprises are deploying innumerable cameras for a variety of applications, e.g., law enforcement, emergency management, traffic control, and security surveillance, all facilitated by video analytics (VA). This trend is spurred by the rapid advancement of deep learning (DL), which enables more precise models for object classification, detection, and tracking. Meanwhile, with the proliferation of Internet-connected devices, massive amounts of data are generated daily, overwhelming the cloud. Edge computing, an emerging paradigm that moves workloads and services from the network core to the network edge, has been widely recognized as a promising solution. The resulting new intersection, edge video analytics (EVA), begins to attract widespread attention. Nevertheless, only a few loosely-related surveys exist on this topic. A dedicated venue for collecting and summarizing the latest advances of EVA is highly desired by the community. Besides, the basic concepts of EVA (e.g., definition, architectures, etc.) are ambiguous and neglected by these surveys due to the rapid development of this domain. A thorough clarification is needed to facilitate a consensus on these concepts. To fill in these gaps, we conduct a comprehensive survey of the recent efforts on EVA. In this paper, we first review the fundamentals of edge computing, followed by an overview of VA. The EVA system and its enabling techniques are discussed next. In addition, we introduce prevalent frameworks and datasets to aid future researchers in the development of EVA systems. Finally, we discuss existing challenges and foresee future research directions. We believe this survey will help readers comprehend the relationship between VA and edge computing, and spark new ideas on EVA.

Partially Observable Markov Decision Processes in Robotics: A Survey

Mikko Lauri , David Hsu , Joni Pajarinen

Category: Robotics | Artificial Intelligence

2022-09-21

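Noisy sensing, imperfect control, and environment changes are defining characteristics of many real-world robot tasks. The partially observable Markov decision process (POMDP) provides a principled mathematical framework for modeling and solving robot decision and control tasks under uncertainty. Over the last decade, it has seen many successful applications, spanning localization and navigation, search and tracking, autonomous driving, multi-robot systems, manipulation, and human-robot interaction. This survey aims to bridge the gap between the development of POMDP models and algorithms at one end and applications to diverse robot decision tasks at the other. It analyzes the characteristics of these tasks and connects them with the mathematical and algorithmic properties of the POMDP framework for effective modeling and solution. For practitioners, the survey provides some of the key task characteristics for deciding when and how to apply POMDPs to robot tasks successfully. For POMDP algorithm designers, the survey provides new insights into the unique challenges of applying POMDPs to robot systems and points to promising new directions for further research.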
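
At the core of the POMDP framework is the belief, a probability distribution over hidden states that is updated after each action and observation. The sketch below shows one such Bayes-filter step for a small discrete model; the function name `belief_update`, the dictionary layout, and the two-state example are assumptions for illustration and do not come from the survey.

```python
from typing import Dict, Hashable

State = Hashable


def belief_update(
    belief: Dict[State, float],
    transition: Dict[State, Dict[State, float]],
    obs_likelihood: Dict[State, float],
) -> Dict[State, float]:
    """One Bayes-filter step for a discrete POMDP with the action already fixed.

    transition[s][s2] is P(s2 | s, a) for the chosen action a, and
    obs_likelihood[s2] is P(o | s2, a) for the observation o that was received.
    Every reachable state is assumed to appear as a key of obs_likelihood.
    """
    # Prediction: push the belief through the transition model.
    predicted = {s2: 0.0 for s2 in obs_likelihood}
    for s, p in belief.items():
        for s2, t in transition[s].items():
            predicted[s2] += p * t
    # Correction: weight by the observation likelihood and renormalize.
    unnormalized = {s2: obs_likelihood[s2] * predicted[s2] for s2 in predicted}
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under the belief")
    return {s2: v / total for s2, v in unnormalized.items()}


if __name__ == "__main__":
    # Hypothetical two-state example: the robot stays put and senses "left".
    prior = {"left": 0.5, "right": 0.5}
    stay = {"left": {"left": 1.0}, "right": {"right": 1.0}}
    sensed_left = {"left": 0.9, "right": 0.1}
    print(belief_update(prior, stay, sensed_left))
```

A noisy sensor never collapses the belief to a single state, which is why planning is done over beliefs rather than over states.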

Enabling Integration and Interaction for Decentralized Artificial Intelligence in Airline Disruption Management

Kolawole Ogunsina , Daniel DeLaurentis

Category: Artificial Intelligence | Neural and Evolutionary Computing

2021-04-07

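Airline disruption management traditionally seeks to address three problem dimensions: aircraft scheduling, crew scheduling, and passenger scheduling. However, current efforts have, at most, addressed only the first two problem dimensions concurrently, and do not account for the propagating effects that uncertain scheduling outcomes in one dimension can have on another. In addition, existing approaches to airline disruption management include human specialists who decide on the necessary corrective actions for the airline schedule. However, human specialists are limited in their ability to process the large amounts of information needed to make robust decisions that address all problem dimensions simultaneously during disruption management. Therefore, there is a need to augment the decision-making capabilities of human specialists with quantitative and qualitative tools that can rationalize the complex interactions among all dimensions of airline disruption management and provide objective insights to specialists in the airline operations control center. To that end, we provide a discussion and demonstration of a paradigm for expeditious, simultaneously integrated recovery of all problem dimensions during airline disruption management, through an intelligent multi-agent system that employs principles from artificial intelligence and distributed analytics technology. Results indicate that our paradigm for simultaneously integrated recovery executes in polynomial time and is effective when all flights in the airline route network are disrupted.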

Imitation learning: A survey of learning methods

Category:

Imitation learning techniques aim to mimic human behavior in a given task. An agent (a learning machine) is trained to perform a task from demonstrations by learning a mapping between observations and actions. The idea of teaching by imitation has been around for many years, however, the field is gaining attention recently due to advances in computing and sensing as well as rising demand for intelligent applications. The paradigm of learning by imitation is gaining popularity because it facilitates teaching complex tasks with minimal expert knowledge of the tasks. Generic imitation learning methods could potentially reduce the problem of teaching a task to that of providing demonstrations; without the need for explicit programming or designing reward functions specific to the task. Modern sensors are able to collect and transmit high volumes of data rapidly, and processors with high computational power allow fast processing that maps the sensory data to actions in a timely manner. This opens the door for many potential AI applications that require real-time perception and reaction such as humanoid robots, self-driving vehicles, human computer interaction and computer games to name a few. However, specialized algorithms are needed to effectively and robustly learn models as learning by imitation poses its own set of challenges. In this paper, we survey imitation learning methods and present design options in different steps of the learning process. We introduce a background and motivation for the field as well as highlight challenges specific to the imitation problem. Methods for designing and evaluating imitation learning tasks are categorized and reviewed. Special attention is given to learning methods in robotics and games as these domains are the most popular in the literature and provide a wide array of problems and methodologies. 
We extensively discuss combining imitation learning approaches using different sources and methods, as well as incorporating other motion learning methods to enhance imitation. We also discuss the potential impact on industry, present major applications and highlight current and future research directions.

A Review: Challenges and Opportunities for Artificial Intelligence and Robotics in the Offshore Wind Sector

Daniel Mitchell , Jamie Blanche , Sam Harper , Theodore Lim , Ranjeetkumar Gupta , Osama Zaki , Wenshuo Tang , Valentin Robu , Simon Watson , David Flynn

Category: Robotics

2021-12-13

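A global trend toward larger wind turbines and greater distances from shore is emerging within the rapidly growing offshore wind farm market. In the UK, the offshore wind sector produced its highest amount of electricity in 2019, a 19.6% increase on the previous year. The UK is currently set to increase production further, targeting a 74.7% increase in installed turbine capacity, as reflected in recent Crown Estate leasing rounds. With such tremendous growth, the sector is now looking to Robotics and Artificial Intelligence (RAI) to tackle lifecycle service barriers and support sustainable and profitable offshore wind energy production. Today, RAI applications are predominantly used to support short-term objectives in operation and maintenance. Moving forward, however, RAI has the potential to play a critical role throughout the full lifecycle of offshore wind infrastructure, from surveying, planning, design, logistics, operational support, and training through to decommissioning. This paper presents one of the first systematic reviews of RAI for the offshore renewable energy sector. The state of the art in RAI is analyzed with respect to offshore energy requirements, from both industry and academia, in terms of current and future needs. Our review also includes a detailed evaluation of the investment, regulation, and skills development required to support RAI. The key trends identified through a detailed analysis of patent and academic publication databases provide insights into barriers such as the certification of autonomous platforms for safety compliance and reliability, digital architectures for scalability in autonomous fleets, adaptive mission planning for resilient resident operations, and the optimization of human-machine interaction for trusted partnerships between people and autonomous assistants.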

A Comprehensive Survey on the Convergence of Vehicular Social Networks and Fog Computing

Farimasadat Miri , Richard Pazzi

Category: Artificial Intelligence

2021-11-30

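In recent years, the number of IoT devices has been growing fast, which makes managing, storing, analyzing, and making decisions from the raw data of different IoT devices a challenging task, especially for delay-sensitive applications. In a vehicular network (VANET) environment, the dynamic nature of vehicles makes the current open research issues even more challenging, because frequent topology changes can lead to disconnections between vehicles. To this end, a number of research works have been proposed in the context of cloud and fog computing over the 5G infrastructure. On the other hand, a variety of research proposals aim to extend the connection time between vehicles. Vehicular Social Networks (VSNs) have been defined to reduce the burden of connection time between vehicles. This survey paper first provides the necessary background information and definitions on fog, cloud, and related paradigms such as 5G and SDN. It then introduces the reader to Vehicular Social Networks, the different metrics, and the main differences between VSNs and online social networks. Finally, this survey investigates related works in the context of VANETs that have demonstrated different architectures for addressing different issues in fog computing. Moreover, it provides a categorization of the different approaches, discusses the required metrics in the context of fog and cloud, and compares them with Vehicular Social Networks. A comparison of the relevant related works is discussed along with new research challenges and trends in the domain of VSNs and fog computing.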

Progress and summary of reinforcement learning on energy management of MPS-EV

Jincheng Hu , Yang Lin , Liang Chu , Zhuoran Hou , Jihan Li , Jingjing Jiang , Yuanjian Zhang

Category: Machine Learning

2022-11-08

The high emission and low energy efficiency caused by internal combustion engines (ICE) have become unacceptable under environmental regulations and the energy crisis. As a promising alternative solution, multi-power source electric vehicles (MPS-EVs) introduce different clean energy systems to improve powertrain efficiency. The energy management strategy (EMS) is a critical technology for MPS-EVs to maximize efficiency, fuel economy, and range. Reinforcement learning (RL) has become an effective methodology for the development of EMS. RL has received continuous attention and research, but there is still a lack of systematic analysis of the design elements of RL-based EMS. To this end, this paper presents an in-depth analysis of the current research on RL-based EMS (RL-EMS) and summarizes the design elements of RL-based EMS. This paper first summarizes the previous applications of RL in EMS from five aspects: algorithm, perception scheme, decision scheme, reward function, and innovative training method. The contribution of advanced algorithms to the training effect is shown, the perception and control schemes in the literature are analyzed in detail, different reward function settings are classified, and innovative training methods with their roles are elaborated. Finally, by comparing the development routes of RL and RL-EMS, this paper identifies the gap between advanced RL solutions and existing RL-EMS. Finally, this paper suggests potential development directions for implementing advanced artificial intelligence (AI) solutions in EMS.
