Intelligent Paper Notes

Reinforcement Learning in Computing and Network Convergence Orchestration

Aidong Yang , Mohan Wu , Boquan Cheng , Xiaozhou Ye , Ye Ouyang

Category: Artificial Intelligence

2022-09-22

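As computing power becomes the core productive force of the digital economy, the concept of computing and network convergence (CNC), in which network and computing resources are dynamically scheduled and allocated according to user needs, has been proposed and has attracted wide attention. Based on the properties of each task, the network orchestration plane must flexibly deploy the task to a suitable computing node and arrange a path to that node. This is an orchestration problem that involves both resource scheduling and path arrangement. Because CNC is relatively new, this paper first reviews existing research and applications on CNC. We then design a CNC orchestration method based on reinforcement learning (RL), which is the first attempt of its kind, that flexibly allocates and schedules computing and network resources with the aim of high profit and low latency. We use multiple factors to define the optimization objective, so that in our experiments the orchestration strategy is optimized for overall performance across several aspects, such as cost, profit, latency, and system overload. The experiments show that the proposed RL-based method achieves higher profit and lower latency than the greedy method, random selection, and the balanced-resource method. We demonstrate that RL is suitable for CNC orchestration. This paper opens up the application of RL to CNC orchestration.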
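
The multi-factor objective described above can be pictured as a single scalar reward that combines profit, cost, latency, and an overload penalty. The sketch below is illustrative only: the function name `orchestration_reward`, the weights, and the linear form are assumptions, not the formulation used in the paper.

```python
def orchestration_reward(profit, cost, latency, overloaded,
                         w_profit=1.0, w_cost=1.0, w_latency=0.5, w_overload=2.0):
    """Hypothetical scalar reward for placing one task on one node.

    The weights and the linear combination are assumptions for illustration.
    """
    penalty = w_overload if overloaded else 0.0
    return w_profit * profit - w_cost * cost - w_latency * latency - penalty


# Compare two candidate placements for the same task.
fast_node = orchestration_reward(profit=10.0, cost=4.0, latency=2.0, overloaded=False)
cheap_node = orchestration_reward(profit=10.0, cost=2.0, latency=8.0, overloaded=True)
```

With these assumed weights, the cheaper node loses because its latency and overload penalty outweigh the cost saving; changing the weights changes which placement an agent would prefer.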

Translated by Google Translate

Applications of Multi-Agent Reinforcement Learning in Future Internet: A Comprehensive Survey

Tianxu Li , Kun Zhu , Nguyen Cong Luong , Dusit Niyato , Qihui Wu , Yang Zhang , Bing Chen

Category: Artificial Intelligence | Machine Learning

2021-10-26

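The future Internet involves several emerging technologies, such as 5G and beyond-5G networks, vehicular networks, unmanned aerial vehicle (UAV) networks, and the Internet of Things (IoT). Moreover, the future Internet is becoming heterogeneous and decentralized, with a large number of network entities involved. Each entity may need to make local decisions to improve network performance in dynamic and uncertain network environments. Standard learning algorithms such as single-agent reinforcement learning (RL) and deep reinforcement learning (DRL) have recently been used to let each network entity, acting as an agent, adaptively learn an optimal decision-making policy by interacting with an unknown environment. However, such algorithms fail to model cooperation or competition among network entities and simply treat the other entities as part of the environment, which can lead to the non-stationarity problem. Multi-agent reinforcement learning (MARL) allows each network entity to learn its optimal policy by observing not only the environment but also the policies of other entities. As a result, MARL can significantly improve the learning efficiency of network entities, and it has recently been used to solve various problems in emerging networks. In this paper, we therefore review the applications of MARL in emerging networks. In particular, we provide a tutorial on MARL and a comprehensive survey of its applications in the next-generation Internet. We first introduce single-agent RL and MARL. We then review a number of applications of MARL to emerging problems in the future Internet. These problems include network access, transmit power control, computation offloading, content caching, packet routing, trajectory design for UAV-aided networks, and network security.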

The state-of-the-art review on resource allocation problem using artificial intelligence methods on various computing paradigms

Javad Hassannataj Joloudari , Sanaz Mojrian , Hamid Saadatfar , Issa Nodehi , Fatemeh Fazl , Sahar Khanjani shirkharkolaie , Roohallah Alizadehsani , H M Dipu Kabir , Ru-San Tan , U Rajendra Acharya

Category: Artificial Intelligence

2022-03-23

With the rapid growth of information generated by smart devices, improving the quality of human life calls for several computing paradigms, including the Internet of Things, fog, and cloud. Among these three paradigms, the cloud computing paradigm, as an emerging technology, adds cloud layer services to the edge of the network so that resource allocation operations occur close to the end user, reducing resource processing time and network traffic overhead. Hence, resource allocation is a challenge for providers that want to offer a suitable platform across these computing paradigms. In general, resource allocation approaches fall into two groups: auction-based methods (goals: increased profit for service providers, increased user satisfaction and usability) and optimization-based methods (energy, cost, network exploitation, runtime, reduction of time delay). Drawing on the latest scientific achievements, this paper provides a comprehensive literature study (CLS) of artificial intelligence methods for optimization-based resource allocation, without considering auction-based methods, in various computing environments such as cloud computing, vehicular fog computing, wireless, IoT, vehicular networks, 5G networks, vehicular cloud architecture, machine-to-machine (M2M) communication, train-to-train (T2T) communication networks, and peer-to-peer (P2P) networks. Because deep learning methods are among the most important artificial intelligence methods for resource allocation problems, this paper also covers deep-learning-based resource allocation approaches in these environments, such as deep reinforcement learning, Q-learning, reinforcement learning, and online learning, as well as classical learning methods such as Bayesian learning, Cummins clustering, and Markov decision processes.

Reinforcement Learning for Cognitive Delay/Disruption Tolerant Network Node Management in an LEO-based Satellite Constellation

Xue Sun , Changhao Li , Lei Yan , Suzhi Cao

Category: Artificial Intelligence | Machine Learning

2022-09-27

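In recent years, with the large-scale deployment of spacecraft and the growth of satellite onboard capabilities, the delay/disruption tolerant network (DTN) has emerged as a more robust communication protocol than TCP/IP under highly dynamic network conditions. DTN node buffer management remains an active research area, because current implementations of the DTN core protocol still rely on the assumption that every network node always has enough memory to store and forward bundles. In addition, classical queuing theory does not apply to the dynamic management of DTN node buffers. This paper therefore proposes a centralized approach that automatically manages cognitive DTN nodes in a low Earth orbit (LEO) satellite constellation, based on the advanced reinforcement learning (RL) strategy advantage actor-critic (A2C). The method explores training an intelligent agent in geosynchronous Earth orbit to manage all DTN nodes in the LEO satellite constellation. The goal of the A2C agent is to maximize the delivery success rate and minimize the cost of network resource consumption while taking node memory utilization into account. The agent can dynamically adjust the radio data rate and perform drop operations based on bundle priority. To measure how effectively A2C addresses the DTN node management problem in the LEO constellation scenario, the paper compares the trained agent's policy with two non-RL policies, a random policy and a standard policy. Experiments show that the A2C policy balances delivery success rate against cost and yields the highest reward and the lowest node memory utilization.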
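
As background for the A2C strategy named above: actor-critic methods of this family typically weight the policy update by an advantage estimate computed from the critic's value predictions. The following is a minimal sketch of the common one-step estimate; the function name and the default discount factor are assumptions, and the abstract does not say which estimator the paper uses.

```python
def one_step_advantage(reward, value_s, value_next, gamma=0.99, done=False):
    """One-step advantage estimate: r + gamma * V(s') - V(s).

    When the episode has ended (done=True) there is no next state to
    bootstrap from, so the gamma * V(s') term is dropped.
    """
    bootstrap = 0.0 if done else gamma * value_next
    return reward + bootstrap - value_s


# A positive advantage means the action did better than the critic expected.
adv_mid_episode = one_step_advantage(reward=1.0, value_s=0.5, value_next=1.0, gamma=0.9)
adv_terminal = one_step_advantage(reward=1.0, value_s=0.5, value_next=1.0, done=True)
```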

Beyond 5G Networks: Integration of Communication, Computing, Caching, and Control

Musbahu Mohammed Adam , Liqiang Zhao , Kezhi Wang , Zhu Han

Category: Machine Learning

2022-12-26

In recent years, the exponential proliferation of smart devices with their intelligent applications poses severe challenges on conventional cellular networks. Such challenges can be potentially overcome by integrating communication, computing, caching, and control (i4C) technologies. In this survey, we first give a snapshot of different aspects of the i4C, comprising background, motivation, leading technological enablers, potential applications, and use cases. Next, we describe different models of communication, computing, caching, and control (4C) to lay the foundation of the integration approach. We review current state-of-the-art research efforts related to the i4C, focusing on recent trends of both conventional and artificial intelligence (AI)-based integration approaches. We also highlight the need for intelligence in resources integration. Then, we discuss integration of sensing and communication (ISAC) and classify the integration approaches into various classes. Finally, we propose open challenges and present future research directions for beyond 5G networks, such as 6G.

DRL-M4MR: An Intelligent Multicast Routing Approach Based on DQN Deep Reinforcement Learning in SDN

Chenwei Zhao , Miao Ye , Xingsi Xue , Jianhui Lv , Qiuxiang Jiang , Yong Wang

Category: Artificial Intelligence

2022-07-31

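Traditional multicast routing methods have several problems when constructing a multicast tree, such as limited access to network state information, poor adaptability to dynamic and complex changes in the network, and inflexible data forwarding. To address these shortcomings, the optimal multicast routing problem in software-defined networking (SDN) is formulated as a multi-objective optimization problem, and DRL-M4MR, an intelligent multicast routing algorithm based on the deep Q-network (DQN) deep reinforcement learning (DRL) method, is designed to construct multicast trees in SDN. First, using the global view and control offered by SDN, the multicast tree state matrix, link bandwidth matrix, link delay matrix, and link packet loss rate matrix are designed as the state space of the DRL agent. Second, the action space of the agent consists of all links in the network, and the action selection strategy is designed to add links to the current multicast tree under four cases. Third, single-step and final reward functions are designed to guide the agent toward decisions that construct the optimal multicast tree. Experimental results show that, compared with existing algorithms, the multicast tree constructed by DRL-M4MR achieves better bandwidth, delay, and packet loss rate after training, and that the method makes more intelligent multicast routing decisions in dynamic network environments.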
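
The state and action spaces described above can be sketched as four stacked node-by-node matrices plus a list of candidate links. Everything below is a hedged illustration: the function names, the channel order, and the convention that a zero bandwidth entry means "no link" are assumptions, not details taken from the paper.

```python
def build_state(n_nodes, tree_links, bandwidth, delay, loss):
    """Stack four n x n matrices into one state: [tree, bandwidth, delay, loss]."""
    tree = [[0.0] * n_nodes for _ in range(n_nodes)]
    for u, v in tree_links:
        # Mark an undirected link that is already part of the multicast tree.
        tree[u][v] = tree[v][u] = 1.0
    return [tree, bandwidth, delay, loss]


def link_actions(bandwidth):
    """Action space: every link (u, v) with u < v and nonzero bandwidth."""
    n = len(bandwidth)
    return [(u, v) for u in range(n) for v in range(u + 1, n) if bandwidth[u][v] > 0]


# Toy 3-node line topology 0 - 1 - 2 with made-up link metrics.
bw = [[0, 10, 0], [10, 0, 5], [0, 5, 0]]
dl = [[0, 2, 0], [2, 0, 4], [0, 4, 0]]
ls = [[0, 0.01, 0], [0.01, 0, 0.02], [0, 0.02, 0]]
state = build_state(3, tree_links=[(0, 1)], bandwidth=bw, delay=dl, loss=ls)
actions = link_actions(bw)
```

In a DQN setting, a state like this would be fed to the Q-network, which would output one value per entry of `actions`.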

Distributed Machine Learning for UAV Swarms: Computing, Sensing, and Semantics

Yahao Ding , Zhaohui Yang , Quoc-Viet Pham , Zhaoyang Zhang , Mohammad Shikh-Bahaei

Category: Machine Learning | Artificial Intelligence

2023-01-03

Unmanned aerial vehicle (UAV) swarms are considered a promising technique for next-generation communication networks due to their flexibility, mobility, low cost, and ability to provide services collaboratively and autonomously. Distributed learning (DL) enables UAV swarms to intelligently provide communication services, multi-directional remote surveillance, and target tracking. In this survey, we first introduce several popular DL algorithms, such as federated learning (FL), multi-agent reinforcement learning (MARL), distributed inference, and split learning, and present a comprehensive overview of their applications for UAV swarms, such as trajectory design, power control, wireless resource allocation, user assignment, perception, and satellite communications. Then, we present several state-of-the-art applications of UAV swarms in wireless communication systems, such as reconfigurable intelligent surfaces (RIS), virtual reality (VR), and semantic communications, and discuss the problems and challenges that DL-enabled UAV swarms can solve in these applications. Finally, we describe open problems of using DL in UAV swarms and future research directions for DL-enabled UAV swarms. In summary, this survey provides a comprehensive overview of various DL applications for UAV swarms in a wide range of scenarios.

CLARA: A Constrained Reinforcement Learning Based Resource Allocation Framework for Network Slicing

Yongshuai Liu , Jiaxin Ding , Zhi-Li Zhang , Xin Liu

Category: Machine Learning

2021-11-16

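As mobile networks proliferate, we are seeing a strong diversification of services, which demands greater flexibility from existing networks. Network slicing has been proposed as a resource utilization solution for 5G and future networks to address this pressing need. In network slicing, dynamic resource orchestration and network slice management are essential for maximizing resource utilization. Unfortunately, this process is too complex for traditional approaches because accurate models are lacking and the underlying structure is dynamic and hidden. We formulate the problem as a constrained Markov decision process (CMDP) without knowledge of the models or hidden structures. We further propose to solve the problem with CLARA, a constrained reinforcement learning based resource allocation algorithm. In particular, we handle cumulative constraints with adaptive interior-point policy optimization and instantaneous constraints with a projection layer. Evaluations show that CLARA clearly outperforms baselines in resource allocation while guaranteeing service demands.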
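
To make the idea of a projection layer for instantaneous constraints concrete, the sketch below projects a raw allocation vector onto the set of non-negative allocations whose total does not exceed a shared capacity. This is one common choice under an assumed single capacity constraint; the abstract does not specify CLARA's actual constraint set or projection, and `project_to_capacity` is a hypothetical name.

```python
def project_to_capacity(x, capacity):
    """Euclidean projection of x onto {y : y >= 0, sum(y) <= capacity}.

    Assumes capacity > 0.
    """
    clipped = [max(v, 0.0) for v in x]
    if sum(clipped) <= capacity:
        # Clipping negatives is already enough to make the allocation feasible.
        return clipped
    # Otherwise the projection lies on the face sum(y) == capacity, so use the
    # standard sort-based simplex projection to find the shift theta.
    u = sorted(x, reverse=True)
    cumulative, theta = 0.0, 0.0
    for k, u_k in enumerate(u, start=1):
        cumulative += u_k
        t = (cumulative - capacity) / k
        if u_k - t > 0:
            theta = t
    return [max(v - theta, 0.0) for v in x]


# Three slices request 4, 2, and 1 units, but only 4 units are available.
feasible = project_to_capacity([4.0, 2.0, 1.0], capacity=4.0)
already_ok = project_to_capacity([1.0, -1.0], capacity=5.0)
```

Placing such a step after the policy output means every emitted allocation respects the capacity limit at every time step, which is the role the abstract assigns to the projection layer.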

Multi-Agent Deep Reinforcement Learning for Cost- and Delay-Sensitive Virtual Network Function Placement and Routing

Shaoyang Wang , Chau Yuen , Wei Ni , Guan Yong Liang , Tiejun Lv

Category: Artificial Intelligence | Machine Learning

2022-06-24

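This paper proposes an effective and novel method based on multi-agent deep reinforcement learning (MADRL) for solving joint virtual network function (VNF) placement and routing (P&R), in which multiple service requests with differentiated demands are served at the same time. The differentiated demands of the service requests are reflected in their delay-sensitivity and cost-sensitivity factors. We first formulate a VNF P&R problem that jointly minimizes a weighted sum of service delay and resource consumption cost, and which is NP-complete. The joint VNF P&R problem is then decoupled into two iterative subtasks: a placement subtask and a routing subtask. Each subtask consists of multiple concurrent, parallel sequential decision processes. By invoking the deep deterministic policy gradient method and multi-agent techniques, an MADRL-P&R framework is designed to carry out the two subtasks. A new joint reward and internal reward mechanism is proposed to match the goals and constraints of the placement and routing subtasks. We also propose a model retraining method based on parameter migration to handle changing network topologies. Experiments confirm that the proposed MADRL-P&R framework outperforms its alternatives in terms of service cost and delay and offers higher flexibility for personalized service demands. The parameter-migration-based model retraining method effectively accelerates convergence under moderate network topology changes.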

A Comprehensive Survey on the Convergence of Vehicular Social Networks and Fog Computing

Farimasadat Miri , Richard Pazzi

Category: Artificial Intelligence

2021-11-30

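In recent years, the number of IoT devices has grown rapidly, which makes managing, storing, analyzing, and making decisions on raw data from different IoT devices a challenging task, especially for delay-sensitive applications. In a vehicular network (VANET) environment, the dynamic nature of vehicles makes the current open research issues even more challenging, because frequent topology changes can lead to disconnections between vehicles. To this end, a number of research works have been proposed in the context of cloud and fog computing over 5G infrastructure. On the other hand, a variety of research proposals aim to extend the connection time between vehicles. Vehicular social networks (VSNs) have been defined to reduce the burden of connection time between vehicles. This survey first provides the necessary background information and definitions for fog, cloud, and related paradigms such as 5G and SDN. It then introduces the reader to vehicular social networks, the different metrics, and the main differences between VSNs and online social networks. Finally, the survey examines related work in the VANET context that presents different architectures for addressing various issues in fog computing. In addition, it provides a categorization of the different approaches, discusses the required metrics in the context of fog and cloud, and compares them with vehicular social networks. A comparison of the relevant related works is discussed along with new research challenges and trends in the domain of VSNs and fog computing.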

Deep Reinforcement Learning for Trajectory Path Planning and Distributed Inference in Resource-Constrained UAV Swarms

Marwan Dhuheir , Emna Baccour , Aiman Erbad , Sinan Sabeeh Al-Obaidi , Mounir Hamdi

Category: Machine Learning | Robotics

2022-12-21

The deployment flexibility and maneuverability of Unmanned Aerial Vehicles (UAVs) have increased their adoption in various applications, such as wildfire tracking and border monitoring. In many critical applications, UAVs capture images and other sensory data and then send the captured data to remote servers for inference and data processing tasks. However, this approach is not always practical in real-time applications due to connection instability, limited bandwidth, and end-to-end latency. One promising solution is to divide the inference requests into multiple parts (layers or segments), with each part being executed in a different UAV based on the available resources. Furthermore, some applications require the UAVs to traverse certain areas and capture incidents; thus, planning their paths becomes particularly critical to reduce the latency of the collaborative inference process. Specifically, planning the UAVs' trajectories can reduce the data transmission latency by communicating with devices in the same proximity while mitigating the transmission interference. This work aims to design a model for distributed collaborative inference requests and path planning in a UAV swarm while respecting the resource constraints due to the computational load and memory usage of the inference requests. The model is formulated as an optimization problem and aims to minimize latency. The formulated problem is NP-hard, so finding the optimal solution is quite complex; thus, this paper introduces a real-time and dynamic solution for online applications using deep reinforcement learning. We conduct extensive simulations and compare our results to state-of-the-art studies, demonstrating that our model outperforms the competing models.

Attention-Based Model and Deep Reinforcement Learning for Distribution of Event Processing Tasks

A. Mazayev , F. Al-Tam , N. Correia

Category: Machine Learning

2021-12-07

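Event processing is the cornerstone of the dynamic and responsive Internet of Things (IoT). Recent approaches in this area are based on representational state transfer (REST) principles, which allow event processing tasks to be placed on any device that follows the same principles. However, the tasks should be properly distributed among edge devices to ensure fair resource utilization and guarantee seamless execution. This article investigates the use of deep learning to distribute the tasks fairly. An attention-based neural network model is proposed to generate efficient load balancing solutions under different scenarios. The proposed model is based on the Transformer and Pointer Network architectures and is trained with an advantage actor-critic reinforcement learning algorithm. The model is designed to scale with the number of event processing tasks and the number of edge devices, without hyperparameter re-tuning or even retraining. Extensive experimental results show that the proposed model outperforms conventional heuristics on many key performance indicators. The generic design and the results obtained suggest that the proposed model could be applied to several other variations of the load balancing problem, which makes it an attractive option for real-world scenarios thanks to its scalability and efficiency.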

Deep Reinforcement Learning for Task Offloading in UAV-Aided Smart Farm Networks

Anne Catherine Nguyen , Turgay Pamuklu , Aisha Syed , W. Sean Kennedy , Melike Erol-Kantarci

Category: Artificial Intelligence

2022-09-15

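The fifth and sixth generations of wireless communication networks are enabling tools such as Internet of Things devices, unmanned aerial vehicles (UAVs), and artificial intelligence to improve the agricultural landscape by using a network of devices to monitor farmland automatically. Surveying a large area requires performing many image classification tasks within a specific period of time in order to prevent damage in the event of an incident such as a fire or flood. UAVs have limited energy and computing power and may not be able to perform all of the intensive image classification tasks locally within an appropriate amount of time. It is therefore assumed that the UAVs can partially offload their workload to nearby multi-access edge computing devices. The UAVs need a decision-making algorithm that decides where each task will be executed while also considering the time constraints and the energy levels of the other UAVs in the network. In this paper, we introduce a deep Q-learning (DQL) approach to solve this multi-objective problem. The proposed method is compared with Q-learning and three heuristic baselines, and simulation results show that our DQL-based method achieves comparable results in terms of the UAVs' remaining battery levels and the percentage of deadline violations. In addition, our method converges 13 times faster than Q-learning.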

Online Service Migration in Edge Computing with Incomplete Information: A Deep Recurrent Actor-Critic Method

Jin Wang , Jia Hu , Geyong Min , Qiang Ni , Tarek El-Ghazawi

Category: Machine Learning

2020-12-16

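Multi-access edge computing (MEC) is an emerging computing paradigm that extends cloud computing to the network edge to support resource-intensive applications on mobile devices. As a key problem in MEC, service migration must decide how to migrate user services so as to maintain quality of service as users roam between MEC servers with limited coverage and capacity. However, finding an optimal migration policy is intractable because of the dynamic MEC environment and user mobility. Many existing studies make centralized migration decisions based on complete system-level information, which is time-consuming and lacks the desired scalability. To address these challenges, we propose a novel learning-driven method that is user-centric and can make effective online migration decisions using incomplete system-level information. Specifically, the service migration problem is modeled as a partially observable Markov decision process (POMDP). To solve the POMDP, we design a new encoder network that combines a long short-term memory (LSTM) network and an embedding matrix to extract hidden information effectively, and we further propose a tailored off-policy actor-critic algorithm for efficient training. Extensive experimental results based on real-world mobility traces show that the new method consistently outperforms both heuristic and state-of-the-art learning-driven algorithms and achieves near-optimal results in various MEC scenarios.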

Progress and summary of reinforcement learning on energy management of MPS-EV

Jincheng Hu , Yang Lin , Liang Chu , Zhuoran Hou , Jihan Li , Jingjing Jiang , Yuanjian Zhang

Category: Machine Learning

2022-11-08

The high emissions and low energy efficiency of internal combustion engines (ICE) have become unacceptable under environmental regulations and the energy crisis. As a promising alternative solution, multi-power source electric vehicles (MPS-EVs) introduce different clean energy systems to improve powertrain efficiency. The energy management strategy (EMS) is a critical technology for MPS-EVs to maximize efficiency, fuel economy, and range. Reinforcement learning (RL) has become an effective methodology for the development of EMS. RL has received continuous attention and research, but there is still a lack of systematic analysis of the design elements of RL-based EMS. To this end, this paper presents an in-depth analysis of the current research on RL-based EMS (RL-EMS) and summarizes its design elements. This paper first summarizes the previous applications of RL in EMS from five aspects: algorithm, perception scheme, decision scheme, reward function, and innovative training method. The contribution of advanced algorithms to the training effect is shown, the perception and control schemes in the literature are analyzed in detail, different reward function settings are classified, and innovative training methods with their roles are elaborated. Then, by comparing the development routes of RL and RL-EMS, this paper identifies the gap between advanced RL solutions and existing RL-EMS. Finally, this paper suggests potential development directions for implementing advanced artificial intelligence (AI) solutions in EMS.

DeF-DReL: Systematic Deployment of Serverless Functions in Fog and Cloud environments using Deep Reinforcement Learning

Chinmaya Kumar Dehury , Shivananda Poojara , Shridhar Domanal , Satish Narayana Srirama

Category: Artificial Intelligence

2021-10-29

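Fog computing was introduced to mitigate the limitations of cloud computing by moving cloud resources closer to users. The fog environment makes its limited resources available to a large number of users for deploying their serverless applications, each composed of several serverless functions. One of the primary intentions behind introducing the fog environment is to meet the demands of latency-sensitive and location-sensitive serverless applications with its limited resources. Recent research focuses mainly on assigning the maximum amount of resources to such applications from fog nodes, without taking full advantage of the cloud environment. This has a negative impact on the ability to provide resources to the largest possible number of connected users. To address this issue, in this paper we investigate the optimal percentage of a user's request that should be fulfilled by the fog and by the cloud. As a result, we propose DeF-DReL, a systematic deployment of serverless functions in fog and cloud environments using deep reinforcement learning, which takes into account several real-life parameters, such as the distance and latency of users from nearby fog nodes, the priority of users, and the priority of serverless applications and their resource demands. The performance of the DeF-DReL algorithm is further compared with recent related algorithms. The simulation and comparison results clearly show its advantage over the other algorithms and its applicability to real-life scenarios.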

Digital Twin-Assisted Efficient Reinforcement Learning for Edge Task Scheduling

Xiucheng Wang , Longfei Ma , Haocheng Li , Zhisheng Yin , Tom. Luan , Nan Cheng

Category: Machine Learning | Artificial Intelligence

2022-08-02

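Task scheduling is a critical problem when a user offloads multiple different tasks to an edge server. When a user has multiple tasks to offload, only one task can be transmitted to the server at a time, and the server processes tasks in the order of transmission, the problem is NP-hard. Traditional optimization methods have difficulty obtaining the optimal solution quickly, while approaches based on reinforcement learning (RL) face the challenges of an excessively large action space and slow convergence. In this paper, we propose a digital twin (DT)-assisted, RL-based task scheduling method to improve the performance and convergence of RL. We use the DT to simulate the outcomes of different decisions made by the agent, so that one agent can try multiple actions at a time or, similarly, multiple agents can interact with the environment in parallel within the DT. In this way, the DT significantly improves the exploration efficiency of RL, so that RL converges faster and is less likely to get stuck in local optima. In particular, two algorithms are designed to make task scheduling decisions: DT-assisted asynchronous Q-learning (DTAQL) and DT-assisted exploring Q-learning (DTEQL). Simulation results show that both algorithms significantly improve the convergence speed of Q-learning by increasing exploration efficiency.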
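
The idea of letting one agent try several actions at once through a digital twin can be sketched with tabular Q-learning: instead of updating a single state-action pair per step, the agent queries a simulator for every candidate action and updates all of them. The code below is a minimal illustration, not DTAQL or DTEQL themselves; `simulate`, `toy_twin`, and the state and action names are hypothetical.

```python
def dt_assisted_q_update(q, state, actions, simulate, alpha=0.5, gamma=0.9):
    """Apply one Q-learning update to every action available in `state`.

    `simulate(state, action)` stands in for a digital-twin query and returns
    (reward, next_state); next_state is None when the episode ends.
    """
    for action in actions:
        reward, next_state = simulate(state, action)
        best_next = 0.0
        if next_state is not None:
            best_next = max(q.get((next_state, a), 0.0) for a in actions)
        old = q.get((state, action), 0.0)
        q[(state, action)] = old + alpha * (reward + gamma * best_next - old)


def toy_twin(state, action):
    """Made-up twin: sending the short task first earns the larger reward."""
    return (1.0 if action == "short_first" else 0.2), None


q_table = {}
dt_assisted_q_update(q_table, "start", ["short_first", "long_first"], toy_twin)
```

After a single call, both actions in the `start` state already have value estimates, whereas plain Q-learning would have explored only one of them.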

Multi-Objective Provisioning of Network Slices using Deep Reinforcement Learning

Chien-Cheng Wu , Vasilis Friderikos , Cedomir Stefanovic

Category: Machine Learning

2022-07-27

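Network slicing (NS) is crucial for efficiently enabling divergent network applications in next-generation networks. Nonetheless, the complex quality of service (QoS) requirements and the diversity and heterogeneity of network services lead to high computation times for network slice provisioning (NSP) optimization. Traditional optimization methods struggle to meet the low latency and high reliability required by network applications. To this end, we model real-time NSP as an online network slice provisioning (ONSP) problem. Specifically, we formulate the ONSP problem as an online multi-objective integer programming optimization (MOIPO) problem. We then approximate the solution of the MOIPO problem by applying the proximal policy optimization (PPO) method to traffic demand prediction. Our simulation results show that the proposed method is effective, achieving a lower SLA violation rate and lower network operation cost than state-of-the-art MOIPO solvers.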
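
For readers unfamiliar with PPO, its core is the clipped surrogate objective, which limits how far a single update can move the policy. The sketch below shows the standard per-sample term; it is general background on PPO rather than the paper's implementation, and the function name and the default `epsilon` are assumptions.

```python
def ppo_clipped_term(ratio, advantage, epsilon=0.2):
    """Per-sample PPO surrogate: min(r * A, clip(r, 1 - eps, 1 + eps) * A).

    `ratio` is the new policy's probability of the action divided by the
    old policy's probability of the same action.
    """
    clipped_ratio = max(1.0 - epsilon, min(ratio, 1.0 + epsilon))
    return min(ratio * advantage, clipped_ratio * advantage)


# A large ratio with a positive advantage is capped at (1 + epsilon) * A.
capped_gain = ppo_clipped_term(ratio=1.5, advantage=2.0)
# A small ratio with a negative advantage is held at (1 - epsilon) * A.
capped_loss = ppo_clipped_term(ratio=0.5, advantage=-1.0)
```

In training, this term is averaged over a batch and maximized with respect to the policy parameters.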

Deep Learning-Driven Edge Video Analytics: A Survey

Renjie Xu , Saiedeh Razavi , Rong Zheng

Category: Computer Vision | Machine Learning

2022-11-28

Video, as a key driver in the global explosion of digital information, can create tremendous benefits for human society. Governments and enterprises are deploying innumerable cameras for a variety of applications, e.g., law enforcement, emergency management, traffic control, and security surveillance, all facilitated by video analytics (VA). This trend is spurred by the rapid advancement of deep learning (DL), which enables more precise models for object classification, detection, and tracking. Meanwhile, with the proliferation of Internet-connected devices, massive amounts of data are generated daily, overwhelming the cloud. Edge computing, an emerging paradigm that moves workloads and services from the network core to the network edge, has been widely recognized as a promising solution. The resulting new intersection, edge video analytics (EVA), begins to attract widespread attention. Nevertheless, only a few loosely-related surveys exist on this topic. A dedicated venue for collecting and summarizing the latest advances of EVA is highly desired by the community. Besides, the basic concepts of EVA (e.g., definition, architectures, etc.) are ambiguous and neglected by these surveys due to the rapid development of this domain. A thorough clarification is needed to facilitate a consensus on these concepts. To fill in these gaps, we conduct a comprehensive survey of the recent efforts on EVA. In this paper, we first review the fundamentals of edge computing, followed by an overview of VA. The EVA system and its enabling techniques are discussed next. In addition, we introduce prevalent frameworks and datasets to aid future researchers in the development of EVA systems. Finally, we discuss existing challenges and foresee future research directions. We believe this survey will help readers comprehend the relationship between VA and edge computing, and spark new ideas on EVA.

Accuracy-Guaranteed Collaborative DNN Inference in Industrial IoT via Deep Reinforcement Learning

Wen Wu , Peng Yang , Weiting Zhang , Conghao Zhou , Xuemin Shen

Category: Artificial Intelligence | Machine Learning

2022-12-31

Collaboration among industrial Internet of Things (IoT) devices and edge networks is essential to support computation-intensive deep neural network (DNN) inference services, which require low delay and high accuracy. Sampling rate adaption, which dynamically configures the sampling rates of industrial IoT devices according to network conditions, is the key to minimizing the service delay. In this paper, we investigate the collaborative DNN inference problem in industrial IoT networks. To capture the channel variation and task arrival randomness, we formulate the problem as a constrained Markov decision process (CMDP). Specifically, sampling rate adaption, inference task offloading, and edge computing resource allocation are jointly considered to minimize the average service delay while guaranteeing the long-term accuracy requirements of different inference services. Since the CMDP cannot be directly solved by general reinforcement learning (RL) algorithms due to the intractable long-term constraints, we first transform the CMDP into an MDP by leveraging the Lyapunov optimization technique. Then, a deep RL-based algorithm is proposed to solve the MDP. To expedite the training process, an optimization subroutine is embedded in the proposed algorithm to directly obtain the optimal edge computing resource allocation. Extensive simulation results are provided to demonstrate that the proposed RL-based algorithm can significantly reduce the average service delay while preserving long-term inference accuracy with a high probability.
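
The Lyapunov step mentioned above is commonly realized with a virtual queue that accumulates shortfalls against the long-term constraint, so that the constraint becomes a per-step penalty an ordinary RL agent can optimize. The sketch below shows that generic drift-plus-penalty pattern; the function names, the trade-off weight `v`, and the exact reward form are assumptions rather than the paper's formulation.

```python
def update_virtual_queue(queue, accuracy, target):
    """Grow the queue when accuracy misses the target, shrink it otherwise."""
    return max(queue + target - accuracy, 0.0)


def drift_plus_penalty_reward(delay, queue, accuracy, target, v=1.0):
    """Per-step reward: penalize delay (weight v) and queue-weighted shortfall."""
    return -(v * delay + queue * (target - accuracy))


# Two time steps against an assumed long-term accuracy target of 0.9.
backlog = 0.0
backlog = update_virtual_queue(backlog, accuracy=0.80, target=0.9)  # shortfall
backlog_after_miss = backlog
backlog = update_virtual_queue(backlog, accuracy=0.95, target=0.9)  # surplus
step_reward = drift_plus_penalty_reward(delay=2.0, queue=0.5, accuracy=0.8, target=0.9)
```

A larger backlog makes accuracy shortfalls more costly in later steps, which pushes the agent back toward the long-term requirement without an explicit constraint in the MDP.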
