Intelligent Paper Notes

Scheduling Out-of-Coverage Vehicular Communications Using Reinforcement Learning

Taylan Şahin , Ramin Khalili , Mate Boban , Adam Wolisz

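Category: Artificial Intelligence

2022-07-13

The performance of vehicle-to-vehicle (V2V) communications depends heavily on the scheduling approach used. Centralized network schedulers offer high V2V communication reliability, but their operation is conventionally restricted to areas with full cellular network coverage. In out-of-cellular-coverage areas, by contrast, comparatively inefficient distributed radio resource management is used. To bring the benefits of the centralized approach to V2V communications on roads lacking cellular coverage, we propose VRLS (Vehicular Reinforcement Learning Scheduler), a centralized scheduler that proactively assigns resources for out-of-coverage V2V communications before vehicles leave the cellular network coverage. By training in simulated vehicular environments, VRLS can learn a scheduling policy that adapts to environmental changes, eliminating the need for targeted (re-)training in complex real-life environments. We evaluate the performance of VRLS under varying mobility, network load, wireless channel, and resource configurations. In zones without cellular network coverage, VRLS outperforms the state-of-the-art distributed scheduling algorithm, halving the packet error rate under high load and achieving near-maximum reliability in low-load scenarios.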

Translated by Google Translate

Applications of Multi-Agent Reinforcement Learning in Future Internet: A Comprehensive Survey

Tianxu Li , Kun Zhu , Nguyen Cong Luong , Dusit Niyato , Qihui Wu , Yang Zhang , Bing Chen

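Category: Artificial Intelligence | Machine Learning

2021-10-26

The future Internet involves several emerging technologies, such as 5G and beyond-5G networks, vehicular networks, unmanned aerial vehicle (UAV) networks, and the Internet of Things (IoT). Moreover, the future Internet is becoming heterogeneous and decentralized, with a large number of network entities involved. Each entity may need to make local decisions to improve network performance in dynamic and uncertain network environments. Standard learning algorithms such as single-agent reinforcement learning (RL) or deep reinforcement learning (DRL) have recently been used to let each network entity, acting as an agent, adaptively learn an optimal decision-making policy by interacting with an unknown environment. However, such algorithms fail to model cooperation or competition among network entities; they simply treat other entities as part of the environment, which may cause non-stationarity issues. Multi-agent reinforcement learning (MARL) allows each network entity to learn its optimal policy by observing not only the environment but also the policies of other entities. As a result, MARL can significantly improve the learning efficiency of network entities, and it has recently been used to solve various problems in emerging networks. In this paper, we therefore review the applications of MARL in emerging networks. In particular, we provide a tutorial on MARL and a comprehensive survey of its applications in the next-generation Internet. We first introduce single-agent RL and MARL. We then review a number of applications of MARL to emerging problems in the future Internet. These problems include network access, transmit power control, computation offloading, content caching, packet routing, trajectory design for UAV-aided networks, and network security.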

Vehicular Cooperative Perception Through Action Branching and Federated Reinforcement Learning

Mohamed K. Abdel-Aziz , Cristina Perfecto , Sumudu Samarakoon , Mehdi Bennis , Walid Saad

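Category: Machine Learning

2020-12-07

Cooperative perception plays a vital role in extending a vehicle's sensing range beyond its line of sight. However, exchanging raw sensory data under limited communication resources is infeasible. To enable efficient cooperative perception, vehicles need to address the following fundamental question: what sensory data needs to be shared, at which resolution, and with which vehicles? To answer this question, this paper proposes a novel framework that allows reinforcement learning (RL)-based vehicular association, resource block (RB) allocation, and content selection of cooperative perception messages (CPMs) by utilizing a quadtree-based point cloud compression mechanism. Furthermore, a federated RL approach is introduced to speed up the training process across vehicles. Simulation results show that the RL agents are able to efficiently learn vehicular association, RB allocation, and message content selection while maximizing vehicles' satisfaction in terms of the received sensory information. The results also show that federated RL improves the training process: better policies can be achieved within the same amount of time than with the non-federated approach.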

Device Selection for the Coexistence of URLLC and Distributed Learning Services

Milad Ganjalizadeh , Hossein Shokri Ghadikolaei , Deniz Gündüz , Marina Petrova

Category: Machine Learning

2022-12-22

Recent advances in distributed artificial intelligence (AI) have led to tremendous breakthroughs in various communication services, from fault-tolerant factory automation to smart cities. When distributed learning is run over a set of wirelessly connected devices, random channel fluctuations and the incumbent services running on the same network impact the performance of both distributed learning and the coexisting service. In this paper, we investigate a mixed service scenario where distributed AI workflow and ultra-reliable low latency communication (URLLC) services run concurrently over a network. Consequently, we propose a risk sensitivity-based formulation for device selection to minimize the AI training delays during its convergence period while ensuring that the operational requirements of the URLLC service are met. To address this challenging coexistence problem, we transform it into a deep reinforcement learning problem and address it via a framework based on soft actor-critic algorithm. We evaluate our solution with a realistic and 3GPP-compliant simulator for factory automation use cases. Our simulation results confirm that our solution can significantly decrease the training delay of the distributed AI service while keeping the URLLC availability above its required threshold and close to the scenario where URLLC solely consumes all network resources.

Multi-Agent Reinforcement Learning for Channel Assignment and Power Allocation in Platoon-Based C-V2X Systems

Hung V. Vu , Mohammad Farzanullah , Zheyu Liu , Duy H. N. Nguyen , Robert Morawski , Tho Le-Ngoc

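Category: Machine Learning

2020-11-09

We consider the problem of joint channel assignment and power allocation in underlaid cellular vehicle-to-everything (C-V2X) systems, where multiple vehicle-to-network (V2N) uplinks share time-frequency resources with multiple vehicle-to-vehicle (V2V) platoons that enable groups of connected and autonomous vehicles to travel closely together. Because of the high user mobility inherent in vehicular environments, traditional centralized optimization approaches that rely on global channel information may not be viable in C-V2X systems with a large number of users. Using a multi-agent reinforcement learning (RL) approach, we propose a distributed resource allocation (RA) algorithm to overcome this challenge. Specifically, we model the RA problem as a multi-agent system. Based solely on local channel information, each platoon leader acts as an agent, interacts with the other agents, and accordingly selects the optimal combination of sub-band and power level for transmitting its signals. To this end, we use the double Q-learning algorithm to jointly train the agents, with the objectives of simultaneously maximizing the sum rate of the V2N links and satisfying the packet delivery probability of each V2V link within a desired latency limit. Simulation results show that our proposed RL-based algorithm performs close to the well-known exhaustive search algorithm.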
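The double Q-learning step named in this abstract can be illustrated with a tabular update. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function name `double_q_update`, the dictionary-based tables, the string state labels, and the (sub-band, power level) action tuples are all hypothetical and chosen only for illustration.

```python
import random


def double_q_update(q_a, q_b, state, action, reward, next_state, actions,
                    alpha=0.1, gamma=0.9, rng=random):
    """Apply one tabular double Q-learning update.

    q_a and q_b are dicts keyed by (state, action); missing entries count as 0.
    All names here are illustrative assumptions, not taken from the paper.
    """
    # Pick at random which table is updated; the other one evaluates the
    # greedy next action, which reduces the overestimation bias of plain
    # Q-learning.
    if rng.random() < 0.5:
        q_sel, q_eval = q_a, q_b
    else:
        q_sel, q_eval = q_b, q_a

    # Greedy next action according to the table being updated.
    best_next = max(actions, key=lambda a: q_sel.get((next_state, a), 0.0))

    # Value of that action according to the other table.
    target = reward + gamma * q_eval.get((next_state, best_next), 0.0)

    old = q_sel.get((state, action), 0.0)
    q_sel[(state, action)] = old + alpha * (target - old)


# Hypothetical action set: (sub_band_index, power_level_index) pairs.
ACTIONS = [(band, power) for band in range(2) for power in range(2)]
```

In a multi-agent setting such as the one described, each agent would hold its own pair of tables (or networks) and call an update like this on its local observations only.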

Distributed Machine Learning for UAV Swarms: Computing, Sensing, and Semantics

Yahao Ding , Zhaohui Yang , Quoc-Viet Pham , Zhaoyang Zhang , Mohammad Shikh-Bahaei

Category: Machine Learning | Artificial Intelligence

2023-01-03

Unmanned aerial vehicle (UAV) swarms are considered a promising technique for next-generation communication networks due to their flexibility, mobility, low cost, and ability to provide services collaboratively and autonomously. Distributed learning (DL) enables UAV swarms to intelligently provide communication services, multi-directional remote surveillance, and target tracking. In this survey, we first introduce several popular DL algorithms, such as federated learning (FL), multi-agent reinforcement learning (MARL), distributed inference, and split learning, and present a comprehensive overview of their applications for UAV swarms, such as trajectory design, power control, wireless resource allocation, user assignment, perception, and satellite communications. Then, we present several state-of-the-art applications of UAV swarms in wireless communication systems, such as reconfigurable intelligent surfaces (RIS), virtual reality (VR), and semantic communications, and discuss the problems and challenges that DL-enabled UAV swarms can solve in these applications. Finally, we describe open problems of using DL in UAV swarms and future research directions for DL-enabled UAV swarms. In summary, this survey provides a comprehensive overview of various DL applications for UAV swarms in a wide range of scenarios.

Learning Emergent Random Access Protocol for LEO Satellite Networks

Ju-Hyung Lee , Hyowoon Seo , Jihong Park , Mehdi Bennis , Young-Chai Ko

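Category: Machine Learning

2021-12-03

A mega-constellation of low-altitude Earth orbit (LEO) satellites (SATs) is envisaged to provide a global-coverage SAT network in beyond-fifth-generation (5G) cellular systems. LEO SAT networks exhibit extremely long link distances to many users under a time-varying SAT network topology. This makes existing multiple access protocols, such as the random access channel (RACH) based cellular protocol designed for fixed terrestrial network topologies, ill-suited. To overcome this issue, in this paper we propose a novel grant-free random access solution for LEO SAT networks, dubbed the emergent random access channel protocol (eRACH). In stark contrast to existing model-based and standardized protocols, eRACH is a model-free approach that emerges through interaction with the non-stationary network environment, using multi-agent deep reinforcement learning (MADRL). Furthermore, by exploiting known SAT orbiting patterns, eRACH requires neither central coordination nor additional communication across users, while training convergence is stabilized by the regular orbiting patterns. Compared to RACH, we show through various simulations that the proposed eRACH yields 54.6% higher average network throughput with roughly two times lower average access delay, while achieving a Jain's fairness index of 0.989.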
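For readers unfamiliar with the fairness metric quoted in this abstract, Jain's fairness index over per-user allocations x_1, ..., x_n is (sum of x_i)^2 / (n * sum of x_i^2); it ranges from 1/n (one user gets everything) to 1 (all users get equal shares). The sketch below is illustrative only: the function name and the choice to return 1.0 for an all-zero input are assumptions, and the sample values are invented, not data from the paper.

```python
def jain_fairness_index(values):
    """Return Jain's fairness index (sum x)^2 / (n * sum x^2).

    `values` holds non-negative per-user quantities, e.g. throughputs.
    """
    xs = list(values)
    if not xs:
        raise ValueError("at least one value is required")
    total = sum(xs)
    sum_of_squares = sum(x * x for x in xs)
    if sum_of_squares == 0:
        # Assumption: treat the all-zero allocation as perfectly equal.
        return 1.0
    return (total * total) / (len(xs) * sum_of_squares)


# Invented example allocations, for illustration only.
equal_shares = jain_fairness_index([2.0, 2.0, 2.0, 2.0])
single_user = jain_fairness_index([5.0, 0.0, 0.0, 0.0])
```

A value of 0.989, as reported above, therefore indicates that throughput is spread almost evenly across users.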

UAV-Assisted Space-Air-Ground Integrated Networks: A Technical Review of Recent Learning Algorithms

Atefeh H. Arani , Peng Hu , Yeying Zhu

Category: Machine Learning

2022-11-27

Recent technological advancements in space, air and ground components have made possible a new network paradigm called "space-air-ground integrated network" (SAGIN). Unmanned aerial vehicles (UAVs) play a key role in SAGINs. However, due to UAVs' high dynamics and complexity, real-world deployment is a major barrier to realizing such SAGINs. Compared to the space and terrestrial components, UAVs are expected to meet performance requirements with high flexibility and dynamics using limited resources. Therefore, employing UAVs in various usage scenarios requires well-designed planning in algorithmic approaches. In this paper, we provide a comprehensive review of recent learning-based algorithmic approaches. We consider possible reward functions and discuss the state-of-the-art algorithms for optimizing the reward functions, including Q-learning, deep Q-learning, multi-armed bandit (MAB), particle swarm optimization (PSO) and satisfaction-based learning algorithms. Unlike other survey papers, we focus on the methodological perspective of the optimization problem, which is applicable to various UAV-assisted missions on a SAGIN using these algorithms. We simulate users and environments according to real-world scenarios and compare the learning-based and PSO-based methods in terms of throughput, load, fairness, computation time, etc. We also implement and evaluate the 2-dimensional (2D) and 3-dimensional (3D) variations of these algorithms to reflect different deployment cases. Our simulation suggests that the 3D satisfaction-based learning algorithm outperforms the other approaches for various metrics in most cases. We discuss some open challenges at the end, and our findings aim to provide design guidelines for algorithm selection while optimizing the deployment of UAV-assisted SAGINs.

Decentralized Federated Reinforcement Learning for User-Centric Dynamic TFDD Control

Ziyan Yin , Zhe Wang , Jun Li , Ming Ding , Wen Chen , Shi Jin

Category: Machine Learning

2022-11-04

The explosive growth of dynamic and heterogeneous data traffic brings great challenges for 5G and beyond mobile networks. To enhance the network capacity and reliability, we propose a learning-based dynamic time-frequency division duplexing (D-TFDD) scheme that adaptively allocates the uplink and downlink time-frequency resources of base stations (BSs) to meet the asymmetric and heterogeneous traffic demands while alleviating the inter-cell interference. We formulate the problem as a decentralized partially observable Markov decision process (Dec-POMDP) that maximizes the long-term expected sum rate under the users' packet dropping ratio constraints. In order to jointly optimize the global resources in a decentralized manner, we propose a federated reinforcement learning (RL) algorithm named federated Wolpertinger deep deterministic policy gradient (FWDDPG) algorithm. The BSs decide their local time-frequency configurations through RL algorithms and achieve global training via exchanging local RL models with their neighbors under a decentralized federated learning framework. Specifically, to deal with the large-scale discrete action space of each BS, we adopt a DDPG-based algorithm to generate actions in a continuous space, and then utilize Wolpertinger policy to reduce the mapping errors from continuous action space back to discrete action space. Simulation results demonstrate the superiority of our proposed algorithm to benchmark algorithms with respect to system sum rate.

Wireless for Machine Learning

Henrik Hellström , José Mairton B. da Silva Jr , Mohammad Mohammadi Amiri , Mingzhe Chen , Viktoria Fodor , H. Vincent Poor , Carlo Fischione

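Category: Machine Learning

2020-08-31

As data generation increasingly takes place on devices without a wired connection, machine learning (ML) related traffic will become ubiquitous in wireless networks. Many studies have shown that traditional wireless protocols are highly inefficient or unsustainable for supporting ML, which creates the need for new wireless communication methods. In this survey, we give an exhaustive review of the state-of-the-art wireless methods that are specifically designed to support ML services over distributed datasets. Currently, there are two clear themes in the literature: analog over-the-air computation and digital radio resource management optimized for ML. This survey gives a comprehensive introduction to these methods, reviews the most important works, highlights open problems, and discusses application scenarios.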

A Comprehensive Survey on the Convergence of Vehicular Social Networks and Fog Computing

Farimasadat Miri , Richard Pazzi

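Category: Artificial Intelligence

2021-11-30

In recent years, the number of IoT devices has been growing fast, which makes managing, storing, analyzing, and making decisions about raw data from different IoT devices a challenging task, especially for delay-sensitive applications. In a vehicular network (VANET) environment, the dynamic nature of vehicles makes the current open research issues even more challenging, because frequent topology changes can lead to disconnections between vehicles. To this end, a number of research works have been proposed in the context of cloud and fog computing over the 5G infrastructure. On the other hand, a variety of research proposals aim to extend the connection time between vehicles. Vehicular Social Networks (VSNs) have been defined to reduce the burden of connection time between vehicles. This survey first provides the necessary background information and definitions about fog, cloud, and related paradigms such as 5G and SDN. It then introduces the reader to Vehicular Social Networks, the different metrics, and the main differences between VSNs and online social networks. Finally, the survey investigates related works in the context of VANETs that present different architectures for addressing the various issues in fog computing. Moreover, it categorizes the different approaches, discusses the required metrics in the context of fog and cloud, and compares them to Vehicular Social Networks. A comparison of the relevant related works is discussed along with new research challenges and trends in the domain of VSNs and fog computing.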

Programmable and Customized Intelligence for Traffic Steering in 5G Networks Using Open RAN Architectures

Andrea Lacava , Michele Polese , Rajarajan Sivaraj , Rahul Soundrarajan , Bhawani Shanker Bhati , Tarunjeet Singh , Tommaso Zugno , Francesca Cuomo , Tommaso Melodia

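Category: Artificial Intelligence

2022-09-28

5G and beyond mobile networks will support heterogeneous use cases at an unprecedented scale, thus demanding automated control and optimization of network functionalities customized to the needs of individual users. Such fine-grained control of the Radio Access Network (RAN) is not possible with the current cellular architecture. To fill this gap, the Open RAN paradigm and its specifications introduce an open architecture with abstractions that enable closed-loop control and provide data-driven, intelligent optimization of the RAN at the user level. This is achieved through custom RAN control applications (i.e., xApps) deployed on the near-real-time RAN Intelligent Controller (near-RT RIC) at the edge of the network. Despite these premises, as of today the research community lacks a sandbox for building data-driven xApps and creating large-scale datasets for effective AI training. In this paper, we address this by introducing ns-O-RAN, a software framework that integrates a real-world, production-grade near-RT RIC with a 3GPP-based simulated environment on ns-3, enabling the development of xApps as well as automated large-scale data collection and testing of deep reinforcement learning-driven control policies for user-level optimization. In addition, we propose the first user-specific O-RAN traffic steering (TS) intelligent handover framework. It uses Random Ensemble Mixture, combined with a state-of-the-art convolutional neural network architecture, to optimally assign a serving base station to each user in the network. Our TS xApp, trained on more than 40 million data points collected by ns-O-RAN, runs on the near-RT RIC and controls its base stations. We evaluate the performance in a large-scale deployment, showing that xApp-based handover improves throughput and spectral efficiency by an average of 50% over traditional handover heuristics, with less mobility overhead.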

When Machine Learning Meets Spectrum Sharing Security: Methodologies and Challenges

Qun Wang , Haijian Sun , Rose Qingyang Hu , Arupjyoti Bhuyan

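Category: Machine Learning

2022-01-12

The exponential growth of Internet-connected systems has generated numerous challenges, such as spectrum shortage, which require efficient spectrum sharing (SS) solutions. Complicated and dynamic SS systems can be exposed to various potential security and privacy issues, requiring protection mechanisms to be adaptive, reliable, and scalable. Machine learning (ML) based methods have frequently been proposed to address these issues. In this article, we provide a comprehensive survey of recent ML-based SS methods, the most critical security issues, and the corresponding defense mechanisms. In particular, we elaborate on state-of-the-art methodologies for improving the performance of SS communication systems, including ML-based database-assisted SS networks, ML-based LTE-U networks, ML-based ambient backscatter networks, and other ML-based SS solutions. We also present security issues from the physical layer and the corresponding defense strategies based on ML algorithms, including primary user emulation (PUE) attacks, spectrum sensing data falsification (SSDF) attacks, jamming attacks, eavesdropping attacks, and privacy issues. Finally, an extensive discussion of open challenges for ML-based SS is given. This comprehensive review is intended to provide a foundation for, and to facilitate, future studies that explore the potential of emerging ML for coping with increasingly complex SS and its security problems.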

Transferable Deep Reinforcement Learning Framework for Autonomous Vehicles with Joint Radar-Data Communications

Nguyen Quang Hieu , Dinh Thai Hoang , Dusit Niyato , Ping Wang , Dong In Kim , Chau Yuen

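Category: Machine Learning | Robotics

2021-05-28

Autonomous vehicles (AVs) must operate safely and efficiently in dynamic environments. To this end, AVs equipped with joint radar-communications (JRC) functions can enhance driving safety by using both radar detection and data communication. However, optimizing the performance of an AV system with two different functions under the uncertainty and dynamics of the surrounding environment is very challenging. In this work, we first propose an intelligent optimization framework based on the Markov decision process (MDP) to help the AV make optimal decisions when selecting JRC operation functions under the dynamics and uncertainty of the surrounding environment. We then develop an effective learning algorithm that leverages recent advances in deep reinforcement learning to find the optimal policy for the AV without requiring any prior information about the surrounding environment. Furthermore, to make the proposed framework more scalable, we develop a transfer learning (TL) mechanism that enables the AV to leverage valuable experience to accelerate the training process. Extensive simulations show that, compared with other conventional deep reinforcement learning approaches, the proposed transferable deep reinforcement learning framework reduces the AV's obstacle miss-detection probability by up to 67%.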

Beyond 5G Networks: Integration of Communication, Computing, Caching, and Control

Musbahu Mohammed Adam , Liqiang Zhao , Kezhi Wang , Zhu Han

Category: Machine Learning

2022-12-26

In recent years, the exponential proliferation of smart devices with their intelligent applications poses severe challenges on conventional cellular networks. Such challenges can be potentially overcome by integrating communication, computing, caching, and control (i4C) technologies. In this survey, we first give a snapshot of different aspects of the i4C, comprising background, motivation, leading technological enablers, potential applications, and use cases. Next, we describe different models of communication, computing, caching, and control (4C) to lay the foundation of the integration approach. We review current state-of-the-art research efforts related to the i4C, focusing on recent trends of both conventional and artificial intelligence (AI)-based integration approaches. We also highlight the need for intelligence in resources integration. Then, we discuss integration of sensing and communication (ISAC) and classify the integration approaches into various classes. Finally, we propose open challenges and present future research directions for beyond 5G networks, such as 6G.

Holistic Network Virtualization and Pervasive Network Intelligence for 6G

Xuemin Shen , Jie Gao , Wen Wu , Mushu Li , Conghao Zhou , Weihua Zhuang

Category: Artificial Intelligence

2023-01-02

In this tutorial paper, we look into the evolution and prospect of network architecture and propose a novel conceptual architecture for the 6th generation (6G) networks. The proposed architecture has two key elements, i.e., holistic network virtualization and pervasive artificial intelligence (AI). The holistic network virtualization consists of network slicing and digital twin, from the aspects of service provision and service demand, respectively, to incorporate service-centric and user-centric networking. The pervasive network intelligence integrates AI into future networks from the perspectives of networking for AI and AI for networking, respectively. Building on holistic network virtualization and pervasive network intelligence, the proposed architecture can facilitate three types of interplay, i.e., the interplay between digital twin and network slicing paradigms, between model-driven and data-driven methods for network management, and between virtualization and AI, to maximize the flexibility, scalability, adaptivity, and intelligence for 6G networks. We also identify challenges and open issues related to the proposed architecture. By providing our vision, we aim to inspire further discussions and developments on the potential architecture of 6G.

Learning, Computing, and Trustworthiness in Intelligent IoT Environments: Performance-Energy Tradeoffs

Beatriz Soret , Lam D. Nguyen , Jan Seeger , Arne Bröring , Chaouki Ben Issaid , Sumudu Samarakoon , Anis El Gabli , Vivek Kulkarni , Mehdi Bennis , Petar Popovski

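Category: Artificial Intelligence

2021-10-04

An Intelligent IoT Environment (iIoTe) comprises heterogeneous devices that can collaboratively execute semi-autonomous IoT applications; examples include highly automated manufacturing cells or autonomously interacting harvesting machines. Energy efficiency is key in such edge environments, since they are often based on an infrastructure of wireless and battery-powered devices, e.g., e-tractors, drones, automated guided vehicles (AGVs), and robots. The total energy consumption draws contributions from multiple technologies that enable edge computing and communication, distributed learning, as well as distributed ledgers and smart contracts. This paper provides a state-of-the-art overview of these technologies and illustrates their functionality and performance, with special attention to the tradeoffs among resources, latency, privacy, and energy consumption. Finally, the paper provides a vision for integrating these enabling technologies in energy-efficient iIoTe and a roadmap for addressing the open research challenges.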

Intelligent Resource Allocation in Dense LoRa Networks using Deep Reinforcement Learning

Inaam Ilahi , Muhammad Usama , Muhammad Omer Farooq , Muhammad Umar Janjua , Junaid Qadir

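Category: Artificial Intelligence

2020-12-22

The anticipated increase in the number of IoT devices in the coming years motivates the development of efficient algorithms that can help manage them effectively while keeping power consumption low. In this paper, we propose an intelligent multi-channel resource allocation algorithm for dense LoRa networks, termed LoRaDRL, and provide a detailed performance evaluation. Our results demonstrate that the proposed algorithm not only significantly improves LoRaWAN's packet delivery ratio (PDR) but is also able to support mobile end-devices (EDs) while ensuring lower power consumption, hence increasing both the lifetime and the capacity of the network. Most previous works focus on proposing different MAC protocols for improving network capacity, e.g., LoRaWAN, delay before transmit, etc. We show that by using LoRaDRL we can achieve the same efficiency with ALOHA as LoRaSim and LoRa-MAB, while moving the complexity from the EDs to the gateway, thus making the EDs simpler and cheaper. Furthermore, we test the performance of LoRaDRL under large-scale frequency jamming attacks and show its adaptiveness to changes in the environment. We show that LoRaDRL improves on the performance of state-of-the-art techniques, yielding an improvement of more than 500% in PDR compared to learning-based techniques.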

Federated Meta-Learning for Traffic Steering in O-RAN

Hakan Erdol , Xiaoyang Wang , Peizheng Li , Jonathan D. Thomas , Robert Piechocki , George Oikonomou , Rui Inacio , Abdelrahim Ahmad , Keith Briggs , Shipra Kapoor

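Category: Machine Learning

2022-09-13

Compared to LTE networks, the vision of 5G lies in providing high data rates, low latency (to enable near-real-time applications), significantly increased base station capacity, and near-perfect quality of service (QoS) for users. To provide such services, 5G systems will support various combinations of access technologies such as LTE, NR, NR-U, and Wi-Fi. Each radio access technology (RAT) provides a different type of access, and these should be allocated and managed optimally among the users. Besides resource management, 5G systems will also support a dual connectivity service. Network orchestration therefore becomes a more difficult problem for system managers than it was with legacy access technologies. In this paper, we propose a RAT allocation algorithm based on federated meta-learning (FML), which enables RAN intelligent controllers (RICs) to adapt more quickly to dynamically changing environments. We design a simulation environment that contains LTE and 5G NR service technologies. In the simulation, our objective is to fulfil UE demands within the transmission deadline in order to provide higher QoS values. We compare the proposed algorithm with a single RL agent, the Reptile algorithm, and a rule-based heuristic method. Simulation results show that the proposed FML method achieves higher caching rates in the first deployment round, by 21% and 12% respectively. Moreover, among the compared methods, the proposed approach adapts most quickly to new tasks and environments.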

Asynchronous Hybrid Reinforcement Learning for Latency and Reliability Optimization in the Metaverse over Wireless Communications

Wenhan Yu , Terence Jie Chua , Jun Zhao

Category: Machine Learning

2022-12-30

Technology advancements in wireless communications and high-performance Extended Reality (XR) have empowered the developments of the Metaverse. The demand for Metaverse applications and hence, real-time digital twinning of real-world scenes is increasing. Nevertheless, the replication of 2D physical world images into 3D virtual world scenes is computationally intensive and requires computation offloading. The disparity in transmitted scene dimension (2D as opposed to 3D) leads to asymmetric data sizes in uplink (UL) and downlink (DL). To ensure the reliability and low latency of the system, we consider an asynchronous joint UL-DL scenario where in the UL stage, the smaller data size of the physical world scenes captured by multiple extended reality users (XUs) will be uploaded to the Metaverse Console (MC) to be construed and rendered. In the DL stage, the larger-size 3D virtual world scenes need to be transmitted back to the XUs. The decisions pertaining to computation offloading and channel assignment are optimized in the UL stage, and the MC will optimize power allocation for users assigned with a channel in the UL transmission stage. Some problems arise therefrom: (i) interactive multi-process chain, specifically Asynchronous Markov Decision Process (AMDP), (ii) joint optimization in multiple processes, and (iii) high-dimensional objective functions, or hybrid reward scenarios. To ensure the reliability and low latency of the system, we design a novel multi-agent reinforcement learning algorithm structure, namely Asynchronous Actors Hybrid Critic (AAHC). Extensive experiments demonstrate that compared to proposed baselines, AAHC obtains better solutions with preferable training time.
