Deep Reinforcement Learning-based Energy Efficiency Optimization for RIS-aided Integrated Satellite-Aerial-Terrestrial Relay Networks
Citation of Original Publication
Wu, Min, Kefeng Guo, Xingwang Li, Zhi Lin, Yongpeng Wu, Theodoros A. Tsiftsis, and Houbing Song. "Deep Reinforcement Learning-Based Energy Efficiency Optimization for RIS-Aided Integrated Satellite-Aerial-Terrestrial Relay Networks." IEEE Transactions on Communications, 2024, 1-1. https://doi.org/10.1109/TCOMM.2024.3370618.
Rights
© 2024 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Subjects
Array signal processing
Autonomous aerial vehicles
deep reinforcement learning (DRL)
Heuristic algorithms
Integrated satellite-aerial-terrestrial relay networks (ISATRNs)
mixed FSO/RF mode
non-orthogonal multiple access (NOMA)
Optimization
reconfigurable intelligent surface (RIS)
Relay networks
Satellites
Abstract
Integrated satellite-aerial-terrestrial relay networks (ISATRNs) are considered a promising architecture for next-generation networks, in which the high altitude platform (HAP) plays a pivotal role. In this paper, we introduce a novel model for HAP-based ISATRNs with a mixed FSO/RF transmission mode, which incorporates unmanned aerial vehicles (UAVs) equipped with reconfigurable intelligent surfaces (RISs) to dynamically reconfigure the propagation environment and meet the massive access requirements of ground users. Our aim is to maximize the system ergodic rate by jointly optimizing the UAV trajectory, the RIS phase shifts, and the active transmit beamforming matrix under a UAV energy consumption constraint. To solve this intractable problem, we propose a deep reinforcement learning (DRL)-based energy-efficient optimization scheme built on an improved long short-term memory (LSTM)-double deep Q-network (DDQN) framework. Numerical results demonstrate the superiority of the proposed algorithm over the traditional DDQN algorithm in terms of single-step average exploration reward and other evaluation metrics.
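The abstract's core tool is the double deep Q-network (DDQN). As a rough orientation (not the paper's actual implementation, which additionally feeds observation histories through an LSTM before the Q-heads), the defining DDQN step is computing the bootstrap target with the *online* network selecting the next action and the *target* network evaluating it, which reduces the overestimation bias of plain DQN. A minimal sketch of that target rule, with hypothetical Q-value lists standing in for network outputs:

```python
def ddqn_target(reward, next_q_online, next_q_target, gamma=0.99, done=False):
    """Double DQN bootstrap target for one transition.

    next_q_online / next_q_target: per-action Q-values at the next state,
    produced by the online and target networks respectively (here just
    plain lists, standing in for network forward passes).
    """
    # Action selection by the online network...
    best = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    # ...but evaluation of that action by the target network.
    return reward + (0.0 if done else gamma * next_q_target[best])


# Example: online net prefers action 1, so the target net's value for
# action 1 (0.7) is bootstrapped: 1.0 + 0.9 * 0.7 = 1.63
y = ddqn_target(1.0, [1.0, 2.0], [0.5, 0.7], gamma=0.9)
```

In plain DQN the same (target) network would both select and evaluate the next action, i.e. `reward + gamma * max(next_q_target)`; decoupling the two roles is the whole difference.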
