Distributed Power Allocation for 6-GHz Unlicensed Spectrum Sharing via Multi-agent Deep Reinforcement Learning
- Resource Type
- Conference
- Authors
- Zhang, Xiang; Bhuyan, Arupjyoti; Kasera, Sneha Kumar; Ji, Mingyue
- Source
- 2023 IEEE International Conference on Industrial Technology (ICIT) Industrial Technology (ICIT), 2023 IEEE International Conference on. :1-6 Apr, 2023
- Subject
- Engineering Profession
Power, Energy and Industry Applications
Robotics and Control Systems
Signal Processing and Analysis
Radio frequency
Deep learning
Base stations
Power measurement
Reinforcement learning
Interference
Throughput
- Language
- ISSN
- 2643-2978
We consider the problem of spectrum sharing by multiple cellular operators. We propose a novel deep Reinforcement Learning (DRL)-based distributed power allocation scheme which utilizes the multi-agent Deep Deterministic Policy Gradient (MA-DDPG) algorithm. In particular, we model the base stations (BSs) that belong to the multiple operators sharing the same band, as DRL agents that simultaneously determine the transmit powers to their scheduled user equipment (UE) in a synchronized manner. The power decision of each BS is based on its own observation of the radio environment (RF) environment, which consists of interference measurements reported from the UEs it serves, and a limited amount of information obtained from other BSs. One advantage of the proposed scheme is that it addresses the single-agent non-stationarity problem of RL in the multi-agent scenario by incorporating the actions and observations of other BSs into each BS's own critic which helps it to gain a more accurate perception of the overall RF environment. A centralized-training-distributed-execution framework is used to train the policies where the critics are trained over the joint actions and observations of all BSs while the actor of each BS only takes the local observation as input in order to produce the transmit power. Simulation with the 6 GHz Unlicensed National Information Infrastructure (U-NII)-5 band shows that the proposed power allocation scheme can achieve better throughput performance than several state-of-the-art approaches.