Deep Reinforcement Learning for Scheduling Uplink IoT Traffic with Strict Deadlines
- Resource Type
- Conference
- Authors
- Robaglia, Benoit-Marie; Destounis, Apostolos; Coupechoux, Marceau; Tsilimantos, Dimitrios
- Source
- 2021 IEEE Global Communications Conference (GLOBECOM), pp. 1-6, Dec. 2021
- Subject
- Communication, Networking and Broadcast Technologies
Performance evaluation
Schedules
Recurrent neural networks
Optimal scheduling
Reinforcement learning
Traffic control
Scheduling
Multiple Access
Reinforcement Learning
Proximal Policy Optimization
POMDP
Internet of Things
Wireless sensor networks
scheduling
- Language
This paper considers the multiple access problem in which $N$ Internet of Things (IoT) devices share a common wireless medium towards a central Base Station (BS). We propose a Reinforcement Learning (RL) method in which the BS is the agent and the devices are part of the environment. A device is allowed to transmit only when the BS schedules it. Besides information packets, devices send additional messages, such as the delay or the number of packets discarded since their last transmission. This information is used to design the RL reward function and constitutes the next observation, which the agent uses to schedule the next device. Leveraging RL allows us to learn both the sporadic and heterogeneous traffic patterns of the IoT devices and an optimal scheduling policy that maximizes the channel throughput. We adapt the Proximal Policy Optimization (PPO) algorithm with a Recurrent Neural Network (RNN) to handle the partial observability of the problem and to exploit the temporal correlations of the users' traffic. We demonstrate the performance of our model through simulations with different numbers of heterogeneous devices with periodic traffic and individual latency constraints. We show that our RL algorithm outperforms traditional scheduling schemes and distributed medium access algorithms.
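The scheduling loop described in the abstract can be illustrated with a toy simulation. The following is a minimal sketch under stated assumptions, not the authors' environment or reward: the names `Device`, `UplinkEnv` and `run`, the one-packet-per-slot model, and the reward (one unit per delivered packet minus the number of discards the scheduled device reports) are all hypothetical choices made for illustration. The PPO/RNN agent is not shown; any function mapping the slot index to a device index can stand in for the policy.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Device:
    """Hypothetical IoT device with periodic traffic and a per-packet deadline."""
    period: int            # slots between packet arrivals
    deadline: int          # slots a packet may wait before it is discarded
    offset: int = 0        # slot of the first arrival
    queue: List[int] = field(default_factory=list)  # arrival slots of waiting packets
    discarded: int = 0     # packets dropped since this device last transmitted


class UplinkEnv:
    """Toy environment: the BS (agent) schedules exactly one device per slot."""

    def __init__(self, devices: List[Device]):
        self.devices = devices
        self.t = 0

    def step(self, action: int) -> Tuple[Tuple[int, Optional[int], int], float]:
        # 1. Periodic arrivals at the start of the slot.
        for d in self.devices:
            if self.t >= d.offset and (self.t - d.offset) % d.period == 0:
                d.queue.append(self.t)

        # 2. Only the scheduled device may transmit its oldest packet.
        dev = self.devices[action]
        delay: Optional[int] = None
        delivered = 0
        if dev.queue:
            delay = self.t - dev.queue.pop(0)
            delivered = 1

        # The device piggybacks its delay and discard count; the BS sees
        # nothing about unscheduled devices (partial observability).
        obs = (action, delay, dev.discarded)
        reward = float(delivered - dev.discarded)  # assumed reward shape
        dev.discarded = 0

        # 3. Packets whose deadline has passed by the next slot are discarded.
        for d in self.devices:
            alive = [a for a in d.queue if (self.t + 1 - a) < d.deadline]
            d.discarded += len(d.queue) - len(alive)
            d.queue = alive

        self.t += 1
        return obs, reward


def run(policy: Callable[[int, int], int], devices: List[Device], horizon: int) -> float:
    """Roll out policy(t, n_devices) -> device index and return the total reward."""
    env = UplinkEnv(devices)
    total = 0.0
    for t in range(horizon):
        _, r = env.step(policy(t, len(devices)))
        total += r
    return total
```

With two devices whose arrivals alternate, a round-robin policy such as `lambda t, n: t % n` delivers every packet in this toy model, whereas a policy that always schedules device 0 lets the other device's packets expire. A learned policy would have to infer such traffic patterns from the observations alone.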