A Multi-Agent Reinforcement Learning Approach for Safe and Efficient Behavior Planning of Connected Autonomous Vehicles
- Resource Type
- Periodical
- Authors
- Han, S.; Zhou, S.; Wang, J.; Pepin, L.; Ding, C.; Fu, J.; Miao, F.
- Source
- IEEE Transactions on Intelligent Transportation Systems, 25(5):3654-3670, May 2024
- Subject
- Transportation
Aerospace
Communication, Networking and Broadcast Technologies
Computing and Processing
Robotics and Control Systems
Signal Processing and Analysis
Safety
Behavioral sciences
Planning
Autonomous vehicles
Training
Reinforcement learning
Aerospace electronics
Autonomous vehicle
multi-agent reinforcement learning
convolutional neural network
control barrier function
- Language
- English
- ISSN
- 1524-9050 (print)
1558-0016 (electronic)
Recent advancements in wireless technology enable connected autonomous vehicles (CAVs) to gather information about their environment through vehicle-to-vehicle (V2V) communication. In this work, we design an information-sharing-based multi-agent reinforcement learning (MARL) framework for CAVs that exploits this extra information when making decisions, improving both traffic efficiency and safety. The safe actor-critic algorithm we propose introduces two new techniques: the truncated $\mathcal{Q}$-function and safe action mapping. The truncated $\mathcal{Q}$-function utilizes the shared information from neighboring CAVs so that the joint state and action spaces of the $\mathcal{Q}$-function do not grow with the size of a large-scale CAV system. We prove a bound on the approximation error between the truncated and global $\mathcal{Q}$-functions. The safe action mapping provides a provable safety guarantee for both training and execution based on control barrier functions. Using the CARLA simulator for experiments, we show that our approach improves the CAV system's efficiency in terms of average velocity and comfort under different CAV ratios and traffic densities. We also show that our approach avoids executing unsafe actions and always maintains a safe distance from other vehicles. We construct an obstacle-at-corner scenario to show that shared vision can help CAVs observe obstacles earlier and take action to avoid traffic jams. The experiment video is available at https://songyanghan.github.io/cavmarl/.
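The safe action mapping described above projects a learned policy's action onto a set certified by a control barrier function (CBF). The paper's actual formulation is not reproduced here; the following is a minimal illustrative sketch under assumed simplifications: a 1-D car-following model with Euler dynamics, a barrier $h = d - d_{\text{safe}}$ on the gap to the lead vehicle, and the discrete-time CBF condition $h_{t+1} \geq (1-\gamma)\,h_t$, which is linear in the commanded acceleration. All names and parameter values are hypothetical.

```python
def safe_action(a_rl, d, v, v_lead,
                dt=0.1, gamma=0.2, d_safe=5.0,
                a_lo=-5.0, a_hi=3.0):
    """Map an RL-proposed acceleration onto the CBF-safe set.

    Assumed dynamics (illustrative, not the paper's model):
        v_next = v + a * dt
        d_next = d + (v_lead - v_next) * dt
    Barrier: h(d) = d - d_safe.
    Safety condition: h(d_next) >= (1 - gamma) * h(d),
    which rearranges to a linear upper bound on a.
    """
    h = d - d_safe
    # Largest acceleration still satisfying the CBF condition:
    # d + (v_lead - v)*dt - a*dt^2 - d_safe >= (1 - gamma) * h
    a_max_safe = (d + (v_lead - v) * dt - d_safe - (1.0 - gamma) * h) / (dt * dt)
    # Take the RL action when safe, otherwise clip to the safe bound,
    # then enforce actuator limits.
    return max(a_lo, min(a_rl, a_max_safe, a_hi))
```

When the gap is large, the filter is inactive and the policy's action passes through unchanged; as the gap approaches `d_safe`, the admissible acceleration shrinks, so safety is enforced without retraining the policy. This pointwise projection is a common lightweight alternative to solving a full CBF quadratic program at each step.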