In this article, we propose a downlink communication scheme for large-scale, high-interference unmanned aerial vehicle (UAV) swarm networks based on nonorthogonal multiple access (NOMA), clustering, and reinforcement learning (RL). Because a large number of UAVs increases the complexity of downlink communication, we first introduce a load-balancing fuzzy C-Means (LB-FCMs) algorithm for UAV clustering. Downlink communication consists of three stages: 1) UAV clustering; 2) data aggregation; and 3) data offloading. We pursue two goals: 1) maximizing the data aggregation rate of the network while ensuring fairness of UAVs’ spectrum access for UAV-to-UAV (U2U) communications during data aggregation and 2) maximizing the network data offloading rate while ensuring ground station priority for UAV-to-ground (U2G) communications during data offloading. To address these two problems, we first introduce uplink NOMA for the former and downlink NOMA for the latter to eliminate part of the intrasystem interference. We then propose a multiagent RL framework for optimizing channel, transmit power, and trajectory scheduling (MARL-CPT). MARL-CPT comprises two algorithms, one for the optimization problem of each of the two stages. Simulation results show that the proposed method outperforms random decision-making and polling-based single-agent RL methods in terms of final score, fairness, and priority. For trajectory scheduling during data offloading, our method finds the optimal hover position in less than half the time required by single-agent RL methods.
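To make the clustering stage concrete, the following is a minimal sketch of the standard fuzzy C-means update on which a scheme such as LB-FCMs could build. It is an illustration under stated assumptions, not the algorithm proposed in this article: the load-balancing modification is not specified in this abstract and is not reproduced here, and the function name `fuzzy_c_means`, its parameters, and the synthetic 2-D positions are all hypothetical.

```python
import numpy as np


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy C-means (no load balancing; illustrative only).

    points: (N, D) array of positions (hypothetical UAV coordinates).
    Returns (centers, membership) with shapes (C, D) and (N, C).
    """
    rng = np.random.default_rng(seed)
    n_points = points.shape[0]
    # Random initial memberships; each row is normalized to sum to 1.
    u = rng.random((n_points, n_clusters))
    u /= u.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        um = u ** m
        # Cluster centers are membership-weighted means of the points.
        centers = (um.T @ points) / um.sum(axis=0)[:, None]
        # Distance from every point to every center, floored to avoid division by zero.
        dist = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        dist = np.maximum(dist, 1e-12)
        # Membership update: u_ik is proportional to d_ik^(-2 / (m - 1)).
        inv = dist ** (-2.0 / (m - 1.0))
        new_u = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(new_u - u).max() < tol
        u = new_u
        if converged:
            break
    return centers, u


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Two synthetic, well-separated groups of 2-D positions (made-up data).
    group_a = rng.normal(loc=(0.0, 0.0), scale=0.5, size=(20, 2))
    group_b = rng.normal(loc=(10.0, 10.0), scale=0.5, size=(20, 2))
    positions = np.vstack([group_a, group_b])
    centers, membership = fuzzy_c_means(positions, n_clusters=2)
    # Hard assignment: each point joins the cluster with the highest membership.
    labels = membership.argmax(axis=1)
    print(centers)
    print(np.bincount(labels, minlength=2))
```

The soft memberships returned here are what a load-balancing variant could plausibly act on, for example by reassigning UAVs with ambiguous membership to less crowded clusters; that step is an assumption about how such a variant might work and is left out of the sketch.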