To address the problem of multiple unmanned aerial vehicles (UAVs) cooperatively pursuing and intercepting non-cooperative UAVs in complex environments, we propose a prioritized experience replay multi-agent deep deterministic policy gradient (PERMADDPG) algorithm based on deep reinforcement learning. We design the algorithm model, state space, action space, and reward function, and incorporate prioritized experience replay, so that the UAVs learn pursuit and interception strategies within a centralized training and distributed execution framework. The algorithm extends research to scenarios in which the escaping UAVs have superior speed and maneuverability compared with the pursuing UAVs, and it alleviates the large oscillation amplitude and slow convergence that the MADDPG algorithm exhibits in complex environments. Simulation experiments verify the effectiveness of the proposed algorithm in solving the many-to-many UAV pursuit and interception problem. Compared with the traditional MADDPG algorithm, the proposed algorithm improves the pursuit success rate and reduces response time, while converging faster and with smaller oscillation amplitude.
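
For readers unfamiliar with prioritized experience replay, the following is a minimal sketch of the commonly used proportional variant, not the PERMADDPG implementation itself. The class name `PrioritizedReplayBuffer`, the hyperparameters `alpha`, `beta`, and `eps`, and the use of the absolute temporal-difference (TD) error as the priority are all assumptions made for illustration; in a multi-agent setting each stored transition would typically hold the joint observations, actions, and rewards of all agents.

```python
import random


class PrioritizedReplayBuffer:
    """Proportional prioritized replay (illustrative sketch, O(N) sampling)."""

    def __init__(self, capacity, alpha=0.6, eps=1e-6):
        self.capacity = capacity
        self.alpha = alpha      # assumed: how strongly priorities skew sampling
        self.eps = eps          # assumed: keeps every priority strictly positive
        self.data = []          # stored transitions (any Python object)
        self.priorities = []    # one priority per stored transition
        self.pos = 0            # next slot to overwrite once the buffer is full

    def add(self, transition):
        # New transitions get the current maximum priority so that each one
        # is likely to be replayed at least once.
        p = max(self.priorities, default=1.0)
        if len(self.data) < self.capacity:
            self.data.append(transition)
            self.priorities.append(p)
        else:
            self.data[self.pos] = transition
            self.priorities[self.pos] = p
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4, rng=random):
        # Sampling probability P(i) = p_i^alpha / sum_k p_k^alpha.
        scaled = [p ** self.alpha for p in self.priorities]
        total = sum(scaled)
        probs = [s / total for s in scaled]
        n = len(self.data)
        idx = rng.choices(range(n), weights=probs, k=batch_size)
        # Importance-sampling weights (N * P(i))^(-beta), normalized by the
        # largest weight in the batch so that they lie in (0, 1].
        weights = [(n * probs[i]) ** (-beta) for i in idx]
        max_w = max(weights)
        weights = [w / max_w for w in weights]
        return idx, [self.data[i] for i in idx], weights

    def update_priorities(self, indices, td_errors):
        # After a critic update, set each sampled priority to |TD error| + eps.
        for i, err in zip(indices, td_errors):
            self.priorities[i] = abs(err) + self.eps
```

In a training loop under these assumptions, the critic loss for each sampled transition would be multiplied by its importance-sampling weight, and `update_priorities` would be called with the fresh TD errors. Transitions with large errors are then replayed more often, which is the mechanism usually credited with faster convergence.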