Motion planning for multiple unmanned aerial vehicles (UAVs) is challenging because they have a larger configuration space than autonomous ground vehicles (AGVs). Additionally, UAV environments involve more significant uncertainties and disturbances, which makes inter-UAV navigation more difficult. In this letter, we propose a two-stage reinforcement learning (RL) based multi-UAV collision avoidance approach that models the uncertainty or noise in the environment. Our objective is to train a policy that can plan a collision-free path using only local noisy observations. However, unlike supervised learning, RL lacks a stable training data set with ground-truth labels, so its collision avoidance policies often show high variance and are difficult to reproduce. To address these issues, we introduce a two-stage training method for RL-based collision avoidance. In the first stage, we optimize the policy using supervised learning with a loss function that encourages the agents to follow the well-known reciprocal collision avoidance strategy. In the second stage, we fine-tune the policy with reinforcement learning. Extensive simulation results demonstrate that this approach can handle noisy local observations with varying noise levels and can generate time-efficient, collision-free paths under imperfect sensing. We evaluate the performance of our policies from several perspectives.