This paper proposes a reinforcement learning-based beam tracking algorithm for directional UAV ad hoc networks, in which the high mobility of the nodes makes directional beam links prone to interruption. The algorithm builds on the communication-adjustment mechanism and introduces the Q-learning algorithm. First, the beam tracking problem is introduced and analyzed, and a key weakness of the communication-adjustment mechanism is identified: beam adjustment lags behind the relative motion of the nodes. Second, to address this lag, beam tracking in the directional UAV ad hoc network is modeled as a reinforcement learning problem in which the UAV nodes are the agents, adjustments of a node's beam pointing form the action space, and the relative motion of the nodes together with the beam alignment effect forms the state space. During network operation, each node updates its Q-table through temporal-difference learning and adjusts its target beam pointing accordingly. Finally, the beam tracking performance of the benchmark method and the Q-learning method is simulated and analyzed under different conditions. The results show that the Q-learning method outperforms the benchmark method in both beam alignment accuracy and communication success rate.
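As an illustration of the Q-table update described above, the following is a minimal sketch of tabular Q-learning with a temporal-difference update. It is not the paper's implementation. The class name `BeamTrackingAgent`, the three-valued action set (shift the beam by one sector left, hold, or shift right), the tuple encoding of the state, the epsilon-greedy policy, and the values of `alpha`, `gamma`, and `epsilon` are all assumptions made for this example.

```python
import random
from collections import defaultdict

# Assumed action set: shift beam pointing by -1, 0, or +1 sector (hypothetical).
ACTIONS = (-1, 0, 1)


class BeamTrackingAgent:
    """One UAV node acting as a tabular Q-learning agent (illustrative only)."""

    def __init__(self, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.alpha = alpha      # learning rate (assumed value)
        self.gamma = gamma      # discount factor (assumed value)
        self.epsilon = epsilon  # exploration probability (assumed value)
        self.rng = random.Random(seed)
        # Q-table: state -> {action: value}; unseen states start at zero.
        self.q = defaultdict(lambda: {a: 0.0 for a in ACTIONS})

    def choose_action(self, state):
        """Epsilon-greedy choice of a beam-pointing adjustment."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(ACTIONS)
        values = self.q[state]
        return max(ACTIONS, key=lambda a: values[a])

    def update(self, state, action, reward, next_state):
        """Standard Q-learning temporal-difference update; returns the TD error."""
        best_next = max(self.q[next_state].values())
        td_error = reward + self.gamma * best_next - self.q[state][action]
        self.q[state][action] += self.alpha * td_error
        return td_error
```

In this sketch a state could be a tuple such as `(relative_motion, alignment_offset)` with both quantities discretized, and the reward could be higher when the beam stays aligned after the adjustment. Both choices are placeholders, since the abstract does not specify the discretization or the reward function.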