With the rapid development of 5G and the Internet of Things, billions of smart devices will be connected to the network, the Internet will become more heterogeneous and complex, and network traffic will continue to increase. Managing queues and reducing congestion while meeting users' requirements for low latency and high throughput is therefore an urgent problem. Traditional active queue management (AQM) algorithms adjust the packet drop probability according to current and past traffic intensity, network load, queue length, queuing delay, and other factors. Because they react only to conditions that have already been observed, they lag behind the traffic. In a network whose traffic changes drastically, this lag and the resulting difficulty in responding quickly become more pronounced, leading to more frequent congestion, a higher packet loss rate on the link, and difficulty in maintaining link utilization. This paper proposes QP-AQM, an active queue management algorithm based on a Q-learning traffic predictor. It models network traffic as a Markov decision process and uses an improved Q-learning algorithm to predict network traffic. It then converts the traffic prediction into a predicted value of the average queue length and uses this prediction to adaptively modify the parameters of the ARED algorithm. This addresses the poor performance of ARED on highly congested links and further improves the throughput and latency performance of the AQM algorithm.
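The pipeline described above can be illustrated with a minimal sketch. It is not the QP-AQM implementation: the abstract does not specify the improved Q-learning variant, the state and reward design, or the mapping from predicted traffic to predicted queue length. The sketch therefore assumes the standard tabular Q-learning update and the standard ARED adaptation of `max_p`, and the only change it makes is to feed ARED a predicted average queue length instead of the measured one. The names `AredState`, `adapt_max_p`, and `q_update` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AredState:
    """Hypothetical container for the ARED parameters used in this sketch."""
    min_th: float        # minimum queue threshold (packets)
    max_th: float        # maximum queue threshold (packets)
    max_p: float = 0.1   # maximum drop probability, adapted over time


def target_range(s: AredState) -> tuple[float, float]:
    """Standard ARED target band for the average queue length."""
    span = s.max_th - s.min_th
    return s.min_th + 0.4 * span, s.min_th + 0.6 * span


def adapt_max_p(s: AredState, predicted_avg_queue: float, beta: float = 0.9) -> float:
    """Standard ARED AIMD update of max_p.

    Assumption: the *predicted* average queue length replaces the measured
    one, which is how this sketch interprets the coupling in the abstract.
    """
    low, high = target_range(s)
    if predicted_avg_queue > high and s.max_p <= 0.5:
        # Additive increase: drop more aggressively before congestion builds.
        s.max_p += min(0.01, s.max_p / 4)
    elif predicted_avg_queue < low and s.max_p >= 0.01:
        # Multiplicative decrease: relax dropping when the queue will be short.
        s.max_p *= beta
    return s.max_p


def q_update(q, state, action, reward, next_state, lr=0.1, gamma=0.9):
    """Standard tabular Q-learning update (not the paper's improved variant)."""
    best_next = max(q[next_state])
    q[state][action] += lr * (reward + gamma * best_next - q[state][action])
    return q[state][action]
```

In this sketch, a predictor trained with `q_update` would supply the traffic estimate, some conversion step (not specified in the abstract) would turn it into `predicted_avg_queue`, and `adapt_max_p` would be called once per adaptation interval.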