This article considers a wireless network consisting of unmanned aerial vehicles (UAVs), deployed as aerial base stations, and a large number of terrestrial users randomly distributed in a dense urban area. The main objective of this work is to maximize the users’ downlink rate jointly with user clustering and 2-D initial placement of the UAVs, which effectively minimizes the clustering error. To achieve this goal, we estimate the users’ next locations with a deep echo-state network (ESN), which captures the users’ movement pattern with high accuracy. Then, we propose single- and multiagent actor–critic (AC) algorithms for the UAVs’ initial deployment and trajectory design, where the multiagent scheme employs an efficient bandwidth allocation. Simulation results supported by a real data set of terrestrial users’ coordinates indicate that, when the deep ESN algorithm is used, the prediction accuracy is 93.75% for longitude and 88.36% for latitude compared to the simple ESN performance. Moreover, the single- and multiagent AC algorithms display better performance in terms of downlink rate and convergence speed than value-based algorithms such as deep $Q$-network schemes.