In wireless systems, unmanned aerial vehicles (UAVs) can play an important role because they can be deployed flexibly to improve communication coverage and quality. In this paper, we investigate a UAV acting as a wireless base station to serve a post-disaster area with an unknown user distribution. Since users have different quality-of-service requirements, they need to offload some computing tasks to the UAV for processing, and the UAV in turn must make optimal resource decisions to meet those requirements. By jointly optimizing task offloading, bandwidth allocation, the ground users' local computing resource allocation, and the UAV's computing resource allocation, we propose an algorithm based on the deep deterministic policy gradient (DDPG) method in reinforcement learning (RL) to minimize the total energy consumption of the ground users and the UAV. Simulation results demonstrate that the DDPG-based resource management scheme converges within about 100 episodes and, compared with baseline algorithms, performs significantly better in terms of energy consumption in post-disaster areas where the user location distribution is unknown.
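The DDPG-based joint resource decision described above can be sketched as follows. This is a minimal illustrative skeleton, not the paper's actual design: the per-user state layout, the four-part action vector (offload ratio, bandwidth share, local CPU share, UAV CPU share), the single-layer networks, and the noise and soft-update parameters are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (assumptions, not taken from the paper):
# per-user state = [task size, required CPU cycles, channel gain],
# per-user action = [offload ratio, bandwidth share,
#                    local CPU share, UAV CPU share].
N_USERS = 4
STATE_DIM = 3 * N_USERS
ACTION_DIM = 4 * N_USERS

def init_params(in_dim, out_dim):
    """One linear layer as a stand-in for the actor/critic MLPs DDPG uses."""
    return {"W": rng.normal(0, 0.1, (out_dim, in_dim)), "b": np.zeros(out_dim)}

def actor(params, state):
    """Deterministic policy mu(s): sigmoid-squash the output so every
    entry is a valid fraction in (0, 1)."""
    z = params["W"] @ state + params["b"]
    return 1.0 / (1.0 + np.exp(-z))

def soft_update(target, online, tau=0.005):
    """DDPG target-network tracking: theta' <- tau*theta + (1-tau)*theta'."""
    for k in target:
        target[k] = tau * online[k] + (1.0 - tau) * target[k]
    return target

actor_params = init_params(STATE_DIM, ACTION_DIM)
actor_target = {k: v.copy() for k, v in actor_params.items()}

state = rng.uniform(0, 1, STATE_DIM)
# Exploration: perturb the deterministic action with clipped Gaussian noise,
# keeping each resource fraction inside [0, 1].
action = np.clip(actor(actor_params, state) + rng.normal(0, 0.1, ACTION_DIM),
                 0.0, 1.0)
soft_update(actor_target, actor_params)
print(action.shape)  # prints (16,)
```

In a full implementation the critic Q(s, a), a replay buffer, and gradient updates of both networks would complete the DDPG loop; the slow target update shown here is what stabilizes training of the continuous resource-allocation policy.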