To minimize the makespan of the flexible job-shop scheduling problem (FJSP), an end-to-end deep reinforcement learning (DRL) algorithm is proposed. First, the complex constraints and information of the FJSP are described by a dynamic heterogeneous graph, through which the scheduling process is transformed into a Markov decision process (MDP). Second, a two-stage heterogeneous graph neural network (HGNN) is designed to extract workshop state information, and a multilayer perceptron (MLP) is used for end-to-end decision-making. Finally, the model is trained with the proximal policy optimization (PPO) algorithm. To evaluate the performance and effectiveness of the algorithm, experiments were conducted on standard benchmark sets and compared with other approximate solution methods. The results show that our algorithm outperforms various priority dispatching rules (PDRs) in solution quality, and achieves faster solving speeds than meta-heuristic algorithms.
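The pipeline described above — a heterogeneous graph state, two-stage HGNN feature extraction, and an MLP that scores dispatching actions — can be illustrated with a minimal NumPy sketch. This is purely schematic: the instance sizes, the `compat` compatibility matrix, and all weight matrices are hypothetical and randomly initialized (a real model would learn them via PPO, which is not shown), and the two message-passing stages stand in for the paper's actual HGNN architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy FJSP state (hypothetical sizes): 4 unscheduled operations, 3 machines.
n_ops, n_mach, d = 4, 3, 8
# compat[i, j] = 1 if machine j can process operation i (heterogeneous graph edges).
compat = np.array([[1, 1, 0],
                   [0, 1, 1],
                   [1, 0, 1],
                   [1, 1, 1]], dtype=float)

# Raw node features for the two node types (operations and machines).
op_x = rng.normal(size=(n_ops, d))
mach_x = rng.normal(size=(n_mach, d))

# Randomly initialized weights standing in for learned HGNN/MLP parameters.
W1 = rng.normal(size=(d, d)) * 0.1
W2 = rng.normal(size=(d, d)) * 0.1
Wa = rng.normal(size=(2 * d, d)) * 0.1
wb = rng.normal(size=(d,)) * 0.1

def relu(x):
    return np.maximum(x, 0.0)

# Stage 1: each machine aggregates features of its compatible operations.
mach_h = relu((compat.T @ op_x) @ W1 + mach_x)
# Stage 2: each operation aggregates the updated machine embeddings.
op_h = relu((compat @ mach_h) @ W2 + op_x)

# MLP head: score every (operation, machine) pair from concatenated embeddings.
pair = np.concatenate([np.repeat(op_h[:, None, :], n_mach, axis=1),
                       np.repeat(mach_h[None, :, :], n_ops, axis=0)], axis=2)
scores = relu(pair @ Wa) @ wb          # shape (n_ops, n_mach)
scores = np.where(compat > 0, scores, -np.inf)  # mask infeasible pairs

# Greedy dispatch: assign the highest-scoring feasible operation-machine pair.
op_idx, m_idx = np.unravel_index(np.argmax(scores), scores.shape)
print(f"dispatch operation {op_idx} on machine {m_idx}")
```

At each scheduling step the trained policy would select one operation-machine pair like this, update the graph state, and repeat until all operations are scheduled; PPO adjusts the weights so that the resulting makespan shrinks.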