A deep reinforcement learning (DRL) control strategy is proposed for automatic generation control (AGC) operation of combined heat and power (CHP) units under varying power system dynamic conditions. First, the control problem is formulated as a Markov decision process (MDP) and solved with the deep deterministic policy gradient (DDPG) algorithm to improve the load control stability of the CHP units. Second, a reward and penalty mechanism is designed to maintain the control performance of the main steam pressure, the power output, and the extraction pressure of the CHP units. Owing to the rapid adaptability of DRL, the settling time and the fluctuation range of these controlled variables are significantly reduced when an AGC disturbance occurs. Finally, a comparison shows that the DRL strategy outperforms the traditional PID feedback control strategy.
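The abstract does not give the form of the reward and punishment mechanism, so the following is only a minimal sketch of one plausible shape, not the authors' formulation. It assumes normalized tracking errors for the three controlled variables; the names `ChpErrors` and `reward`, and all weights and thresholds, are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class ChpErrors:
    """Hypothetical normalized tracking errors (measured value minus setpoint)."""
    steam_pressure: float       # main steam pressure error
    power: float                # power output error
    extraction_pressure: float  # extraction pressure error


def reward(err: ChpErrors,
           weights=(1.0, 1.0, 1.0),  # assumed relative importance of each variable
           tolerance=0.02,           # assumed band in which tracking counts as good
           bonus=1.0,                # assumed reward for staying inside the band
           limit=0.2,                # assumed safety limit on any single error
           penalty=10.0) -> float:   # assumed punishment for exceeding the limit
    """Return a scalar reward: weighted error cost, plus a bonus or a penalty."""
    errors = (abs(err.steam_pressure), abs(err.power), abs(err.extraction_pressure))

    # Base term: the larger the weighted deviation, the lower the reward.
    r = -sum(w * e for w, e in zip(weights, errors))

    # Reward: all three variables are within the tolerance band.
    if all(e <= tolerance for e in errors):
        r += bonus

    # Punishment: at least one variable has left the assumed safe range.
    if any(e > limit for e in errors):
        r -= penalty

    return r


if __name__ == "__main__":
    print(reward(ChpErrors(0.0, 0.0, 0.0)))   # perfect tracking
    print(reward(ChpErrors(0.1, 0.0, 0.0)))   # moderate deviation
    print(reward(ChpErrors(0.3, 0.0, 0.0)))   # limit violation
```

In a DDPG training loop, a function of this kind would be evaluated at every control step on the simulated CHP response, so the agent is pushed toward actions that keep all three variables close to their setpoints after an AGC disturbance.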