Quantum state preparation is a central problem in quantum control and variational quantum algorithms, which use iterative optimization to evolve an initial state to a target state. The Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm in reinforcement learning achieves high learning efficiency and stability in continuous control tasks. Here, using TD3, we propose a quantum state preparation method that finds a suitable evolution path to the desired state without case-by-case optimization. Specifically, we input the initial state into the trained actor-network, which outputs the parameters of the unitary gates step by step, gradually evolving the initial state to a fixed quantum state. Because unitary transformations are reversible, we can then obtain a sequence of unitary gates that evolves the fixed state to the desired state. To verify the effectiveness of the algorithm, we perform simulations for the one-qubit, two-qubit, and four-qubit cases; the results show that the trained actor-network provides appropriate unitary transformations to reach the fixed state.
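The reversibility argument can be illustrated with a minimal one-qubit sketch in NumPy. It is not the paper's implementation: the fixed state is assumed to be |0>, and the hypothetical helper `gates_to_fixed_state` uses a closed-form rule as a stand-in for the trained actor-network. All function names are illustrative. If gates U_1, ..., U_n map the desired state to the fixed state, then applying their conjugate transposes in reverse order maps the fixed state back to the desired state.

```python
import numpy as np

def ry(t):
    """Rotation about the y axis by angle t."""
    c, s = np.cos(t / 2), np.sin(t / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)

def rz(p):
    """Rotation about the z axis by angle p."""
    return np.array([[np.exp(-1j * p / 2), 0], [0, np.exp(1j * p / 2)]])

def gates_to_fixed_state(psi):
    """Hypothetical stand-in for the trained actor-network (assumption):
    returns gates, in application order, that map psi to |0> up to a global phase."""
    a, b = psi
    theta = 2 * np.arctan2(abs(b), abs(a))
    phi = np.angle(b) - np.angle(a)
    return [rz(-phi), ry(-theta)]

def apply_gates(gates, state):
    """Apply the gates to the state in list order."""
    for g in gates:
        state = g @ state
    return state

def invert(gates):
    """Reverse the order and take the conjugate transpose of each gate."""
    return [g.conj().T for g in reversed(gates)]

def fidelity(a, b):
    """Squared overlap |<a|b>|^2 between two normalized pure states."""
    return abs(np.vdot(a, b)) ** 2

fixed = np.array([1, 0], dtype=complex)  # assumed fixed state |0>

# A random normalized one-qubit state plays the role of the desired state.
rng = np.random.default_rng(0)
desired = rng.normal(size=2) + 1j * rng.normal(size=2)
desired /= np.linalg.norm(desired)

forward = gates_to_fixed_state(desired)         # desired -> fixed
prepared = apply_gates(invert(forward), fixed)  # fixed -> desired

print("fidelity with fixed state:  ", fidelity(apply_gates(forward, desired), fixed))
print("fidelity with desired state:", fidelity(prepared, desired))
```

Both fidelities should be close to 1 up to floating-point error. In the method described above, the gate parameters would come from the actor-network step by step rather than from a closed-form rule; only the inversion step is the same.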