This paper develops a parallel reinforcement learning control scheme for discrete-time Markov jump systems, focusing on how to obtain the optimal controller without using knowledge of the system dynamics. Within the frameworks of policy iteration and value iteration, we first develop two offline parallel learning algorithms that require accurate system models. We then propose two novel online parallel model-free learning algorithms based on reinforcement learning. Finally, the effectiveness of the developed learning scheme is verified on an inverted pendulum system model.