Reinforcement learning has been used to solve sequential decision-making problems in intelligent systems. However, current RL approaches suffer from slow convergence and reward sparsity, and their reward mechanisms struggle to encode complex task specifications. As a formalism widely used in software engineering practice, temporal logic can describe non-Markovian task specifications, and the strategies synthesized from it can serve as prior knowledge to train agents to interact with the environment efficiently. This paper considers an intelligent agent that reacts to its environment under a high-level reactive temporal logic specification called Generalized Reactivity of rank 1 (GR(1)). We first use the strategy synthesized from the GR(1) specification to construct a Markov Decision Process with a potential-based reward machine, which integrates the environment with the high-level reactive temporal specification. We then develop a topological-sort-based reward shaping approach to calculate the potential functions of the reward machine, based on which we use Q-learning to train the agents. Experiments on multitask learning show that the proposed approach outperforms state-of-the-art algorithms in learning speed and obtained rewards. Moreover, unlike value-iteration-based reward shaping approaches, our topological-sort-based approach can handle cases where the synthesized strategies take the form of directed cyclic graphs.
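To make the shaping idea concrete, the following is a minimal sketch, not the paper's exact algorithm: it assigns each reward-machine state a potential based on its graph distance to an accepting state (computed by a backward traversal, which also works when the strategy graph contains cycles), and then applies standard potential-based shaping r' = r + γΦ(u') − Φ(u). The potential scheme γ^d and all identifiers here are illustrative assumptions.

```python
from collections import defaultdict, deque

GAMMA = 0.9  # discount factor (illustrative value)

def topological_potentials(edges, accepting, gamma=GAMMA):
    """Assign each reward-machine state a potential gamma**d, where d is the
    shortest number of transitions to an accepting state.
    Hypothetical scheme: the paper's actual potential function may differ."""
    # Build the reverse adjacency list so we can walk backward from
    # accepting states; this traversal is well-defined even on cyclic graphs.
    rev = defaultdict(list)
    states = set()
    for u, v in edges:
        rev[v].append(u)
        states.update((u, v))
    dist = {s: None for s in states}
    queue = deque()
    for a in accepting:
        dist[a] = 0
        queue.append(a)
    # BFS on the reversed graph yields each state's remaining distance;
    # on a DAG this matches processing states in reverse topological order.
    while queue:
        v = queue.popleft()
        for u in rev[v]:
            if dist[u] is None:
                dist[u] = dist[v] + 1
                queue.append(u)
    # States closer to acceptance receive higher potential;
    # unreachable states get potential 0.
    return {s: (gamma ** d if d is not None else 0.0)
            for s, d in dist.items()}

def shaped_reward(r, u, u_next, phi, gamma=GAMMA):
    """Potential-based reward shaping: r' = r + gamma*phi(u') - phi(u)."""
    return r + gamma * phi[u_next] - phi[u]
```

For example, on a three-state chain q0 → q1 → q2 with q2 accepting, the potentials are Φ(q2) = 1, Φ(q1) = 0.9, Φ(q0) = 0.81, so transitions that advance toward acceptance receive a positive shaping bonus.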