Temporal difference (TD) prediction error signal models are instrumental in simulating brain function during reinforcement learning (RL). Recent evidence suggests that TD prediction error signals play a significant role in the brain networks responsible for action selection and action execution. We introduce a novel neuro-computational model that addresses the effects of TD error signal variations on RL in action-selection and action-execution networks. These networks represent the basal ganglia and prefrontal cortex, while the TD prediction error signal represents the neurotransmitter dopamine. The model incorporates dopamine-related genetic parameters in the two networks (the COMT gene for action selection; the DAT1 gene for action execution), yielding four parameter combinations. Simulations showed that TD signaling in both networks plays a significant role in RL, with performance optimal under medium, rather than high, TD signals. Moreover, each parameter combination produced a unique pattern of RL, consistent with experimental data obtained using a computer-based RL task.
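As a minimal illustrative sketch of the TD prediction error signal underlying such models, the following computes the classic TD(0) error and applies a dopamine-like gain factor to value updates in a simple two-armed bandit task. The gain parameter, the bandit task, and all reward probabilities here are illustrative assumptions for exposition, not the paper's actual model or parameters:

```python
import random

def td_error(reward, v_next, v_curr, gamma=0.9):
    """Classic TD(0) prediction error: delta = r + gamma * V(s') - V(s)."""
    return reward + gamma * v_next - v_curr

def run_bandit(dopamine_gain, n_trials=2000, alpha=0.1, seed=0):
    """Two-armed bandit: arm 0 pays reward with p=0.8, arm 1 with p=0.2.

    `dopamine_gain` scales the TD error before the value update, standing
    in (hypothetically) for dopamine-level differences such as those the
    COMT/DAT1 parameters would induce in the full model.
    """
    rng = random.Random(seed)
    values = [0.0, 0.0]
    for _ in range(n_trials):
        # Epsilon-greedy action selection (10% exploration)
        if rng.random() < 0.1:
            arm = rng.randrange(2)
        else:
            arm = max(range(2), key=lambda a: values[a])
        reward = 1.0 if rng.random() < (0.8 if arm == 0 else 0.2) else 0.0
        # Single-step task: no successor state, so gamma term drops out
        delta = td_error(reward, 0.0, values[arm], gamma=0.0)
        values[arm] += alpha * dopamine_gain * delta
    return values
```

With a gain of 1.0, the learned values approach the true reward probabilities of the two arms, so the higher-payoff arm ends up with the larger value estimate.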