Functional electrical stimulation (FES) is an effective technology for post-stroke rehabilitation of the upper limbs. Because of the complexity of the system, traditional linear controllers are still far from producing accurate and natural movements. In this work, we apply reinforcement learning (RL) to design a nonlinear controller for an upper limb FES system combined with a passive exoskeleton. RL methods learn by interacting with the environment; to use the collected data efficiently, we simulated large numbers of experience episodes with artificial neural network (ANN) models of the electrically stimulated arm muscles. The performance of the novel controller was compared with that of a PID controller on five healthy subjects during planar reaching tasks. Both controllers correctly drove the arm to the target position, with a mean absolute error < 1°. The RL controller significantly outperformed the PID controller in terms of settling time, position accuracy and smoothness. Future trials are needed to confirm these promising results.