This paper presents a Reinforcement Learning (RL) accelerator that implements the Q-learning algorithm with optimized bit precision. We evaluate how the bit width of the data path affects the accuracy of the Q-values. The accelerator comprises several blocks: Q-value memories, a Q-Updater, a Policy Generator, and an Environment block. We also present the corresponding architecture and implement the design on an FPGA. Experimental results show that the data path can be reduced from 32 bits to 16 bits without sacrificing accuracy: accuracy is maintained at around 88% with a 16-bit data path that has a 10-bit fraction. Moreover, the 16-bit RL accelerator reduces the number of LUTs and FFs by around 40% and 14%, respectively, compared to the 32-bit implementation. Hence, the optimized accelerator can be useful for low-complexity or resource-limited systems, such as robot automation for smart navigation and smart mapping.
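To make the reduced-precision data path concrete, the following is a minimal software sketch of a single Q-value update carried out in 16-bit fixed-point arithmetic with a 10-bit fraction. It is an illustration only, not the accelerator's actual Q-Updater: the signed two's-complement format, the saturation on overflow, the truncating shift after multiplication, and all function names (`to_fixed`, `fx_mul`, `q_update`) are assumptions introduced here.

```python
# Illustrative fixed-point Q-learning update (assumed format: signed, 16-bit word, 10-bit fraction).
WORD_BITS = 16                       # total data path width, from the abstract
FRAC_BITS = 10                       # fractional bits, from the abstract
SCALE = 1 << FRAC_BITS               # 1.0 is represented as 1024
Q_MIN = -(1 << (WORD_BITS - 1))      # most negative code (assumes two's complement)
Q_MAX = (1 << (WORD_BITS - 1)) - 1   # most positive code


def saturate(x: int) -> int:
    """Clamp an integer to the 16-bit range (assumed overflow behavior)."""
    return max(Q_MIN, min(Q_MAX, x))


def to_fixed(x: float) -> int:
    """Quantize a real value to the fixed-point code."""
    return saturate(int(round(x * SCALE)))


def to_float(q: int) -> float:
    """Convert a fixed-point code back to a real value."""
    return q / SCALE


def fx_mul(a: int, b: int) -> int:
    """Fixed-point multiply: the raw product carries 2*FRAC_BITS fractional
    bits, so shift right by FRAC_BITS (this truncates toward minus infinity)."""
    return saturate((a * b) >> FRAC_BITS)


def q_update(q_sa: int, reward: int, max_q_next: int, alpha: int, gamma: int) -> int:
    """Standard Q-learning update, Q <- Q + alpha * (r + gamma * max Q' - Q),
    with every operand and intermediate result held as a fixed-point code."""
    target = saturate(reward + fx_mul(gamma, max_q_next))
    td_error = saturate(target - q_sa)
    return saturate(q_sa + fx_mul(alpha, td_error))


if __name__ == "__main__":
    # Hypothetical values chosen so that every quantity is exactly representable.
    new_q = q_update(
        q_sa=to_fixed(0.0),
        reward=to_fixed(1.0),
        max_q_next=to_fixed(2.0),
        alpha=to_fixed(0.5),
        gamma=to_fixed(0.5),
    )
    print(to_float(new_q))
```

Under these assumptions the representable range is roughly -32 to +32 with a resolution of 1/1024, so rewards and Q-values would need to be scaled to fit that range; comparing the output of `q_update` with a floating-point reference is one way to estimate the quantization error for a given fraction width.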