Traffic congestion is a major problem for almost every city around the globe, and in most cities it becomes even worse during rush hour. To address the limitations of existing traffic lights, which operate on fixed-time intervals, many researchers have been developing intelligent traffic lights that work in real time based on machine learning algorithms. Unfortunately, the time needed to run such algorithms grows as traffic gets worse. In this paper, we propose a Q-Learning accelerator architecture for an intelligent traffic light controller. The architecture shortens the learning computation time and allows traffic lights to work adaptively in real time. It supports a 4-arm crossroad with 4 possible actions, each corresponding to a traffic light signal operation. The architecture has been successfully implemented on a PYNQ-Z1 board and achieves a maximum clock frequency of 100 MHz. At this clock frequency, it offers an acceleration ratio of approximately 1400 times over a fully software Q-Learning implementation running on a 2.3 GHz Intel Core i5 with 8 GB of memory. In addition, we perform HW/SW co-verification by integrating the hardware design with a traffic simulator in order to validate the system under realistic conditions. Evaluation results show that the proposed system performs better than the conventional method; specifically, it can maintain a stable number of waiting vehicles. This result corresponds to a reduction in congestion of around 75%.
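
For readers unfamiliar with the software baseline, the following is a minimal sketch of the standard tabular Q-Learning update with 4 actions. It is not the proposed hardware architecture. The number of states, the learning rate, the discount factor, the reward value, and all function names are illustrative assumptions that do not come from this paper.

```python
import random

N_ACTIONS = 4   # from the abstract: 4 traffic light signal operations
N_STATES = 16   # assumption: hypothetical number of discretized traffic states
ALPHA = 0.5     # assumption: learning rate
GAMMA = 0.9     # assumption: discount factor


def make_q_table(n_states=N_STATES, n_actions=N_ACTIONS):
    """Create a Q-table initialized to zero, one row per state."""
    return [[0.0] * n_actions for _ in range(n_states)]


def q_update(q, state, action, reward, next_state, alpha=ALPHA, gamma=GAMMA):
    """Apply one standard Q-Learning update and return the new Q-value."""
    best_next = max(q[next_state])  # greedy value estimate of the next state
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
    return q[state][action]


def choose_action(q, state, epsilon=0.1, rng=random):
    """Epsilon-greedy selection over the available signal operations."""
    if rng.random() < epsilon:
        return rng.randrange(len(q[state]))  # explore: random action
    row = q[state]
    return row.index(max(row))               # exploit: first best action


if __name__ == "__main__":
    q = make_q_table()
    # Hypothetical transition: state 0, action 2, reward -4 (e.g. waiting vehicles)
    print(q_update(q, state=0, action=2, reward=-4.0, next_state=1))
```

Each update reads one row maximum and performs a handful of multiply-add operations, which is the kind of fixed arithmetic pattern that lends itself to hardware acceleration.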