This paper considers a stochastic linear quadratic (LQ) problem for discrete-time systems with multiplicative noise over an infinite horizon. To obtain the optimal solution, we propose an online iterative reinforcement learning algorithm based on the Bellman dynamic programming principle. The algorithm avoids directly solving the algebraic Riccati equations. It requires only state trajectories over a short interval rather than over all iterations, which significantly simplifies the computation. Given stabilizing initial values, numerical examples illustrate our theoretical results.
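
For orientation, the following is a minimal model-based sketch of the Riccati-type equation that such an algorithm avoids solving directly. It is not the online algorithm proposed in the paper: it assumes full knowledge of the system matrices and uses classical policy iteration. The system form x_{k+1} = A x_k + B u_k + (C x_k + D u_k) w_k with zero-mean, unit-variance scalar noise w_k, the cost E[sum of x'Qx + u'Ru], the function names, and all numerical values are assumptions made for illustration only.

```python
import numpy as np

def evaluate_policy(A, B, C, D, Q, R, K):
    """Solve P = Q + K'RK + Ac'P Ac + Cc'P Cc for a fixed gain K (u = K x)."""
    n = A.shape[0]
    Ac = A + B @ K  # closed-loop drift matrix
    Cc = C + D @ K  # closed-loop noise-gain matrix
    # Row-major vectorization: vec(M' P M) = kron(M', M') vec(P).
    lhs = np.eye(n * n) - np.kron(Ac.T, Ac.T) - np.kron(Cc.T, Cc.T)
    rhs = (Q + K.T @ R @ K).reshape(-1)
    P = np.linalg.solve(lhs, rhs).reshape(n, n)
    return 0.5 * (P + P.T)  # symmetrize against round-off

def improve_policy(A, B, C, D, R, P):
    """Greedy gain K = -(R + B'PB + D'PD)^{-1} (B'PA + D'PC)."""
    G = R + B.T @ P @ B + D.T @ P @ D
    L = B.T @ P @ A + D.T @ P @ C
    return -np.linalg.solve(G, L)

def are_residual(A, B, C, D, Q, R, P):
    """Residual of the generalized algebraic Riccati equation at P."""
    G = R + B.T @ P @ B + D.T @ P @ D
    L = B.T @ P @ A + D.T @ P @ C
    return Q + A.T @ P @ A + C.T @ P @ C - P - L.T @ np.linalg.solve(G, L)

def policy_iteration(A, B, C, D, Q, R, K0, iters=50):
    """Alternate evaluation and improvement from a stabilizing gain K0."""
    K = K0
    for _ in range(iters):
        P = evaluate_policy(A, B, C, D, Q, R, K)
        K = improve_policy(A, B, C, D, R, P)
    return P, K

# Hypothetical example data (not taken from the paper).
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
C = 0.1 * np.eye(2)
D = np.array([[0.0], [0.1]])
Q = np.eye(2)
R = np.array([[1.0]])
K0 = np.zeros((1, 2))  # mean-square stabilizing here because A is stable and C is small

P, K = policy_iteration(A, B, C, D, Q, R, K0)
print("P =\n", P)
print("K =", K)
print("max |ARE residual| =", np.abs(are_residual(A, B, C, D, Q, R, P)).max())
```

The sketch requires the initial gain to be mean-square stabilizing, which mirrors the stabilizing initial values mentioned above. A model-free method would replace the policy evaluation step, which here uses A, B, C, and D explicitly, with estimates built from observed state trajectories.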