The reconfigurable intelligent surface (RIS) is considered one of the key enabling technologies for future 6G wireless communication, as it realizes an intelligent radio environment. An RIS is used as a reflective array to alter the transmission and coverage of radio frequency (RF) signals. In this paper, we propose a deep reinforcement learning (DRL) based RIS beamforming design for practical scenarios in which the RIS may have hardware loss, and we present a soft actor-critic (SAC)-exploration algorithm to solve the beamforming design problem. The algorithm reduces the prediction error by introducing a perturbation signal that influences the action prediction. Simulation results show that the proposed SAC-exploration algorithm significantly outperforms the typical SAC algorithm, which verifies its effectiveness.