In this paper, we consider the optimization of a compressive measurement matrix (CMM) for a massive multiple-input multiple-output (MIMO) system such that both strong and weak signals can be reliably detected. To this end, we propose a reinforcement learning framework in which the base station acts as an agent that interacts with the environment and designs the CMM by selecting actions according to a well-defined reward function. The proposed framework improves the detection of weak signals. The optimized CMM can then be used to reduce the dimension of the received signal, which lowers the number of radio frequency front-end circuits required and makes a massive MIMO system more practical to implement.
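The dimension reduction performed by a CMM can be illustrated with a minimal sketch, in which an M x N matrix maps the N antenna outputs to M compressed measurements with M < N. Everything specific here is an assumption made for illustration and is not taken from the paper: the half-wavelength uniform linear array model, the source angles and amplitudes, the values of N and M, and the helper names `steering_vector`, `random_cmm`, and `compress`. The matrix is drawn with random phases as a placeholder; the reinforcement learning design of the CMM is not reproduced.

```python
import cmath
import math
import random


def steering_vector(num_antennas, angle_rad):
    """Half-wavelength uniform linear array response (illustrative assumption)."""
    return [cmath.exp(1j * math.pi * n * math.sin(angle_rad)) for n in range(num_antennas)]


def random_cmm(num_outputs, num_antennas, rng):
    """Hypothetical unoptimized CMM: random-phase entries scaled by 1/sqrt(N)."""
    scale = 1.0 / math.sqrt(num_antennas)
    return [
        [scale * cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi)) for _ in range(num_antennas)]
        for _ in range(num_outputs)
    ]


def compress(cmm, received):
    """Apply y = Phi x: map N antenna outputs to M compressed measurements."""
    return [sum(phi * x for phi, x in zip(row, received)) for row in cmm]


rng = random.Random(0)
N, M = 64, 8  # number of antennas and of RF chains (illustrative values)

# One strong and one weak source; amplitudes and angles are made up for illustration.
strong = steering_vector(N, math.radians(20.0))
weak = steering_vector(N, math.radians(-35.0))
x = [1.0 * s + 0.1 * w for s, w in zip(strong, weak)]

phi = random_cmm(M, N, rng)
y = compress(phi, x)
print(len(x), "->", len(y))  # prints: 64 -> 8
```

In this sketch only M = 8 measurement channels need to be processed instead of N = 64 antenna outputs. A CMM design method would replace `random_cmm` with a matrix chosen so that the weak component of `x` remains detectable in `y`.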