Recently, many algorithms employing deep neural networks (DNNs) have been developed for image compressed sensing (CS) and have made significant strides in reconstructing image details. However, the performance of DNN-based reconstruction methods is limited by the inherent inductive bias of convolutional operations, which prevents them from capturing long-range dependencies in images. In this work, we propose an Efficient Multi-interval Attention Retractable Network (EMARNet) for CS image reconstruction. EMARNet employs a network architecture composed of multiple residual Transformer blocks as the core of the reconstruction process. Specifically, each Transformer block comprises a sequence of interconnected Dense Attention Blocks (DABs) and Group-wise Multi-interval Sparse Attention Blocks (GM-SABs), together with a residual connection. The cascaded DABs and GM-SABs effectively capture global information and achieve better reconstruction quality. Experimental results on diverse benchmark datasets confirm the superior performance of our proposed approach compared to state-of-the-art CS methods.
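The block structure described above can be illustrated with a minimal, framework-free sketch. This is not the authors' implementation: the projection-free attention, the specific interval-based grouping, and all function names are assumptions introduced only to show how a dense attention stage, a multi-interval sparse attention stage, and a residual connection might be cascaded inside one Transformer block.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(x):
    # Toy self-attention with identity Q/K/V projections (assumption,
    # used only to keep the sketch self-contained).
    scores = softmax(x @ x.T / np.sqrt(x.shape[-1]))
    return scores @ x

def dense_attention_block(x):
    # DAB (assumed behavior): every token attends to all tokens.
    return attention(x)

def multi_interval_sparse_attention(x, interval):
    # GM-SAB (assumed behavior): tokens sampled at a fixed interval
    # form a sparse group and attend only within that group, which
    # still spans the whole sequence at low cost.
    out = np.empty_like(x)
    for start in range(interval):
        idx = np.arange(start, x.shape[0], interval)
        out[idx] = attention(x[idx])
    return out

def residual_transformer_block(x, interval=2):
    # Cascade DAB -> GM-SAB, then add the residual connection.
    y = dense_attention_block(x)
    y = multi_interval_sparse_attention(y, interval)
    return x + y

tokens = np.random.default_rng(0).normal(size=(8, 4))
out = residual_transformer_block(tokens)
print(out.shape)  # (8, 4)
```

In a real network the attention stages would use learned projections and operate on image-patch tokens; the point here is only the dense-then-sparse cascade wrapped by a residual connection.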