Rapid and accurate extraction of block-style rural residential areas is of great significance for rural planning and urbanization. This paper proposes a method for accurately extracting rural residential areas from multi-scale remote sensing images by combining PgNet, a visual attention mechanism algorithm, with YOLOv7, an object detection algorithm. YOLOv7 serves as the detector for coarse positioning of rural blocks, and this pre-retrieval mechanism addresses the difficulty of locating residential areas. PgNet's saliency detection is then applied to the candidate areas to extract them precisely, which further mitigates the loss of precision caused by scale changes. Experiments on the CBDV1.0 clustered building dataset and a self-built dataset verify the feasibility and effectiveness of the proposed method.
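The coarse-to-fine flow described above (detect candidate blocks first, then run saliency extraction only inside each candidate) can be illustrated with a minimal sketch. This is not the paper's implementation: `coarse_to_fine`, `toy_detect`, and `toy_segment` are hypothetical names, the detector and the saliency model are passed in as plain callables, and the toy stand-ins below are simple placeholders rather than YOLOv7 or PgNet.

```python
from typing import Callable, List, Tuple

Image = List[List[float]]        # grayscale image as rows of pixel values
Mask = List[List[int]]           # binary mask, 1 = residential area
Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1), end-exclusive


def coarse_to_fine(
    image: Image,
    detect: Callable[[Image], List[Box]],
    segment: Callable[[Image], Mask],
) -> Mask:
    """Detect candidate boxes, segment each crop, and paste the crop
    masks back into a mask the size of the full image."""
    height, width = len(image), len(image[0])
    full_mask = [[0] * width for _ in range(height)]
    for x0, y0, x1, y1 in detect(image):
        # Clamp the box to the image bounds and skip empty boxes.
        x0, y0 = max(0, x0), max(0, y0)
        x1, y1 = min(width, x1), min(height, y1)
        if x0 >= x1 or y0 >= y1:
            continue
        # Fine stage: the segmenter only ever sees the candidate crop.
        crop = [row[x0:x1] for row in image[y0:y1]]
        crop_mask = segment(crop)
        for dy, mask_row in enumerate(crop_mask):
            for dx, value in enumerate(mask_row):
                if value:
                    full_mask[y0 + dy][x0 + dx] = 1
    return full_mask


# Toy stand-ins (assumptions for illustration, NOT YOLOv7 or PgNet).
def toy_detect(image: Image) -> List[Box]:
    return [(1, 1, 3, 3)]  # one fixed candidate box


def toy_segment(crop: Image) -> Mask:
    return [[1 if v > 0.5 else 0 for v in row] for row in crop]


image = [
    [0.9, 0.0, 0.0, 0.0],
    [0.0, 0.8, 0.2, 0.0],
    [0.0, 0.7, 0.9, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
mask = coarse_to_fine(image, toy_detect, toy_segment)
for row in mask:
    print(row)
```

In this toy run the bright pixel at the top-left corner lies outside the candidate box and is therefore never passed to the segmenter, which mirrors the role of the pre-retrieval step: the fine extraction stage operates only on regions the detector has proposed.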