This paper presents a new method for parsing an indoor scene from an RGB-D image using superpixels of the RGB image and region merging based on depth information. The goal of indoor scene parsing is to reconstruct the walls and floor of a scene, which is assumed to have a Manhattan structure, from an RGB-D image. First, the RGB image of the indoor scene is segmented into superpixels using the SLIC method, and a region merging procedure iteratively merges adjacent superpixels that share common properties in the RGB and depth channels. This ensures that the pixels in a merged region have approximately the same surface normal, which is essential for reconstructing wall and floor planes from an RGB-D image. Second, the resulting regions are converted into planar surfaces using the normal information of each region, and wall segments and walls are then extracted so that furniture and people in the scene do not affect the reconstruction of the wall and floor planes. Finally, the Manhattan-structure wall and floor layout of the scene is reconstructed by applying dynamic programming to the candidate wall segments. Experiments show that the proposed method obtains better 3D structures than those produced by state-of-the-art methods.
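To make the region merging step concrete, the following is a minimal sketch, not the procedure used in this paper. It assumes that each superpixel already has a mean unit normal (estimated from depth) and a pixel count, and it merges adjacent regions whenever their normals differ by less than a threshold angle. The function name `merge_regions`, the parameter `max_angle_deg`, and the normal-only criterion are illustrative assumptions; the RGB-channel criteria mentioned above are omitted.

```python
import math


def normalize(v):
    """Scale a non-zero 3-vector to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)


def merge_regions(normals, sizes, edges, max_angle_deg=10.0):
    """Iteratively merge adjacent regions whose mean normals nearly agree.

    normals: dict region_id -> non-zero normal (nx, ny, nz)
    sizes:   dict region_id -> pixel count
    edges:   iterable of (a, b) pairs of adjacent region ids
    Returns a dict region_id -> id of the merged region containing it.
    Hypothetical helper: the threshold and criterion are assumptions.
    """
    edges = list(edges)
    parent = {r: r for r in normals}
    normal = {r: normalize(n) for r, n in normals.items()}
    size = dict(sizes)
    cos_thresh = math.cos(math.radians(max_angle_deg))

    def find(r):
        # Union-find lookup with path halving.
        while parent[r] != r:
            parent[r] = parent[parent[r]]
            r = parent[r]
        return r

    changed = True
    while changed:  # repeat until no adjacent pair can be merged
        changed = False
        for a, b in edges:
            ra, rb = find(a), find(b)
            if ra == rb:
                continue
            dot = sum(x * y for x, y in zip(normal[ra], normal[rb]))
            if dot >= cos_thresh:
                # Merge rb into ra; the new normal is the size-weighted mean.
                mean = tuple(size[ra] * x + size[rb] * y
                             for x, y in zip(normal[ra], normal[rb]))
                parent[rb] = ra
                normal[ra] = normalize(mean)
                size[ra] += size[rb]
                changed = True
    return {r: find(r) for r in normals}


# Toy example: two floor-like and two wall-like superpixels in a chain.
toy_normals = {0: (0.0, 1.0, 0.0), 1: (0.02, 1.0, 0.0),
               2: (1.0, 0.0, 0.0), 3: (1.0, 0.0, 0.05)}
toy_sizes = {0: 120, 1: 90, 2: 150, 3: 80}
toy_edges = [(0, 1), (1, 2), (2, 3)]
toy_labels = merge_regions(toy_normals, toy_sizes, toy_edges)
```

In this toy input the two floor-like regions end up with one label and the two wall-like regions with another, because the floor and wall normals are roughly perpendicular. A fuller implementation would also compare color statistics before merging, as the description above indicates.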