Three-dimensional data can improve the performance of downstream computer vision tasks. In this paper, computer vision is extended from the two-dimensional world to the three-dimensional world through a 3D reconstruction algorithm. A 3D reconstruction algorithm, 3D Masked autoencoder Transformer Reconstruction (3D-MTR), is proposed that restores a 3D model from a deep understanding of a series of views. 3D-MTR reconstructs 3D models using the high-level semantic information in the series of views. To accurately extract and fuse the feature vectors that carry spatial information in the series of views, the two-dimensional encoder is pre-trained with MAE self-supervised learning. To make the reconstructed model more realistic, the Transformer-based two-dimensional encoder is combined with adversarial training, so that the model is updated through high-level semantic information. To compensate for the uncontrollable nature of adversarial training, a repair network is added at the end of the model. To verify the validity of the model, comparison experiments were conducted against three classic 3D reconstruction models on the ShapeNet dataset, using the Chamfer distance and the intersection-over-union (IoU) as evaluation metrics, and the results were analyzed both qualitatively and quantitatively.
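For readers unfamiliar with the two evaluation metrics named above, the following is a minimal sketch of how they are commonly computed. It is not the evaluation code used in this work. Several details are assumptions made for illustration: the Chamfer distance variant (mean squared nearest-neighbour distance, summed over both directions), the voxel-grid representation for IoU, the 0.5 binarisation threshold, and the function names `chamfer_distance` and `voxel_iou`.

```python
def chamfer_distance(points_a, points_b):
    """Symmetric Chamfer distance between two non-empty 3D point sets.

    Assumed variant: mean squared nearest-neighbour distance, summed over
    both directions. Brute force, O(len(a) * len(b)); fine for small sets.
    """
    if not points_a or not points_b:
        raise ValueError("both point sets must be non-empty")

    def sq_dist(p, q):
        return sum((pi - qi) ** 2 for pi, qi in zip(p, q))

    def one_way(src, dst):
        # Average, over src, of the squared distance to the closest dst point.
        return sum(min(sq_dist(p, q) for q in dst) for p in src) / len(src)

    return one_way(points_a, points_b) + one_way(points_b, points_a)


def voxel_iou(pred, target, threshold=0.5):
    """IoU between two voxel grids given as nested lists of equal shape.

    pred holds occupancy probabilities, binarised at `threshold` (assumed);
    target holds 0/1 ground-truth occupancy.
    """
    intersection = 0
    union = 0
    for plane_p, plane_t in zip(pred, target):
        for row_p, row_t in zip(plane_p, plane_t):
            for p, t in zip(row_p, row_t):
                p_occ = p > threshold
                t_occ = t > 0.5
                intersection += p_occ and t_occ
                union += p_occ or t_occ
    # Convention assumed here: two empty grids count as a perfect match.
    return intersection / union if union else 1.0
```

A lower Chamfer distance and a higher IoU both indicate a reconstruction closer to the ground truth. In practice these metrics are usually computed on batched tensors with a vectorised library; the pure-Python form is used here only to keep the definitions explicit.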