Simultaneous Localization and Mapping (SLAM) based on deep learning is investigated in this paper from the perspective of classical geometric methods. After decades of advances in computer vision and robotics, state-of-the-art monocular SLAM algorithms have achieved impressive performance. Although deep learning-based SLAM can autonomously learn features suited to the current environment, it generalizes poorly and must be retrained or fine-tuned in new environments. In this paper, the feature description of images used in classical geometric methods is explored, and a new deep learning SLAM architecture based on Zernike moments is proposed. The network can be trained in an end-to-end manner; that is, it directly infers the camera pose and pixel depth from a group of raw images or videos to perform localization and mapping. In addition, a rotation-invariant layer based on Zernike moments is constructed in the deep learning network. It reduces high-frequency noise in the image and improves the robustness of the system when handling the rotational motion of a monocular camera. The network is trained on the TartanAir dataset and tested on the EuRoC dataset. Experimental results demonstrate that the proposed system achieves excellent performance and better generalization ability in new environments. The new framework verifies that it is possible to combine deep learning-based SLAM with classical geometric methods.
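The abstract does not give the implementation of the proposed rotation-invariant layer, but the property it relies on can be illustrated: the magnitude |Z_nm| of a Zernike moment is unchanged when the image patch is rotated, because rotation only shifts the phase of the complex moment. Below is a minimal NumPy sketch of this property (the function names are illustrative, not from the paper); a 90-degree rotation is used because it maps the sampling grid exactly onto itself, so the invariance holds up to floating-point error.

```python
import numpy as np
from math import factorial

def radial_poly(rho, n, m):
    """Zernike radial polynomial R_nm(rho); requires n - |m| even and >= 0."""
    m = abs(m)
    R = np.zeros_like(rho)
    for k in range((n - m) // 2 + 1):
        c = ((-1) ** k * factorial(n - k)
             / (factorial(k)
                * factorial((n + m) // 2 - k)
                * factorial((n - m) // 2 - k)))
        R += c * rho ** (n - 2 * k)
    return R

def zernike_magnitude(img, n, m):
    """Magnitude of the Zernike moment Z_nm of a square patch mapped to the unit disk."""
    N = img.shape[0]
    xs = np.linspace(-1.0, 1.0, N)
    X, Y = np.meshgrid(xs, xs)
    rho = np.sqrt(X ** 2 + Y ** 2)
    theta = np.arctan2(Y, X)
    mask = rho <= 1.0                       # restrict to the unit disk
    V = radial_poly(rho, n, m) * np.exp(-1j * m * theta)   # conjugate basis V*_nm
    Z = (n + 1) / np.pi * np.sum(img[mask] * V[mask])
    return np.abs(Z)                        # |Z_nm| is rotation invariant

rng = np.random.default_rng(0)
patch = rng.random((32, 32))
z0 = zernike_magnitude(patch, 4, 2)
z90 = zernike_magnitude(np.rot90(patch), 4, 2)   # exact 90-degree rotation
print(abs(z0 - z90))                             # near machine precision
```

Inside a network layer, such magnitudes would be computed over local patches (or via fixed complex-valued convolution kernels) so that the resulting feature map is insensitive to in-plane camera rotation, which is consistent with the robustness claim in the abstract.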