To address dynamic-object interference and occlusion in complex scenes in monocular visual localization systems, a feature-enhanced visual localization algorithm based on an attention mechanism is proposed. Image depth and pose information, predicted by a depth estimation network and a pose estimation network respectively, are fused, and a view reconstruction loss is used to train the two branches in a tightly coupled manner. To further improve the feature representation ability of the network, an occlusion mask module and an attention module are integrated: the former constructs a binary mask from the depth information of adjacent images and uses it to optimize the photometric loss, while the latter uses channel attention and spatial attention to increase the saliency of key features. Experimental results show that the algorithm effectively improves the feature representation ability of the network, alleviates the accumulation of scale error when tracking long sequences, and achieves higher localization accuracy.
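As an illustration of how a depth-based binary mask can gate a photometric loss, the following is a minimal sketch and not the formulation used in this work. It assumes grayscale images, an L1 photometric error, and a simple depth-consistency test with a relative tolerance; the names `occlusion_mask`, `masked_photometric_loss`, and `rel_tol` are hypothetical, and the warped image and depth maps are assumed to have been produced beforehand from the predicted depth and pose.

```python
import numpy as np


def occlusion_mask(projected_depth, sampled_depth, rel_tol=0.05):
    """Binary visibility mask (1 = visible, 0 = occluded). Hypothetical helper.

    projected_depth: depth of each target pixel after it is transformed into
        the adjacent (source) camera frame.
    sampled_depth: depth predicted for the source view, sampled at the
        location each target pixel projects to.
    A pixel is treated as occluded when it lies noticeably behind the surface
    that the source view observes at that location (assumed criterion).
    """
    projected_depth = np.asarray(projected_depth, dtype=np.float64)
    sampled_depth = np.asarray(sampled_depth, dtype=np.float64)
    visible = projected_depth <= sampled_depth * (1.0 + rel_tol)
    return visible.astype(np.float64)


def masked_photometric_loss(target, warped, mask, eps=1e-7):
    """Mean absolute intensity difference computed over visible pixels only."""
    target = np.asarray(target, dtype=np.float64)
    warped = np.asarray(warped, dtype=np.float64)
    abs_diff = np.abs(target - warped)
    return float((mask * abs_diff).sum() / (mask.sum() + eps))


# Toy 2x2 example: the bottom-right pixel projects behind the source surface.
target = np.array([[0.2, 0.4], [0.6, 0.8]])
warped = np.array([[0.2, 0.5], [0.6, 0.1]])
proj_d = np.array([[2.0, 3.0], [4.0, 9.0]])
samp_d = np.array([[2.0, 3.0], [4.0, 5.0]])

mask = occlusion_mask(proj_d, samp_d)
loss = masked_photometric_loss(target, warped, mask)
```

In this toy case the large error at the occluded pixel is excluded, so only the three visible pixels contribute to the loss. Normalizing by the number of visible pixels keeps the loss scale comparable between images with different amounts of occlusion.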