Autonomous parking using vision-based localization in an underground parking lot is a challenging task due to repetitive structures, poor lighting conditions, texture-less scenes, and high positioning-accuracy requirements. In this paper, we propose a two-step, direct-method-based visual localization algorithm that uses ground semantic features. First, road markings, parking-slot lines, and guide signs are segmented from bird's-eye-view camera images by a deep-learning-based method. These semantic features are then used to build a global visual map of the parking lot, with poses generated by a high-precision LiDAR mapping method; loop detection and graph optimization are employed to eliminate accumulated drift. With this map, real-time localization is achieved by a two-step search algorithm based on the direct method, and global re-localization is realized by combining a bag-of-words method with direct matching. Experimental results show that with a pre-built map, the vehicle can be re-localized at arbitrary locations in the parking lot, and accurate localization is obtained in real time from visual input alone.
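The coarse-to-fine idea behind the two-step search can be illustrated with a minimal sketch. The sketch below is not the paper's implementation: it assumes the semantic map is a binary occupancy grid in map-cell units, scores a candidate 2D pose (x, y, yaw) by how many observed ground-feature points land on occupied cells, and searches a coarse pose grid around a prior before refining around the coarse winner. All function names, step sizes, and the brute-force scoring are illustrative assumptions.

```python
import numpy as np

def score(grid, pts, pose):
    """Count observed semantic points (vehicle frame) that land on
    occupied map cells after applying the candidate pose (x, y, yaw)."""
    x, y, th = pose
    c, s = np.cos(th), np.sin(th)
    px = (c * pts[:, 0] - s * pts[:, 1] + x).round().astype(int)
    py = (s * pts[:, 0] + c * pts[:, 1] + y).round().astype(int)
    ok = (px >= 0) & (px < grid.shape[0]) & (py >= 0) & (py < grid.shape[1])
    return int(grid[px[ok], py[ok]].sum())

def two_step_search(grid, pts, prior, coarse=2.0, fine=0.25, span=3):
    """Step 1: coarse grid search in (x, y, yaw) around a prior pose.
    Step 2: repeat the search with a finer step around the coarse winner."""
    best = prior
    for step in (coarse, fine):
        cx, cy, cth = best
        cands = [(cx + i * step, cy + j * step, cth + k * 0.05)
                 for i in range(-span, span + 1)
                 for j in range(-span, span + 1)
                 for k in range(-span, span + 1)]
        best = max(cands, key=lambda p: score(grid, pts, p))
    return best
```

In an actual direct-method system the discrete hit count would be replaced by a photometric (intensity) error over the semantic bird's-eye-view image and minimized with gradient-based optimization; the brute-force search here only conveys the coarse-then-fine structure of the two-step scheme.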