Gaze tracking is an assistive technology for human-computer interaction. To address the high misjudgment rate and long processing time of traditional iris localization methods, this paper proposes a gaze tracking method based on the geometric characteristics of the human eye to improve tracking accuracy in a 2D environment. First, the face is located with a face localization algorithm and the eye positions are roughly estimated. An iris template is then built from iris images, and an iris center localization algorithm is used to locate the iris center. Finally, the eye corners and iris center points are extracted to locate the eye region accurately and obtain the binocular images. The binocular images are fed in parallel into the feature extraction network as multi-modal information, and the convolutional feature channels are reconstructed by the weight redistribution module in the network. The reconstructed features are then fused in the fully connected layer, and the output layer performs the classification. Experiments were carried out on a self-built screen-block dataset. On the 12-class data, the lowest recognition error rate is 5.34%.
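
The abstract does not specify the internals of the weight redistribution module. As a minimal sketch only, the following assumes a squeeze-and-excitation style gate: each feature channel is averaged to one value, passed through a small bottleneck, and the resulting per-channel weight in (0, 1) rescales that channel. The function name `channel_reweight`, the matrices `w1` and `w2`, and the toy inputs are hypothetical and chosen for illustration; in a trained network the weights would be learned.

```python
import math


def channel_reweight(feature_maps, w1, w2):
    """Reweight C feature channels with an assumed squeeze-and-excitation gate.

    feature_maps: list of C channels, each an H x W list of lists.
    w1: R x C bottleneck weights (assumed shape, illustrative values).
    w2: C x R expansion weights (assumed shape, illustrative values).
    Returns (reweighted feature maps, per-channel gate values).
    """
    # Squeeze: global average pooling reduces each channel to one number.
    squeezed = [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        for ch in feature_maps
    ]
    # Excitation, step 1: linear bottleneck followed by ReLU.
    hidden = [
        max(0.0, sum(w * s for w, s in zip(row, squeezed)))
        for row in w1
    ]
    # Excitation, step 2: linear expansion followed by a sigmoid gate.
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every value of a channel by that channel's gate.
    reweighted = [
        [[v * g for v in row] for row in ch]
        for ch, g in zip(feature_maps, gates)
    ]
    return reweighted, gates


# Toy input: two 2 x 2 channels, one active and one empty.
features = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[0.0, 0.0], [0.0, 0.0]],
]
w1 = [[1.0, -1.0]]        # bottleneck of size R = 1 (arbitrary values)
w2 = [[2.0], [-2.0]]      # expands back to C = 2 channels (arbitrary values)

reweighted, gates = channel_reweight(features, w1, w2)
```

With these arbitrary weights the active channel receives a larger gate than the empty one, which is the behaviour a weight redistribution step is meant to produce: informative channels are emphasized before the features from the two eye images are fused.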