Gaze estimation is a vital indicator of human behavior with diverse applications in virtual reality, human-machine interaction, and medical analysis. However, accurately predicting the direction of a person's gaze remains challenging due to factors such as variations in eye appearance, lighting conditions, and head pose. Recent advances in deep learning have improved appearance-based gaze estimation, yet there is still room for improvement. In this paper, we propose a multi-loss convolutional neural network that combines coarse and fine classification to achieve higher accuracy. Our model uses an EfficientNet-B3 backbone together with a fine-grained classifier trained with the aid of auxiliary coarse-angle units, and the final prediction is obtained via integrated regression. We evaluate our model on four datasets collected under unconstrained settings and demonstrate state-of-the-art accuracies of 3.89°, 6.93°, and 10.78° on the MPIIFaceGaze, RT-GENE, and Gaze360 datasets, respectively. The code will be released at https://github.com/hhqiang/HG-Net/ upon acceptance.
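The integrated-regression step mentioned above can be illustrated with a minimal NumPy sketch: per-bin classification logits are converted into a continuous gaze angle by taking a softmax-weighted expectation over bin centers. The bin layout, bin width, and logit shape below are illustrative assumptions, not the paper's actual configuration.

```python
import numpy as np

def integral_regression(logits, bin_centers):
    """Turn per-bin classification logits into a continuous angle
    via a softmax-weighted expectation over the bin centers."""
    z = logits - logits.max()            # subtract max for numerical stability
    probs = np.exp(z) / np.exp(z).sum()  # softmax over angle bins
    return float(np.dot(probs, bin_centers))

# Hypothetical setup: 90 yaw bins of width 4 deg covering [-180, 180).
bin_centers = np.arange(-180, 180, 4) + 2.0
# Illustrative logits peaking at a true yaw of 30 deg.
logits = -0.5 * ((bin_centers - 30.0) / 8.0) ** 2
yaw = integral_regression(logits, bin_centers)  # close to 30.0
```

Because the expectation blends neighboring bins, the predicted angle is not limited to the discrete bin resolution, which is why classification plus integral regression can outperform plain regression on noisy gaze data.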