Accurate human body orientation estimation (HBOE) can significantly benefit the analysis of human behavior. However, conventional methods do not holistically exploit the complementary nature of spatial and temporal information for HBOE. In contrast to existing methods, we propose an end-to-end spatial-temporal deep learning framework to accurately estimate human body orientation. In this framework, we first use a convolutional neural network (CNN) to capture the spatial information relevant to human orientation. The spatial and temporal information is then fused in recurrent neural networks (RNNs), which can automatically memorize long-term temporal information about how a person's orientation changes. More importantly, to adapt effectively to people's different moving speeds and diverse actions, we design a weighted sequence loss function that captures significant orientation changes to guide RNN training. In comprehensive evaluations, the proposed method greatly outperforms state-of-the-art methods. Although it uses only 2D information, it performs better than 3D/RGB-D based approaches.
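To make the idea of a weighted sequence loss concrete, the following is a minimal illustrative sketch, not the formulation used in this work. It assumes that orientation is discretized into `num_bins` classes, that the network outputs a per-frame probability vector, and that a frame's weight grows with the circular orientation change from the previous frame. The function names, the parameter `alpha`, and the weighting rule itself are all hypothetical.

```python
import math


def circular_diff(a, b, num_bins):
    """Smallest circular distance between two orientation bins."""
    d = abs(a - b) % num_bins
    return min(d, num_bins - d)


def sequence_weights(labels, num_bins, alpha=1.0):
    """Hypothetical per-frame weights (an assumption, not the paper's rule).

    The first frame gets weight 1. Every later frame gets
    1 + alpha * (orientation change from the previous frame),
    with the change normalized by the largest possible circular distance.
    """
    half = num_bins / 2
    weights = [1.0]
    for prev, cur in zip(labels, labels[1:]):
        weights.append(1.0 + alpha * circular_diff(prev, cur, num_bins) / half)
    return weights


def weighted_sequence_loss(probs, labels, num_bins, alpha=1.0):
    """Weighted mean of the per-frame cross-entropy over one sequence.

    probs  : list of per-frame probability vectors (length num_bins each)
    labels : list of ground-truth orientation bins, one per frame
    """
    weights = sequence_weights(labels, num_bins, alpha)
    total = 0.0
    for p, y, w in zip(probs, labels, weights):
        # Clamp the probability to avoid log(0).
        total += w * -math.log(max(p[y], 1e-12))
    return total / sum(weights)
```

Under these assumptions, frames at which the person turns sharply contribute more to the loss than frames in which the orientation stays constant, which is one plausible way to emphasize significant orientation changes during RNN training.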