Short-term wind field forecasting, particularly wind speed prediction, is a critical component of wind farm control. Traditional statistical techniques, such as the autoregressive integrated moving average (ARIMA) model, and machine learning approaches, such as recurrent neural networks, have been widely used for short-term forecasting, but these methods are often difficult to interpret. In this study, we propose a transformer model that captures the dependencies among previous wind measurements to predict wind speeds in the upcoming hours. Measurement data collected at the FINO1 research platform in 2007 and 2008, comprising wind speed, wind direction, and air temperature at three altitudes (40 m, 60 m, and 80 m), were used to train and evaluate the model. The attention mechanism of the transformer lets us interpret how the previous wind profile influences future predictions, which makes the deep learning forecast explainable. Our experimental results on the FINO1 measurement data demonstrate exceptional performance of the proposed model in short-term wind forecasting.
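
To make the interpretability argument concrete, the following is a minimal sketch of single-head scaled dot-product attention over a short window of past measurements. It is not the model proposed in this study: the function names (`softmax`, `attend`), the feature layout, and the numbers in `history` are illustrative assumptions rather than FINO1 data, and the learned query, key, and value projections of a real transformer are omitted, so the raw feature vectors stand in for all three. The sketch only shows that attention yields one non-negative weight per past time step, and those weights are what one would inspect to see which earlier measurements influence a prediction.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Single-head scaled dot-product attention for one query vector.

    Returns the attention-weighted context vector and the weights,
    one weight per past time step.
    """
    d_k = len(query)
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(d_k)
        for key in keys
    ]
    weights = softmax(scores)
    dim = len(values[0])
    context = [
        sum(w * v[j] for w, v in zip(weights, values))
        for j in range(dim)
    ]
    return context, weights


# Hypothetical, already normalized window of four past time steps.
# Assumed feature layout: [wind speed, sin(direction), cos(direction), temperature].
history = [
    [0.42, 0.10, 0.99, 0.30],
    [0.47, 0.17, 0.98, 0.31],
    [0.55, 0.26, 0.97, 0.31],
    [0.61, 0.34, 0.94, 0.32],
]

# Assumption: the most recent step acts as the query; a trained model
# would apply learned projections to obtain queries, keys, and values.
query = history[-1]
context, weights = attend(query, history, history)

for step, w in enumerate(weights):
    print(f"time step {step}: attention weight {w:.3f}")
```

In a trained model, weights of this kind would be read off each attention head and layer, for example as a heat map over past time steps and measurement heights.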