Action recognition of earth-moving machinery plays an important role in industrial automation and unmanned monitoring. However, current action recognition methods lose accuracy against complex backgrounds, especially when many irrelevant targets move in the background. To address this problem, an action video database of earth-moving machinery containing 476 videos is built. Furthermore, an attention-based pseudo-3D convolutional residual network is proposed. Specifically, the network first performs a two-dimensional spatial convolution on the input video, then performs a one-dimensional convolution along the time axis, and finally integrates a channel attention mechanism. The experimental results show that the proposed network outperforms state-of-the-art action recognition algorithms.
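The order of operations described above (spatial 2D convolution, then temporal 1D convolution, then channel attention) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. Several details are assumptions made only for illustration: the channel attention is taken to be a squeeze-and-excitation style gate (global average, small bottleneck, sigmoid), each channel is convolved independently with "valid" padding, the residual connection is omitted, and all function names, kernels, and weights (`spatial_conv2d`, `temporal_conv1d`, `channel_attention`, `p3d_attention_block`, `w1`, `w2`) are hypothetical.

```python
import math

def spatial_conv2d(clip, kernel):
    """Apply one 2D kernel to every frame of a single-channel clip [T][H][W]."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for frame in clip:
        h, w = len(frame), len(frame[0])
        out.append([
            [sum(kernel[i][j] * frame[y + i][x + j]
                 for i in range(kh) for j in range(kw))
             for x in range(w - kw + 1)]
            for y in range(h - kh + 1)
        ])
    return out

def temporal_conv1d(clip, kernel):
    """Apply one 1D kernel along the time axis at every pixel of [T][H][W]."""
    kt = len(kernel)
    t, h, w = len(clip), len(clip[0]), len(clip[0][0])
    return [
        [[sum(kernel[k] * clip[s + k][y][x] for k in range(kt))
          for x in range(w)]
         for y in range(h)]
        for s in range(t - kt + 1)
    ]

def channel_attention(features, w1, w2):
    """Assumed squeeze-and-excitation style gate over features [C][T][H][W].

    w1 has shape [R][C] (bottleneck), w2 has shape [C][R] (expansion).
    """
    # Squeeze: one global average per channel.
    squeezed = []
    for channel in features:
        values = [v for frame in channel for row in frame for v in row]
        squeezed.append(sum(values) / len(values))
    # Excite: ReLU bottleneck followed by a sigmoid gate per channel.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
             for row in w2]
    # Rescale every value of a channel by that channel's gate.
    scaled = [
        [[[g * v for v in row] for row in frame] for frame in channel]
        for channel, g in zip(features, gates)
    ]
    return scaled, gates

def p3d_attention_block(video, spatial_kernel, temporal_kernel, w1, w2):
    """Spatial 2D conv, then temporal 1D conv, then channel attention."""
    features = [
        temporal_conv1d(spatial_conv2d(channel, spatial_kernel), temporal_kernel)
        for channel in video
    ]
    return channel_attention(features, w1, w2)

def constant_clip(value, t=4, h=4, w=4):
    """Toy single-channel clip filled with one value (illustrative input only)."""
    return [[[value] * w for _ in range(h)] for _ in range(t)]

# Toy two-channel video with averaging kernels and hand-picked gate weights.
video = [constant_clip(1.0), constant_clip(3.0)]
mean3x3 = [[1.0 / 9.0] * 3 for _ in range(3)]
mean3 = [1.0 / 3.0] * 3
out, gates = p3d_attention_block(video, mean3x3, mean3,
                                 w1=[[1.0, 1.0]], w2=[[1.0], [-1.0]])
```

Factorizing a 3x3x3 kernel into a 1x3x3 spatial kernel and a 3x1x1 temporal kernel is what makes the convolution "pseudo" 3D. The gate then reweights whole channels, which is one plausible way for a network to suppress feature maps dominated by irrelevant background motion.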