Deep learning has advanced rapidly in recent years, first in image recognition and now in action recognition. Research on action recognition started with 3D-CNNs, which have achieved good results on many tasks. However, most action recognition networks still have room for improvement in fine-grained action recognition, where the differences between categories are slight. For example, a basketball foul occurs in only a few frames and within a small region of the image. 3D-CNN methods are prone to errors in this setting because they tend to merge features across the whole temporal dimension. Identifying these fouls therefore requires stronger detection over short time periods. In this paper, we propose a temporal score network that can be added to existing networks, including 3D-ResNet50, 3D-Wide-ResNet50, R(2+1)D-ResNet50, and I3D-50, to improve the accuracy of fine-grained action recognition. Experimental results show that adding the proposed network improves the accuracy of these models by 3.85% to 6%. Since no relevant public dataset exists, we collected the data ourselves to create a basketball foul dataset.