Understanding how well each student concentrates in the classroom requires a model for detecting student concentration. This paper establishes a concentration analysis model that quantifies a concentration index for middle school students by analyzing their facial expressions, eye attention, and head posture in surveillance videos. The model combines a convolutional neural network with a Transformer self-attention mechanism. Experimental results show that the model's loss value is close to 0, indicating good predictive performance.
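To illustrate how self-attention could fuse the three cues into a single index, the following is a minimal sketch and not the paper's implementation. It assumes each cue (facial expression, eye attention, head posture) has already been encoded as a fixed-length feature vector, for example by a convolutional network, and it uses identity query/key/value projections, mean pooling, and a sigmoid readout. The function names, the `readout` weights, and the feature values are all hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention.

    Assumption: queries, keys, and values are the tokens themselves
    (identity projections); a trained model would learn these projections.
    """
    d = len(tokens[0])
    attended = []
    for q in tokens:
        # Similarity of this token to every token, scaled by sqrt(d).
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        attended.append(
            [sum(w * v[i] for w, v in zip(weights, tokens)) for i in range(d)]
        )
    return attended


def concentration_index(expression, gaze, head_pose, readout):
    """Fuse three cue vectors into a score in (0, 1).

    `readout` is a hypothetical weight vector standing in for a trained
    output layer.
    """
    attended = self_attention([expression, gaze, head_pose])
    d = len(readout)
    # Mean-pool the attended tokens into one vector.
    pooled = [sum(t[i] for t in attended) / len(attended) for i in range(d)]
    logit = sum(w * x for w, x in zip(readout, pooled))
    return 1.0 / (1.0 + math.exp(-logit))


if __name__ == "__main__":
    # Hypothetical 4-dimensional cue features for one student in one frame.
    expression = [0.2, 0.7, 0.1, 0.4]
    gaze = [0.9, 0.1, 0.3, 0.5]
    head_pose = [0.6, 0.2, 0.8, 0.1]
    readout = [0.5, -0.3, 0.4, 0.2]
    print(concentration_index(expression, gaze, head_pose, readout))
```

In this sketch the attention weights let each cue borrow information from the others before pooling, which is the role the self-attention mechanism would plausibly play when combining per-frame features.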