We propose an unsupervised knowledge distillation framework, based on teacher-student networks, for anomaly detection and classification in secure document images. The teacher network uses Wide-ResNet-50 as its backbone and is pre-trained on ImageNet and on large datasets of secure document images. A multi-scale feature pyramid matching strategy supplies the student network with feature maps at several scales, so that it can better detect anomalies of various sizes. We also introduce an attention mechanism that transfers the teacher's intermediate-layer attention map to the student as knowledge, encouraging the student to focus on the same regions as the teacher. This helps the student distinguish normal regions from abnormal ones and improves anomaly detection accuracy. We apply our method to secure documents, where it achieves accurate and fast anomaly detection.
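
The two ingredients described above, attention transfer and multi-scale feature matching, can be illustrated with a minimal NumPy sketch. It is not the implementation used in this work. It assumes that an attention map is the channel-wise sum of squared activations, that the transfer loss is the squared distance between L2-normalised attention maps, and that the anomaly score is the per-pixel discrepancy between normalised teacher and student features averaged over scales. The function names `attention_map`, `attention_transfer_loss`, and `anomaly_map` are hypothetical, and the feature maps are taken to be square arrays of shape (C, H, H) already extracted from the two networks.

```python
import numpy as np


def attention_map(features: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """Collapse a (C, H, W) feature map into an L2-normalised (H, W) attention map.

    Assumption: attention is the channel-wise sum of squared activations.
    """
    attn = np.sum(features ** 2, axis=0)
    return attn / (np.linalg.norm(attn) + eps)


def attention_transfer_loss(teacher_feats, student_feats) -> float:
    """Sum, over scales, of the squared distance between attention maps."""
    return float(sum(
        np.sum((attention_map(t) - attention_map(s)) ** 2)
        for t, s in zip(teacher_feats, student_feats)
    ))


def anomaly_map(teacher_feats, student_feats, out_size: int,
                eps: float = 1e-12) -> np.ndarray:
    """Per-pixel teacher-student discrepancy, averaged over scales.

    Assumption: each feature map is square and out_size is a multiple of
    its side, so nearest-neighbour upsampling is a simple block repeat.
    """
    total = np.zeros((out_size, out_size))
    for t, s in zip(teacher_feats, student_feats):
        # Normalise each pixel's feature vector along the channel axis.
        t_n = t / (np.linalg.norm(t, axis=0, keepdims=True) + eps)
        s_n = s / (np.linalg.norm(s, axis=0, keepdims=True) + eps)
        diff = 0.5 * np.sum((t_n - s_n) ** 2, axis=0)  # shape (H, H)
        # Nearest-neighbour upsampling to the output resolution.
        factor = out_size // diff.shape[0]
        total += np.kron(diff, np.ones((factor, factor)))
    return total / len(teacher_feats)
```

Under these assumptions, each argument is a list with one feature map per pyramid level, for example arrays of side 8, 4, and 2 with `out_size=16`. Pixels where the student fails to reproduce the teacher's features receive a high score in the combined map, and averaging over scales lets both small and large discrepancies contribute.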