Small targets in complex environments have low resolution and are difficult to detect. To address this problem, we propose Swin-DETR, a network model based on an improved DETR. The Swin-T architecture is introduced into the DETR backbone network to strengthen the model's deep extraction of local image features. In addition, a CBAM module is integrated into the backbone network; its fused attention mechanism, which combines channel and spatial attention, further improves the backbone's recognition accuracy for small targets. Compared with the original DETR network, the proposed model therefore improves both the recognition accuracy for small targets and the overall accuracy of the model in object detection tasks. Comparison experiments against current mainstream object detection models, on a detection task with many small targets in complex environments, show that the algorithm effectively improves the model's detection and recognition accuracy for small targets.
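As an illustration of the attention mechanism referred to above, the following is a minimal NumPy sketch of a CBAM-style block: channel attention followed by spatial attention, applied to a single feature map. It is a sketch under stated assumptions, not the implementation used in Swin-DETR. The function names, the random (untrained) weights, and the feature-map size are hypothetical. The reduction ratio of 16 and the 7×7 spatial kernel follow the defaults commonly used for CBAM. Where the block is placed inside the Swin-T backbone is not specified here and is not modelled.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(feat, w1, w2):
    """feat: (C, H, W); w1: (C // r, C); w2: (C, C // r).

    Returns one weight in (0, 1) per channel.
    """
    avg = feat.mean(axis=(1, 2))  # global average pooling, shape (C,)
    mx = feat.max(axis=(1, 2))    # global max pooling, shape (C,)

    def shared_mlp(v):
        # The same two-layer MLP (ReLU in between) is applied to both vectors.
        return w2 @ np.maximum(w1 @ v, 0.0)

    return sigmoid(shared_mlp(avg) + shared_mlp(mx))


def spatial_attention(feat, kernel):
    """feat: (C, H, W); kernel: (2, k, k) with k odd.

    Returns one weight in (0, 1) per spatial position, shape (H, W).
    """
    # Pool across channels, then stack the two maps: shape (2, H, W).
    pooled = np.stack([feat.mean(axis=0), feat.max(axis=0)])
    k = kernel.shape[-1]
    pad = k // 2
    padded = np.pad(pooled, ((0, 0), (pad, pad), (pad, pad)))  # zero padding
    h, w = feat.shape[1:]
    out = np.zeros((h, w))
    # Naive k x k convolution (cross-correlation), written out for clarity.
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(padded[:, i:i + k, j:j + k] * kernel)
    return sigmoid(out)


def cbam(feat, w1, w2, kernel):
    """Apply channel attention, then spatial attention, to feat (C, H, W)."""
    refined = feat * channel_attention(feat, w1, w2)[:, None, None]
    return refined * spatial_attention(refined, kernel)[None, :, :]


# Hypothetical sizes and random weights, for illustration only.
rng = np.random.default_rng(0)
C, H, W, r, k = 32, 8, 8, 16, 7
feat = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1
kernel = rng.standard_normal((2, k, k)) * 0.1

out = cbam(feat, w1, w2, kernel)
print(out.shape)
```

The output has the same shape as the input, so a block of this kind can in principle be placed between backbone stages without changing the tensor sizes that later layers expect. Because both attention maps lie in (0, 1), the block only rescales features; with trained weights, the intent is that it suppresses uninformative channels and positions rather than adding new information.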