Defect detection in printed circuit boards requires both high accuracy and real-time performance. Existing industrial detection models generally adopt a purely convolutional structure for ease of deployment, but their accuracy is often insufficient for this setting. To improve detection accuracy while preserving ease of deployment, this paper proposes a convolutional merging Transformer network (CMTRNet). CMTRNet introduces a backbone network, CNN-Former, that replaces self-attention with convolutional modules, combining the Transformer architecture with a convolutional structure. This design avoids the high computational complexity of self-attention, which hinders deployment, and also improves detection accuracy. Building on CNN-Former, this paper also proposes a feature fusion module that better fuses the features CNN-Former extracts. Furthermore, based on the CMTRNet model and the characteristics of circuit board defects, this paper proposes a loss function, Melt-IoU, which smooths the initial training phase and further improves detection accuracy. Experiments show that CMTRNet outperforms existing advanced models on both datasets used for evaluation.
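As a point of reference for the IoU-based loss mentioned above, the following is a minimal sketch of the standard intersection-over-union (IoU) between two axis-aligned boxes and the plain `1 - IoU` loss that IoU-style losses typically extend. It is not the Melt-IoU formulation, which is not specified in this abstract; the function names `box_iou` and `iou_loss` and the `(x1, y1, x2, y2)` box format are assumptions made for illustration.

```python
def box_iou(a, b):
    """Standard IoU of two axis-aligned boxes given as (x1, y1, x2, y2).

    Assumed box format for illustration; this is plain IoU, not Melt-IoU.
    """
    # Corners of the intersection rectangle.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])

    # Width and height are clamped at zero when the boxes do not overlap.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter

    # Guard against degenerate (zero-area) boxes.
    return inter / union if union > 0 else 0.0


def iou_loss(pred, target):
    """Plain IoU loss: 0 for a perfect match, 1 for no overlap."""
    return 1.0 - box_iou(pred, target)
```

With plain IoU loss, any two non-overlapping boxes give the same loss of 1 regardless of how far apart they are, which is one commonly cited reason that IoU-style losses add further terms.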