Infrared and visible image fusion (IVIF) aims to produce a fused image that combines the complementary multimodal information of the source images. To fuse this complementary information effectively, a dual-encoder network based on multi-layer feature fusion is proposed for IVIF, which fuses the features of the source images at different levels. Specifically, a deep semantic information fusion module (DSIFM) is constructed to merge the deep-level features of the network at different scales. Meanwhile, considering the differences between infrared and visible features, a shallow-middle information fusion module (SMIFM) is built to integrate the shallow and middle features obtained by the two encoders with the deep features delivered by the network. Furthermore, a joint loss function comprising a sensitivity loss and a structural similarity loss is defined to preserve the salient targets and texture features of the source images in the fusion result. Qualitative and quantitative experimental results confirm the superiority of the proposed method over state-of-the-art approaches. In particular, our method improves the MI and SSIM metrics by 60.78% and 17.45%, respectively, compared with the deep learning-based approach achieving the best average values on the TNO dataset. Our code is available at https://github.com/yotick/IJMLC.
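To make the joint loss concrete, the following is a minimal NumPy sketch of one plausible form. The abstract does not give the exact definitions, so both terms here are assumptions: the sensitivity term is modeled as pixel-wise fidelity to the element-wise maximum of the source images (a common proxy for salient infrared targets), and the structural term uses a simplified single-window SSIM rather than the usual windowed variant.

```python
import numpy as np

def global_ssim(x, y, c1=1e-4, c2=9e-4):
    # Simplified SSIM computed over the whole image as a single window.
    # Illustrative stand-in: the paper's structural similarity loss is
    # typically windowed, but the abstract does not specify its form.
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

def joint_loss(fused, ir, vis, alpha=1.0, beta=1.0):
    # Hypothetical sensitivity term: keep the fused intensities close to
    # the pixel-wise maximum of the sources (salient-target proxy).
    target = np.maximum(ir, vis)
    sensitivity = np.mean((fused - target) ** 2)
    # Structural term: penalise loss of structure w.r.t. both sources.
    ssim_term = 1.0 - 0.5 * (global_ssim(fused, ir) + global_ssim(fused, vis))
    return alpha * sensitivity + beta * ssim_term
```

The weights `alpha` and `beta` trade off salient-target preservation against texture fidelity; in practice both terms would be computed on network outputs within a training loop.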