Visible cameras frequently fail to satisfy the visual requirements of observers in low-light conditions, particularly at night. Leveraging the thermal imaging principle of infrared cameras, a network framework comprising encoding, fusion, and decoding stages is proposed to fuse visible and infrared images, exploiting the strengths of both modalities. The encoding network uses densely connected convolutional layers to extract global features from the visible and infrared images. The proposed fusion network, based on the L1 distance, fuses the globally extracted encoder features along three different dimensions, integrating complementary information and enhancing contrast and texture detail. Finally, the decoding network reconstructs the fused global features into the fused image. Extensive experiments on public datasets demonstrate that the proposed method achieves promising performance in both subjective and objective evaluations.
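As a rough illustration of an L1-based fusion step, the sketch below fuses two encoder feature maps using soft weights derived from per-pixel L1 activity levels. The function name, the soft weighting scheme, and the single-dimension fusion are illustrative assumptions for a common L1-norm strategy, not the paper's exact three-dimensional formulation.

```python
import numpy as np

def l1_fusion(feat_vis, feat_ir, eps=1e-8):
    """Fuse two feature maps of shape (C, H, W) by soft weights
    derived from per-pixel L1 activity levels (illustrative sketch)."""
    # Activity level: L1 norm across the channel dimension -> (H, W)
    a_vis = np.abs(feat_vis).sum(axis=0)
    a_ir = np.abs(feat_ir).sum(axis=0)
    # Soft weight for the visible branch; infrared gets the complement
    w_vis = a_vis / (a_vis + a_ir + eps)
    w_ir = 1.0 - w_vis
    # Broadcast the (H, W) weight maps over channels and blend
    return w_vis[None] * feat_vis + w_ir[None] * feat_ir

rng = np.random.default_rng(0)
fv = rng.standard_normal((4, 8, 8))   # visible-branch features
fi = rng.standard_normal((4, 8, 8))   # infrared-branch features
fused = l1_fusion(fv, fi)
print(fused.shape)  # (4, 8, 8)
```

Pixels where one modality responds more strongly (e.g. warm objects in the infrared features at night) receive a proportionally larger weight from that branch, which is how such a rule preserves complementary information.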