Images captured under poor illumination suffer from low visibility and degraded information. Most low-light image enhancement (LLIE) methods are based on convolutional neural networks (CNNs), which limits their ability to model long-range contextual interactions. In this paper, we combine the strengths of the Swin Transformer and ResNet to develop a Swin Transformer-based unsupervised Generative Adversarial Network (STUN) for the LLIE task. STUN consists of one generator and one discriminator. The generator comprises modules for feature extraction, deep feature processing, and image reconstruction. In particular, we alternate Swin Transformer blocks and ResNet blocks in the deep feature processing module to compute global and local attention, respectively. A self-feature preserving loss and a spatial consistency loss are employed to constrain the unsupervised learning of STUN. Experimental results on several low-light datasets indicate that STUN achieves strong performance in both visual quality and quantitative evaluation metrics.