Font design is an important research direction in art design and has high commercial value, but it traditionally relies on professional designers, making it time-consuming, costly, and inefficient. Font-to-font translation is a widely used font design method and is essentially an image synthesis problem. Generative adversarial networks (GANs) have been applied to image synthesis with some success; however, for font-to-font translation, existing GAN-based methods often produce low-quality results, such as incomplete glyphs and distorted stroke details. To address these problems, we propose MSM-CycleGAN, a more effective multi-scale CycleGAN for font-to-font translation that generates font images with better visual quality. In MSM-CycleGAN, a U-net with multiple outputs (UM) serves as the generator: UM produces generated images at multiple scales, which are then fed into a multi-scale discriminator. The model is trained in an unsupervised manner. This multi-scale discrimination effectively improves the detail of the generated images. Experimental results show that our method outperforms other state-of-the-art image synthesis methods and produces font images with higher visual quality.
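The multi-scale generator-to-discriminator wiring described above can be sketched as follows. This is a minimal illustrative toy, not the paper's implementation: the coarser generator outputs are approximated by average-pooled copies of the full-resolution image (in the UM generator they would come from intermediate decoder layers), the discriminator is a stand-in that scores patch means, and the LSGAN-style loss is an assumed choice.

```python
import random

def avg_pool2(img):
    """Downsample an HxW image (list of lists) by 2 via 2x2 average pooling."""
    h, w = len(img), len(img[0])
    return [[(img[2 * i][2 * j] + img[2 * i][2 * j + 1]
              + img[2 * i + 1][2 * j] + img[2 * i + 1][2 * j + 1]) / 4.0
             for j in range(w // 2)]
            for i in range(h // 2)]

def multi_scale_outputs(full_res, n_scales=3):
    """Mimic the UM generator's multiple outputs: one image per scale.
    Here coarser scales are pooled copies of the finest output (assumption);
    in the paper they are emitted by the U-net decoder itself."""
    outs = [full_res]
    for _ in range(n_scales - 1):
        outs.append(avg_pool2(outs[-1]))
    return outs  # finest scale first

def patch_scores(img, patch=8):
    """Toy patch discriminator: one 'realness' score (here just the patch
    mean) per non-overlapping patch x patch region."""
    h, w = len(img), len(img[0])
    return [[sum(img[patch * i + a][patch * j + b]
                 for a in range(patch) for b in range(patch)) / patch ** 2
             for j in range(w // patch)]
            for i in range(h // patch)]

random.seed(0)
fake = [[random.random() for _ in range(64)] for _ in range(64)]
scales = multi_scale_outputs(fake, n_scales=3)       # 64x64, 32x32, 16x16
scores = [patch_scores(s, patch=8) for s in scales]  # 8x8, 4x4, 2x2 score maps
# Generator adversarial loss summed over all scales (LSGAN-style, assumed):
# every patch of every scale is pushed toward the "real" label 1.0.
adv_loss = sum((v - 1.0) ** 2
               for score_map in scores for row in score_map for v in row)
print([(len(s), len(s[0])) for s in scales])
```

The key point the sketch captures is that each scale gets its own score map, so the total adversarial loss penalizes artifacts at coarse and fine resolutions simultaneously, which is what drives the improvement in generated font detail.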