Recently, generative adversarial networks (GANs) have been applied to multimodal template matching. Existing GAN-based multimodal template matching methods exploit image generation to transform the multimodal matching task into a unimodal one. However, such image synthesis-based methods rely on the quality of the generated images, which is unstable. To address this, this paper proposes a feature adversarial network, which maps images of different modalities into a common subspace and learns their correlation in that subspace. Specifically, a feature mapper is designed to map the multimodal features into intermediate features, and a modality discriminator is proposed to optimize the multimodal intermediate features until they are indistinguishable. An effective common feature subspace is thereby obtained for correlation learning. Experimental results on a public dataset demonstrate the superiority of the proposed method.
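The core idea, a mapper per modality trained adversarially against a modality discriminator until the mapped features become indistinguishable, can be sketched in a few lines. The code below is a minimal toy illustration, not the paper's actual architecture: the mappers are plain linear projections, the discriminator is logistic regression, and all dimensions, learning rates, and data are hypothetical placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical toy features for two modalities with different dimensions
# (e.g., visible vs. infrared descriptors); not real data.
X_a = rng.normal(size=(64, 16))   # modality A features
X_b = rng.normal(size=(64, 24))   # modality B features
D_SHARED = 8                      # assumed size of the common subspace

# Feature mappers: one linear projection per modality into the subspace.
W_a = rng.normal(scale=0.1, size=(16, D_SHARED))
W_b = rng.normal(scale=0.1, size=(24, D_SHARED))

# Modality discriminator: logistic regression on the shared features
# (label 0 = modality A, label 1 = modality B).
w_d = np.zeros(D_SHARED)
b_d = 0.0

lr = 0.05
n_a = len(X_a)
for step in range(200):
    Z = np.vstack([X_a @ W_a, X_b @ W_b])       # map into common subspace
    y = np.concatenate([np.zeros(n_a), np.ones(len(X_b))])

    # 1) Discriminator step: descend cross-entropy, i.e. try to tell
    #    the two modalities apart from their shared-space features.
    p = sigmoid(Z @ w_d + b_d)
    err = p - y                                  # dLoss/dlogit
    w_d -= lr * Z.T @ err / len(y)
    b_d -= lr * err.mean()

    # 2) Mapper step: gradient *ascent* on the same loss, pushing the
    #    mapped features toward being indistinguishable by modality.
    p = sigmoid(Z @ w_d + b_d)
    err = p - y
    g_z = np.outer(err, w_d)                     # dLoss/dZ
    W_a += lr * X_a.T @ g_z[:n_a] / n_a
    W_b += lr * X_b.T @ g_z[n_a:] / len(X_b)

# If alignment works, the discriminator's outputs hover near chance (0.5)
# on the mapped features, so the subspace carries little modality signal.
p = sigmoid(np.vstack([X_a @ W_a, X_b @ W_b]) @ w_d + b_d)
print(round(float(p.mean()), 2))
```

In this sketch the two-player game is written as explicit alternating SGD; in practice the same objective is usually implemented with automatic differentiation and either a label-flip trick or a gradient-reversal layer between the mappers and the discriminator.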