This work presents X-Bridge, a novel method for image-to-image translation based on a conditional adversarial network. X-Bridge is a supervised method built upon the Pix2pix approach; it extends the original system with an additional reconstruction path and a shared-latent-space assumption between the translation and reconstruction paths. With these modifications, we argue that the qualitative results produced by X-Bridge surpass other state-of-the-art methods in terms of similarity between translated images and their corresponding targets, robustness, generalization capacity, and preservation of translated features. This claim is supported by the quantitative results provided. We demonstrate the power of this approach on the challenging facial image-to-sketch translation task. Code is available at: https://github.com/YvanG/Cross-modal-Bridge.
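
The two-path design described above can be sketched with a toy NumPy model. Everything here is illustrative (dimensions, dense weights, loss weighting are hypothetical; the real X-Bridge uses convolutional generators and a conditional adversarial term, which is omitted): the point is that the translation path and the added reconstruction path share one encoder, so both decode from the same latent code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy dimensions, not from the paper.
D_IN, D_LAT, D_OUT = 8, 4, 8

# Shared encoder: both paths map inputs through the SAME latent space.
W_enc = rng.normal(scale=0.1, size=(D_IN, D_LAT))
W_trans = rng.normal(scale=0.1, size=(D_LAT, D_OUT))  # latent -> sketch decoder
W_recon = rng.normal(scale=0.1, size=(D_LAT, D_IN))   # latent -> photo decoder

def encode(x):
    return np.tanh(x @ W_enc)            # shared latent code z

def translate(x):
    return np.tanh(encode(x) @ W_trans)  # translation path (as in Pix2pix)

def reconstruct(x):
    return np.tanh(encode(x) @ W_recon)  # additional reconstruction path

x = rng.normal(size=(2, D_IN))   # toy batch of "photos"
y = rng.normal(size=(2, D_OUT))  # corresponding "sketches"

l1_translation = np.abs(translate(x) - y).mean()
l1_reconstruction = np.abs(reconstruct(x) - x).mean()

# A full generator objective would add the conditional adversarial
# loss from Pix2pix; only the two L1 terms are shown here.
total = l1_translation + l1_reconstruction
print(float(total))
```

Because the encoder weights are shared, gradients from the reconstruction loss also shape the latent space used for translation, which is the intuition behind the shared-latent-space assumption.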