Most ear recognition techniques use cropped ear images as they are, including backgrounds, hair, parts of the face or neck, and even clothing. These non-ear pixels can negatively affect the classification decision. To avoid this, and to ensure that the classifier relies on ear pixels only, we propose using a tight Region-of-Interest (RoI) segmentation of the ear instead. This paper uses image-to-image translation to synthesize the ear RoI segmentation and remove irrelevant pixels from the input images. Furthermore, parts of the ear that are missing due to occlusion or distortion can also be synthesized. To accomplish this, we train a Pix2Pix Generative Adversarial Network (GAN) on the AWE dataset, a challenging ear dataset. Experimental results show that using ear RoI segmentation positively affects the classification process and significantly increases the recognition rate.
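To illustrate the RoI-masking idea described above (this is a hedged sketch, not the authors' implementation), the snippet below zeroes out all non-ear pixels of a cropped ear image using a binary mask. In the actual pipeline the mask would come from the Pix2Pix generator; here a hand-made toy mask stands in for that output, and the function name `apply_roi_mask` is an assumption for illustration only.

```python
import numpy as np

def apply_roi_mask(image: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Keep only pixels inside the predicted ear RoI.

    image: H x W x 3 uint8 ear crop (background, hair, skin, etc. included)
    mask:  H x W binary map (1 = ear pixel), e.g. a Pix2Pix generator
           output thresholded at 0.5
    Returns the image with every non-ear pixel set to zero, so a
    downstream classifier sees ear pixels only.
    """
    # Broadcast the 2-D mask across the colour channels and multiply.
    return image * mask[..., None].astype(image.dtype)

# Toy example: a uniform 4x4 "ear crop" with a 2x2 ear region in the centre.
img = np.full((4, 4, 3), 200, dtype=np.uint8)
mask = np.zeros((4, 4), dtype=np.uint8)
mask[1:3, 1:3] = 1
masked = apply_roi_mask(img, mask)
```

After masking, pixels outside the 2x2 central region are zero while the ear-region pixels keep their original values, which is the tight-RoI input the abstract argues the classifier should receive.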