Data augmentation techniques, such as rotation, masking, and noise injection, have become critical and effective tools for mitigating data scarcity in various deep learning tasks. This article proposes a novel learnable neural data augmentation strategy tailored to supervised contrastive representation learning (SCL) for fault diagnosis with limited fault data. The core idea is to generate a set of transformations that preserve the semantic information of the original sample while remaining dissimilar from one another. Our approach comprises two stages: augmented data generation and SCL-based feature extraction. Extensive experiments on the Tennessee Eastman process (TEP) benchmark dataset show that our method outperforms existing data augmentation strategies and fault diagnosis methods, even without the use of any prior information.
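The abstract does not specify the contrastive objective used in the SCL-based feature extraction stage. As a rough illustration only, the following is a minimal NumPy sketch of a SupCon-style supervised contrastive loss, in which samples sharing a fault label are pulled together and all others are pushed apart; the function name, temperature value, and loss form are assumptions for illustration, not details taken from the paper:

```python
import numpy as np

def supcon_loss(features, labels, temperature=0.1):
    """SupCon-style supervised contrastive loss on L2-normalized features.

    Illustrative sketch: samples with the same label are treated as
    positives; all other samples in the batch act as negatives.
    """
    # Normalize so similarities are cosine similarities
    features = features / np.linalg.norm(features, axis=1, keepdims=True)
    n = features.shape[0]
    sim = features @ features.T / temperature   # pairwise scaled similarities

    # Exclude self-pairs from the softmax denominator
    mask_self = np.eye(n, dtype=bool)
    sim_masked = np.where(mask_self, -np.inf, sim)
    log_prob = sim_masked - np.log(np.exp(sim_masked).sum(axis=1, keepdims=True))

    # Positives: same label, excluding the sample itself
    pos = (labels[:, None] == labels[None, :]) & ~mask_self
    # Mean log-probability over each anchor's positives, averaged over anchors
    per_anchor = np.where(pos, log_prob, 0.0).sum(axis=1) / np.maximum(pos.sum(axis=1), 1)
    return -per_anchor.mean()
```

With this loss, a batch whose same-label samples are close in feature space yields a lower value than one where labels are scattered across distant features, which is the behavior the SCL stage relies on.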