Partial discharge (PD) is an early stage of complete failure in power system components such as electrical machines, cables, and covered conductors. If left unrepaired, PD can eventually lead to substantial power outages and damage. Current advanced approaches to PD detection rely on statistical feature extraction and conventional machine learning methods; however, the performance of these methods degrades in the presence of noise. This study investigates PD fault detection in medium-voltage covered conductor overhead lines (MVCCO) using a deep learning method based on Long Short-Term Memory (LSTM) and attention layers. Stratified k-fold cross-validation is used for training and validation, and the impact of several hyperparameters on the deep learning model and the classification results is investigated. The proposed method is applied to a large open-source dataset of signals with PD faults provided by VSB’s ENET center. The results are compared with those of several traditional machine learning methods and show that the proposed method outperforms the conventional techniques in detecting faulty signals.
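To illustrate the stratified k-fold cross-validation mentioned in the abstract, the following is a minimal sketch in plain Python. It is not the study's implementation: the function name `stratified_k_fold`, the toy label vector (10 faulty and 90 healthy signals), and the choice of k = 5 are assumptions made purely for illustration. The idea is that each validation fold keeps roughly the same ratio of faulty to healthy signals as the full dataset.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k, seed=0):
    """Yield (train_idx, val_idx) pairs whose class ratios mirror the full set.

    Hypothetical helper for illustration only; not taken from the study.
    """
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Shuffle within each class, then deal indices round-robin into k folds
    # so that every fold receives a near-equal share of each class.
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)

    # Each fold serves once as the validation set; the rest form the training set.
    for i in range(k):
        val_idx = sorted(folds[i])
        train_idx = sorted(
            idx for j, fold in enumerate(folds) if j != i for idx in fold
        )
        yield train_idx, val_idx


# Assumed toy labels: 1 = signal with PD fault, 0 = signal without PD fault.
labels = [1] * 10 + [0] * 90
splits = list(stratified_k_fold(labels, k=5))

for fold_no, (train_idx, val_idx) in enumerate(splits):
    n_faulty = sum(labels[i] for i in val_idx)
    print(f"fold {fold_no}: {len(train_idx)} train, {len(val_idx)} val, "
          f"{n_faulty} faulty in val")
```

In this toy setting, every validation fold holds 20 signals, 2 of which are faulty, so the 10% fault rate of the full set is preserved in each fold. In practice, an established library routine would normally be preferred over a hand-written splitter.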