Lung infections are among the deadliest infectious diseases worldwide. Covid-19 is a recent example that has affected public health in countries across the globe. Accordingly, this study builds a lung disease identification system using a state-of-the-art deep cascade learning classification model, EfficientNet-Vision Transformer. In the proposed system, Real-ESRGAN is used to enhance the images fed to EfficientNet, while image Relative Position Encoding (iRPE) is added to improve the attention of the transformer network. Moreover, weight balancing is applied to stabilize the performance of the system. When trained on the X-ray dataset, our model achieved 93.757% accuracy on five classes: Normal, Covid-19, Viral Pneumonia, Bacterial Pneumonia, and Tuberculosis.
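As an illustration of the weight-balancing step, the following minimal sketch assumes that "weight balancing" means inverse-frequency class weighting of the training loss, which is a common choice for imbalanced medical image datasets; the abstract does not specify the exact scheme. The function name `balanced_class_weights` and the label counts are hypothetical and are not taken from the study's data.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return one weight per class, inversely proportional to its frequency.

    Assumed formula (not confirmed by the source):
        weight[c] = n_samples / (n_classes * count[c])
    Rare classes receive weights above 1, frequent classes below 1.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical, deliberately tiny label list for illustration only.
labels = ["Normal"] * 4 + ["Covid-19"] * 2 + ["Tuberculosis"] * 2
weights = balanced_class_weights(labels)

for cls, weight in weights.items():
    print(f"{cls}: {weight:.3f}")
```

Weights of this kind would typically be passed to the classification loss so that errors on under-represented classes count more during training.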