Disease detection approaches based on deep learning (DL) are gaining considerable attention in medical research driven by artificial intelligence (AI). However, progress in this domain is slower than desired, owing to the wide range of diseases and the region-specific characteristics of many disease types. Among eye diseases, cataracts are a common condition that can lead to visual impairment. Accurate and timely cataract detection is crucial for managing risk and preventing progression toward blindness. This paper presents deep neural network approaches to automatic cataract detection in eye images, based on two convolutional neural network (CNN) models, VGG16 and ResNet50, and a Vision Transformer (ViT). Median filtering is employed as a preprocessing step to reduce noise and enhance overall image quality. In addition, data augmentation is used to expand the dataset before training and thereby combat overfitting. The experimental results show that the ViT method outperforms existing cataract detection approaches, achieving an accuracy of 70%.
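The two preprocessing steps named above, median filtering and data augmentation, can be illustrated with a minimal sketch. It is not the pipeline used in this paper: the 3x3 window, the edge-replication border handling, and the choice of flips and a 180-degree rotation as augmentations are all assumptions made for illustration, and the function names `median_filter` and `augment` are hypothetical. Images are represented as plain lists of rows of grayscale values so that the example needs only the Python standard library.

```python
from statistics import median


def median_filter(image, size=3):
    """Apply a size x size median filter to a 2D grayscale image.

    `image` is a list of rows of pixel values. Pixels outside the image
    are taken from the nearest edge pixel (an assumed border policy).
    """
    if size % 2 == 0:
        raise ValueError("size must be odd")
    height, width = len(image), len(image[0])
    radius = size // 2
    filtered = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Collect the neighbourhood, clamping indices at the borders.
            window = [
                image[min(max(y + dy, 0), height - 1)][min(max(x + dx, 0), width - 1)]
                for dy in range(-radius, radius + 1)
                for dx in range(-radius, radius + 1)
            ]
            filtered[y][x] = median(window)
    return filtered


def augment(image):
    """Return the image plus three geometric variants (assumed choices):
    horizontal flip, vertical flip, and 180-degree rotation."""
    original = [row[:] for row in image]
    horizontal_flip = [row[::-1] for row in image]
    vertical_flip = [row[:] for row in image[::-1]]
    rotated_180 = [row[::-1] for row in image[::-1]]
    return [original, horizontal_flip, vertical_flip, rotated_180]


# A uniform 5x5 image with one impulse-noise pixel in the centre.
noisy = [[10] * 5 for _ in range(5)]
noisy[2][2] = 255

denoised = median_filter(noisy, size=3)
variants = augment(denoised)
```

In this toy input, every 3x3 window contains at most one outlier, so the median discards it; this is the property that makes median filtering suited to impulse noise. Each augmented variant keeps the label of its source image, so the training set grows fourfold under these assumed transformations. In practice a library routine such as `scipy.ndimage.median_filter` or OpenCV's `cv2.medianBlur` would replace the explicit loops.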