Electroencephalography (EEG)-based classification of brain diseases such as epilepsy and schizophrenia, as well as decoding of brain activity during movement and vision, has shown promising results in recent years. Here, we introduce a novel pipeline for detecting speech information carried in EEG signals. The proposed work includes a newly collected EEG dataset of 15 subjects and a deep learning model to predict colour information. With a unique experimental setup, the data successfully capture information about the mental enunciation of the set of colours used. The primary goal is to perform multiclass classification on our custom EEG data, which record the brain activity of individuals during mental enunciation of, and thought about, a class of objects, in our case colours. The Continuous Wavelet Transform (CWT) is applied to each EEG channel of each participant to obtain time-frequency (TF) representations. A Vision Transformer (ViT)-based model is then developed to capture information from these TF representations. The method addresses a 6-class classification problem, in which the six colours serve as the target classes for our model. The proposed model achieves 91.36% cross-validation accuracy, 5.48 times the random-guess accuracy. These results clearly demonstrate the existence of speech information in EEG signals and lay a foundation for future research in speech assistive technologies.
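The abstract describes the CWT step only at a high level. As an illustrative sketch (not the authors' implementation), the per-channel time-frequency extraction could look like the following, using a complex Morlet wavelet implemented directly in NumPy and a synthetic single-channel signal standing in for real EEG data; the sampling rate, scale range, and wavelet parameter `w0` are assumptions for illustration.

```python
import numpy as np

def morlet_cwt(signal, scales, w0=6.0):
    """Continuous Wavelet Transform with a complex Morlet wavelet.

    Returns a (len(scales), len(signal)) matrix whose magnitude is the
    scalogram, i.e. the time-frequency image fed to the classifier.
    """
    n = len(signal)
    out = np.empty((len(scales), n), dtype=complex)
    for i, s in enumerate(scales):
        # Discretise the mother wavelet at this scale over +/- 4 standard
        # deviations; 1/sqrt(s) keeps responses comparable across scales.
        t = np.arange(-4 * s, 4 * s + 1)
        psi = (np.pi ** -0.25) * np.exp(1j * w0 * t / s) * np.exp(-0.5 * (t / s) ** 2)
        psi = psi / np.sqrt(s)
        # Convolution with the time-reversed conjugate wavelet = correlation.
        out[i] = np.convolve(signal, np.conj(psi)[::-1], mode="same")
    return out

# Hypothetical single-channel EEG segment: 1 s at 250 Hz containing a
# 10 Hz (alpha-band) component plus noise.
fs = 250
t = np.arange(fs) / fs
eeg = np.sin(2 * np.pi * 10 * t) + 0.1 * np.random.default_rng(0).standard_normal(fs)

# Scales are capped so every wavelet stays shorter than the signal.
scales = np.arange(2, 31)
scalogram = np.abs(morlet_cwt(eeg, scales))
print(scalogram.shape)  # (29, 250): one row per scale, one column per sample
```

In a pipeline like the one described, such magnitude scalograms would be computed per channel and per participant and then assembled into image-like inputs for the ViT-based 6-class colour classifier.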