Speech emotion recognition is the task of predicting a human's emotional state from speech, along with the accuracy of that prediction, and it enables more natural human-computer interaction. Although predicting a person's emotion is difficult, since emotions are subjective and annotating audio is challenging, Speech Emotion Recognition (SER) makes this possible [1]. The same principle is used by animals such as dogs, elephants, and horses to perceive human emotion [1]. Several cues can indicate a person's emotion, including tone, pitch, expression, and behavior; of these, a subset is used to infer emotion from speech. Labeled speech samples are used to train classifiers to perform speech emotion recognition [2]. This research work uses the RAVDESS dataset (Ryerson Audio-Visual Database of Emotional Speech and Song). From it, three key features are extracted: MFCC (Mel Frequency Cepstral Coefficients), Mel Spectrogram, and Chroma.
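To make the feature-extraction step concrete, the sketch below computes a mel spectrogram and MFCCs from a waveform using only NumPy. It is a minimal illustration, not the pipeline used in this work: in practice a library such as librosa would be used on RAVDESS audio files, and the sine wave here is only a stand-in signal; the frame length, hop size, and filter counts are assumed defaults.

```python
import numpy as np

def frame_signal(y, frame_len=512, hop=256):
    # Slice the waveform into overlapping frames and apply a Hann window.
    n_frames = 1 + (len(y) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return y[idx] * np.hanning(frame_len)

def power_spectrogram(y, frame_len=512, hop=256):
    # Magnitude-squared FFT of each frame: shape (n_frames, frame_len//2 + 1).
    frames = frame_signal(y, frame_len, hop)
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2

def mel_filterbank(sr, n_fft=512, n_mels=40):
    # Triangular filters spaced evenly on the mel scale.
    hz_to_mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    mel_to_hz = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_mels + 2)
    bin_pts = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        left, center, right = bin_pts[m - 1], bin_pts[m], bin_pts[m + 1]
        for k in range(left, center):
            fb[m - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):
            fb[m - 1, k] = (right - k) / max(right - center, 1)
    return fb

def mfcc(y, sr, n_mels=40, n_mfcc=13):
    # Mel spectrogram: project the power spectrum onto the mel filterbank.
    mel_spec = power_spectrogram(y) @ mel_filterbank(sr, 512, n_mels).T
    log_mel = np.log(mel_spec + 1e-10)
    # DCT-II of the log mel energies; keep the first n_mfcc coefficients.
    n = np.arange(n_mels)
    dct = np.cos(np.pi / n_mels * (n[None, :] + 0.5) * np.arange(n_mfcc)[:, None])
    return log_mel @ dct.T  # shape (n_frames, n_mfcc)

sr = 16000
t = np.arange(sr) / sr
y = np.sin(2 * np.pi * 440.0 * t)  # synthetic stand-in for a RAVDESS utterance
feats = mfcc(y, sr)
print(feats.shape)  # (n_frames, 13)
```

Chroma features are obtained similarly, by folding the FFT bins into 12 pitch classes instead of mel bands; a library routine such as librosa's chroma extractor is the usual choice there.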