Computing has evolved significantly in recent years, with applications in a variety of disciplines such as Machine Learning and Deep Learning. As a subfield, Machine Translation (MT) has also advanced considerably, with many methodologies and techniques now available. The number of individuals using the internet has risen tremendously. Most documents are written in English, since that is the most extensively used language on the internet. If a user’s first language is Gujarati, he or she will naturally prefer to access information in Gujarati whenever possible. Although various MT systems and tools already support Indian languages, the quality of their translations is mediocre and could be improved. When models are trained on limited quantities of parallel data, their performance declines: the learned models often show limited performance (inaccurate translations and feature scores) along with low coverage (high out-of-vocabulary rates). Furthermore, the increased demand for effective technologies to process and translate information from and to the Gujarati language has driven researchers to present novel methodologies and solutions that automatically construct datasets for MT. We have addressed our objective of generating a Gujarati-English dataset in two ways. We have already introduced GEDset, an Automatic Dataset Builder for Machine Translation System with Specific Reference to Gujarati-English [1]. In this paper, we propose a model that automatically builds a Gujarati-English dataset from audio available in the Gujarati language through speech processing.