DsetGenS: An Automated Technique for Building Dataset From Speech with respect to Gujarati-English

Resource Type: Conference
Authors: Patel, Margi; Joshi, Brijendra Kumar
Source: 2022 IEEE 11th International Conference on Communication Systems and Network Technologies (CSNT), pp. 314-317, Apr. 2022
Subject: Communication, Networking and Broadcast Technologies
Components, Circuits, Devices and Systems
Fields, Waves and Electromagnetics
Power, Energy and Industry Applications
Signal Processing and Analysis
Training
Deep learning
Machine learning algorithms
Conferences
Pipelines
Data models
Machine translation
Machine Translation
Speech Processing
Dataset
Machine Learning
Language

Abstract

Computing has evolved significantly in recent years, with applications in a variety of disciplines such as Machine Learning and Deep Learning. As a subfield, Machine Translation (MT) has also advanced significantly, producing many methodologies and techniques. The number of individuals using the internet has risen tremendously, and most documents are written in English because it is the most extensively used language on the internet. A user whose first language is Gujarati will naturally prefer to access information in Gujarati whenever possible. Although various MT systems and tools already support Indian languages, their translation quality is mediocre and could be improved. Models trained on limited quantities of parallel data perform worse: the learned models often have limited performance (inaccurate translations and feature scores) along with low coverage (high out-of-vocabulary rates). The increased demand for effective technologies to process and translate information from and to the Gujarati language has driven researchers to present novel methodologies and solutions that automatically construct datasets for MT. We have met our objective of generating a Gujarati-English dataset in two ways. We have already introduced GEDset, an automatic Dataset Builder for Machine Translation System with Specific Reference to Gujarati-English [1]. In this paper, we propose a model that automatically builds a Gujarati-English dataset from audio available in the Gujarati language through Speech Processing.
