Automatic Speech Recognition (ASR) has advanced enormously over the past decade, attracting considerable research interest. Nevertheless, system implementation, language adaptation, and robustness remain major obstacles. Sanskrit is particularly challenging for such systems because it is morphologically more complex than many languages and lacks standard speech corpora. Deep learning is now widely applied across research domains and has become central to this field. Automatic speech recognition is the capability of a machine or program to identify spoken words and transcribe speech; it requires matching a vocal pattern against a pre-existing or previously learned vocabulary. This study develops an optimized recurrent neural network (RNN) and convolutional neural network (CNN) based Sanskrit speech recognition system. Additionally, the Connectionist Temporal Classification (CTC) loss function is employed to maximize the likelihood of the correct transcription. The system was trained to recognize Sanskrit using 46,000 utterances from 27 distinct speakers. The experimental results show promise for automated extraction of valuable information from Sanskrit speech. Accurate handling of language and accent variation is essential for effective human-computer interaction, and this work contributes toward more streamlined and efficient approaches to that objective.
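The abstract does not specify the implementation framework; as a minimal illustrative sketch of the CTC objective mentioned above, the pure-Python forward algorithm below computes the log-likelihood that CTC training maximizes. It assumes frame-wise log-posteriors over a symbol inventory with the blank at index 0; all function and variable names here are hypothetical, not taken from the paper.

```python
import math

def ctc_log_likelihood(log_probs, labels, blank=0):
    """CTC forward algorithm: log P(labels | frame-wise log-posteriors).

    log_probs: list of T frames, each a list of log-probabilities per symbol.
    labels: target symbol indices (without blanks).
    """
    # Extended label sequence with blanks interleaved: b, l1, b, l2, ..., b
    ext = [blank]
    for l in labels:
        ext += [l, blank]
    S, T = len(ext), len(log_probs)
    NEG_INF = float("-inf")

    def logadd(a, b):
        # Numerically stable log(exp(a) + exp(b))
        if a == NEG_INF:
            return b
        if b == NEG_INF:
            return a
        m = max(a, b)
        return m + math.log(math.exp(a - m) + math.exp(b - m))

    # alpha[s]: log prob of all paths ending at ext[s] after the current frame
    alpha = [NEG_INF] * S
    alpha[0] = log_probs[0][ext[0]]
    if S > 1:
        alpha[1] = log_probs[0][ext[1]]
    for t in range(1, T):
        new = [NEG_INF] * S
        for s in range(S):
            a = alpha[s]                      # stay on the same symbol
            if s > 0:
                a = logadd(a, alpha[s - 1])   # advance by one position
            # Skip a blank when adjacent labels differ
            if s > 1 and ext[s] != blank and ext[s] != ext[s - 2]:
                a = logadd(a, alpha[s - 2])
            new[s] = a + log_probs[t][ext[s]]
        alpha = new
    # Valid endings: final label or trailing blank
    return logadd(alpha[S - 1], alpha[S - 2] if S > 1 else NEG_INF)
```

For example, with two frames, a two-symbol inventory {blank, "a"} and uniform posteriors, the three frame paths that collapse to "a" each have probability 0.25, so the CTC likelihood is 0.75. In training, the negative of this quantity is minimized so that all alignments consistent with the correct transcription gain probability.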