H- and C-level WFST-based large vocabulary continuous speech recognition on Graphics Processing Units
- Resource Type: Conference
- Authors: Kim, Jungsuk; You, Kisun; Sung, Wonyong
- Source: 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1733-1736, May 2011
- Subject: Signal Processing and Analysis; Communication, Networking and Broadcast Technologies; Computing and Processing; Graphics processing unit; Hidden Markov models; Synchronization; History; Speech recognition; Decoding; Instruction sets; WFSTs; Graphics Processing Unit; Parallelization; Word-length optimization
- Language
- ISSN: 1520-6149, 2379-190X
- Abstract: We have implemented 20,000-word large vocabulary continuous speech recognition (LVCSR) systems employing H- and C-level weighted finite state transducer (WFST) based networks on Graphics Processing Units (GPUs). Both the emission probability computation and the Viterbi beam search are implemented on the GPU in a data-parallel manner to minimize the extra data transfer time between the host CPU and the GPU. This study utilizes word-length optimization techniques to reduce the synchronization overhead in the Viterbi beam search. We achieve a speed-up of 18.6% to 21.9% with an efficient data packing method, at the cost of less than 0.2% accuracy degradation. Furthermore, we explore different levels of abstraction in recognition network generation to reduce the number of synchronization operations as well as to minimize the memory usage. The experimental results show that the implemented systems on the GPU perform speech recognition 4.07 to 4.55 times faster than highly optimized sequential implementations on a CPU.