Improving The Latency And Quality Of Cascaded Encoders

Resource Type: Conference
Authors: Sainath, Tara N.; He, Yanzhang; Narayanan, Arun; Botros, Rami; Wang, Weiran; Qiu, David; Chiu, Chung-Cheng; Prabhavalkar, Rohit; Gruenstein, Alexander; Gulati, Anmol; Li, Bo; Rybach, David; Guzman, Emmanuel; McGraw, Ian; Qin, James; Choromanski, Krzysztof; Liang, Qiao; David, Robert; Pang, Ruoming; Chang, Shuo-Yiin; Strohman, Trevor; Huang, W. Ronny; Han, Wei; Wu, Yonghui; Zhang, Yu
Source: ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 8112-8116, May 2022
Subject: Bioengineering
Communication, Networking and Broadcast Technologies
Computing and Processing
Signal Processing and Analysis
Training
Computational modeling
Conferences
Computer architecture
Signal processing
Acoustics
Speech processing
end-to-end ASR
rnnt
conformer
long-form ASR
two-pass ASR
ISSN: 2379-190X

Online Access

Full Text (IEEE)

Abstract

In this paper, we explore reducing the computational latency of the 2-pass cascaded encoder model [1]. Specifically, we experiment with reducing the size of the causal 1st-pass and adding capacity to the non-causal 2nd-pass, such that the overall latency can be reduced without loss of quality. In addition, we explore using a confidence model to decide whether to stop 2nd-pass recognition when we are confident in the 1st-pass hypothesis. Overall, we are able to reduce latency by a factor of 1.7X compared to the baseline cascaded encoder from [1]. Furthermore, with the added capacity in the non-causal 2nd-pass, we find that we can improve WER by up to 7% relative using wav2vec and minimum word-error-rate (MWER) training.
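
The confidence-based stopping idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Hypothesis` type, the `recognize` function, the utterance-level confidence score in [0, 1], and the 0.9 threshold are all hypothetical names and values chosen for illustration. It only shows the control flow of skipping the 2nd pass when the 1st-pass hypothesis is trusted.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class Hypothesis:
    text: str
    confidence: float  # assumed utterance-level score in [0, 1] (hypothetical)


# A "pass" is modeled here as any callable that maps audio bytes to a hypothesis.
Recognizer = Callable[[bytes], Hypothesis]


def recognize(
    first_pass: Recognizer,
    second_pass: Recognizer,
    audio: bytes,
    threshold: float = 0.9,  # illustrative value, not taken from the paper
) -> Tuple[Hypothesis, bool]:
    """Return (hypothesis, used_second_pass).

    The 2nd pass is skipped when the 1st-pass confidence meets the threshold,
    which is where the latency saving would come from in this sketch.
    """
    hyp = first_pass(audio)
    if hyp.confidence >= threshold:
        return hyp, False
    return second_pass(audio), True
```

In this sketch the threshold trades latency against quality: a higher value sends more utterances to the 2nd pass, while a lower value accepts more 1st-pass results as final.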
