Injecting Text and Cross-Lingual Supervision in Few-Shot Learning from Self-Supervised Models
- Resource Type
- Conference
- Authors
- Wiesner, Matthew; Raj, Desh; Khudanpur, Sanjeev
- Source
- ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 8597-8601, May 2022
- Subject
- Bioengineering; Communication, Networking and Broadcast Technologies; Computing and Processing; Signal Processing and Analysis; Conferences; Signal processing; Acoustics; Speech processing; Mutual information; Self-supervised; few-shot learning; lattice-free MMI; cross-lingual ASR
- Language
- ISSN
- 2379-190X
- Abstract
- Self-supervised model pretraining has recently garnered significant interest, but the use of additional resources when fine-tuning these models has received less attention. We demonstrate how universal phoneset acoustic models can leverage cross-lingual supervision to improve the transfer of pretrained self-supervised representations to new languages. We also show how target-language text can be used to enable and improve fine-tuning with the lattice-free maximum mutual information (LF-MMI) objective. In three low-resource languages, these techniques greatly improved few-shot learning performance.