In this paper, we introduce s2s-ft, a method for adapting pretrained bidirectional Transformer encoders, such as BERT and RoBERTa, to sequence-to-sequence tasks like abstractive summarization and question generation. By employing a unified modeling approach with carefully designed self-attention masks, s2s-ft leverages the generative capabilities of pretrained Transformer encoders without requiring an additional decoder. We conduct extensive experiments comparing three fine-tuning algorithms (causal fine-tuning, masked fine-tuning, and pseudo-masked fine-tuning) and various pretrained models for initialization. Results demonstrate that s2s-ft achieves strong performance across different tasks and languages. Additionally, the method extends naturally to multilingual pretrained models, such as XLM-RoBERTa, and is evaluated on multilingual generation tasks. Our work highlights the importance of reducing the discrepancy between masked language model pretraining and sequence-to-sequence fine-tuning, and showcases the effectiveness and extensibility of the s2s-ft method.
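
The unified modeling idea can be illustrated with a minimal sketch of a sequence-to-sequence self-attention mask of the kind used in UniLM-style fine-tuning: source tokens attend bidirectionally within the source, while target tokens attend to the full source plus only the already-generated target prefix. The helper name and the use of NumPy here are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def s2s_attention_mask(src_len, tgt_len):
    """Hypothetical helper: boolean mask where mask[i, j] = True means
    position i may attend to position j. Source tokens see the whole
    source bidirectionally; target tokens see the source plus a causal
    (left-to-right) prefix of the target."""
    n = src_len + tgt_len
    mask = np.zeros((n, n), dtype=bool)
    # Every position (source and target) attends to the full source.
    mask[:, :src_len] = True
    # Target positions additionally attend causally within the target.
    for i in range(src_len, n):
        mask[i, src_len:i + 1] = True
    return mask
```

For example, with a 3-token source and 2-token target, source rows never attend to target columns, while the last target token attends to all five positions. A single encoder stack with this mask thus behaves like an encoder-decoder without a separate decoder.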