UCL-AST: Active Self-Training with Uncertainty-Aware Clouded Logits for Few-Shot Text Classification
- Resource Type
- Conference
- Authors
- Xu, Yi; Hu, Jie; Gao, Zhiqiao; Chen, Jinpeng
- Source
- 2022 IEEE 34th International Conference on Tools with Artificial Intelligence (ICTAI), pp. 1390-1395, Oct. 2022
- Subject
- Bioengineering; Computing and Processing; Robotics and Control Systems; Training; Visualization; Uncertainty; Costs; Annotations; Text categorization; Measurement uncertainty; active learning; softmax saturation; few-shot text classification
- Language
- ISSN
- 2375-0197
Although recent advances in pre-trained language models have achieved great success and mitigated the annotation bottleneck for many tasks, task-specific fine-tuning for text classification still requires thousands of labeled samples or more. When few human annotations are available, active self-training, which integrates the pseudo labels of self-training into the active learning procedure, is an efficient approach to reducing annotation costs. Traditional active self-training approaches mix manual labels and pseudo labels indistinguishably, without considering the impact of active learning and self-training on each other. This raises two problems for active self-training in few-shot text classification: (1) the noise in a large number of pseudo labels may mislead model training, and (2) confusing samples in the overlapping boundaries between classes are especially difficult to identify correctly with only a few labels. To solve these problems, we propose UCL-AST, an active self-training framework that boosts the performance of the active teacher model so that it provides superior pseudo labels to the self-training student model, and that uses uncertainty-aware clouded logits to help the active teacher model learn clearer class boundaries in the few-shot setting. Extensive experiments and visualization analysis on four datasets demonstrate the effectiveness of the proposed framework compared with state-of-the-art methods.
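As an illustration of the general active self-training pattern the abstract describes (not the paper's specific UCL-AST method), the sketch below splits an unlabeled pool two ways: the teacher's most uncertain predictions are routed to a human annotator (active learning), while its most confident predictions become pseudo labels for the student (self-training). The entropy-based scoring, the confidence threshold, and the function names here are illustrative assumptions, not details from the paper.

```python
import math

def entropy(probs):
    # Predictive entropy of one softmax distribution (higher = more uncertain).
    return -sum(p * math.log(p + 1e-12) for p in probs)

def split_pool(pool_probs, n_query, conf_threshold=0.9):
    """Toy active self-training split (illustrative, not UCL-AST itself).

    pool_probs: list of softmax outputs (one list of class probabilities
                per unlabeled sample) from the teacher model.
    n_query:    how many samples to send to the human annotator.
    Returns (query_indices, [(index, pseudo_label), ...]).
    """
    # Rank the pool by uncertainty; the most uncertain samples are the
    # ones a human label helps most, so they form the active-learning query.
    ranked = sorted(range(len(pool_probs)),
                    key=lambda i: entropy(pool_probs[i]),
                    reverse=True)
    query_idx = set(ranked[:n_query])

    # Remaining samples whose top probability clears the threshold are
    # confidently pseudo-labeled for the self-training student.
    pseudo = []
    for i, probs in enumerate(pool_probs):
        if i in query_idx:
            continue
        conf = max(probs)
        if conf >= conf_threshold:
            pseudo.append((i, probs.index(conf)))
    return sorted(query_idx), pseudo

# Example: sample 0 is maximally uncertain, samples 1 and 2 are confident.
pool = [[0.5, 0.5], [0.95, 0.05], [0.1, 0.9], [0.6, 0.4]]
query, pseudo = split_pool(pool, n_query=1)
```

In this toy run, sample 0 (the uniform distribution) is queried, samples 1 and 2 receive pseudo labels 0 and 1, and sample 3 is left out as neither uncertain enough to query nor confident enough to pseudo-label, which is exactly the region where, per the abstract, noisy pseudo labels would otherwise mislead training.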