Application Research of Text Classification Based on Random Forest Algorithm
- Resource Type
- Conference
- Authors
- Sun, Yanxiong; Li, Yeli; Zeng, Qingtao; Bian, Yuning
- Source
- 2020 3rd International Conference on Advanced Electronic Materials, Computers and Software Engineering (AEMCSE), pp. 370-374, Apr. 2020
- Subject
- Computing and Processing
- Random Forest
- Text Classification
- tr-k method
- Language
The traditional random forest algorithm classifies text poorly when the extracted text features are of low quality. To address this, a random forest method for text information is proposed. Because the quality of the individual decision trees in a traditional random forest is difficult to control, a weighted voting mechanism is proposed to improve the quality of the decision trees. The algorithm uses the tr-k method, which is based on text feature extraction, to improve the quality and diversity of text features, and uses the latest BERT word vector generation model to represent the text. Experimental results in a Python environment show that this method achieves better text classification results than both the IDF-based random forest algorithm and the original random forest algorithm.
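The abstract does not specify how the voting weights are computed, so the following is only a minimal sketch of one common form of weighted voting, not the authors' method. It assumes each tree is weighted by its accuracy on a held-out validation set; the helper names `tree_accuracy` and `weighted_vote`, the class labels, and the toy predictions are all hypothetical. The tr-k feature extraction and BERT representation steps are not reproduced here.

```python
from collections import defaultdict


def tree_accuracy(predictions, truths):
    """Fraction of validation examples a single tree labels correctly."""
    correct = sum(p == t for p, t in zip(predictions, truths))
    return correct / len(truths)


def weighted_vote(tree_predictions, tree_weights):
    """Return the label whose supporting trees have the largest total weight."""
    if len(tree_predictions) != len(tree_weights):
        raise ValueError("need exactly one weight per tree prediction")
    totals = defaultdict(float)
    for label, weight in zip(tree_predictions, tree_weights):
        totals[label] += weight
    # Iterate labels in sorted order so ties break deterministically.
    return max(sorted(totals), key=lambda label: totals[label])


# Hypothetical validation labels and per-tree predictions (assumed data).
val_truth = ["sports", "finance", "sports", "tech"]
val_preds = [
    ["sports", "finance", "sports", "tech"],     # tree 0: 4 of 4 correct
    ["finance", "finance", "tech", "tech"],      # tree 1: 2 of 4 correct
    ["finance", "sports", "sports", "sports"],   # tree 2: 1 of 4 correct
]

# Assumption: weight each tree by its validation accuracy.
weights = [tree_accuracy(p, val_truth) for p in val_preds]

# For a new document, tree 0 votes "sports"; trees 1 and 2 vote "finance".
prediction = weighted_vote(["sports", "finance", "finance"], weights)
print(weights, prediction)
```

In this toy case an unweighted majority vote would pick "finance" (two trees against one), while the weighted vote follows the single more accurate tree. That is the intuition behind using weights to limit the influence of low-quality trees.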