A Word Similarity Algorithm with Sememe Probability Density Ratio Based on HowNet
- Resource Type
- Article
Text
- Authors
- Rui Zheng; Huan Zhao; Xixiang Zhang
- Source
- International Journal of Hybrid Information Technology, 10/30/2015, Vol. 8, Issue 10, p. 417-426
- Subject
- word similarity
HowNet
sememe probability density
- Language
- English
- ISSN
- 1738-9968
Word similarity computation plays an important role in natural language processing (NLP). Recently, algorithms based on HowNet have been widely used and have proven to work well for Chinese word similarity computation. However, these algorithms do not consider the relationship between the number of brother nodes and the fineness of the hierarchy. This paper investigates the ratio between two words with respect to the number of brother nodes, a quantity called sememe probability density, and proposes an improved algorithm based on HowNet. The results indicate that the proposed algorithm achieves a correlation measure of 75.4%, much better than the major state-of-the-art method (68.1%).
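The abstract does not give the paper's formulas, so the following is only a minimal sketch of the two ingredients it mentions: counting brother (sibling) nodes in a sememe hierarchy and comparing two sememes by that count. The toy hierarchy, the function names, the `alpha` value, and the `brother_ratio` definition are all assumptions made for illustration; they are not taken from HowNet or from the paper.

```python
# Toy sememe hierarchy (child -> parent). Every name here is hypothetical;
# the real HowNet hierarchy is far larger and is not reproduced.
PARENT = {
    "entity": None,
    "thing": "entity",
    "time": "entity",
    "animal": "thing",
    "plant": "thing",
    "artifact": "thing",
    "dog": "animal",
    "cat": "animal",
}


def brother_count(node):
    """Number of brother nodes: other nodes that share this node's parent."""
    parent = PARENT[node]
    if parent is None:
        return 0
    return sum(1 for n, p in PARENT.items() if p == parent and n != node)


def path_distance(a, b):
    """Number of edges between a and b through their lowest common ancestor."""
    steps_from_a = {}
    node, steps = a, 0
    while node is not None:
        steps_from_a[node] = steps
        node = PARENT[node]
        steps += 1
    node, steps = b, 0
    while node not in steps_from_a:
        node = PARENT[node]
        steps += 1
    return steps_from_a[node] + steps


def distance_similarity(a, b, alpha=1.6):
    """Distance-based similarity alpha / (d + alpha); alpha is an assumed tunable."""
    return alpha / (path_distance(a, b) + alpha)


def brother_ratio(a, b):
    """Illustrative ratio of sibling-group sizes, in (0, 1]. Not the paper's formula."""
    x, y = brother_count(a) + 1, brother_count(b) + 1
    return min(x, y) / max(x, y)


if __name__ == "__main__":
    for pair in [("dog", "cat"), ("dog", "plant"), ("animal", "time")]:
        print(pair, distance_similarity(*pair), brother_ratio(*pair))
```

A pure path-distance measure treats every edge alike, whether it sits in a crowded or a sparse part of the hierarchy. A sibling-count ratio is one way to expose that difference. How the paper actually combines the two is not stated in the abstract.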