H-Rank: A keywords extraction method from web pages using POS tags
- Resource Type
- Conference
- Authors
- Shah, Himat; Khan, Muhammad U. S.; Franti, Pasi
- Source
- 2019 IEEE 17th International Conference on Industrial Informatics (INDIN). 1:264-269 Jul, 2019
- Subject
- Components, Circuits, Devices and Systems
Computing and Processing
Power, Energy and Industry Applications
Robotics and Control Systems
Signal Processing and Analysis
Agglomerative clustering
POS tags
Web pages
- Language
- ISSN
- 2378-363X
We present a new keyword extraction method that combines the semantic similarity among the frequent words on a web page with the distribution of part-of-speech (POS) tags. We apply hierarchical clustering to group the semantically similar words that cover more of the web page's content. Our method performs better than CL-Rank and other existing methods.
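The abstract names the ingredients (frequent words, POS tags, semantic similarity, hierarchical clustering, content coverage) but not how they are combined. The following is only a minimal sketch of one way such a pipeline could be wired together, not the authors' H-Rank algorithm. Everything in it is an assumption made for illustration: the toy tagged tokens, the hand-written similarity table `TOY_SIM` (a stand-in for a real semantic measure), the noun-only filter, the average-linkage merge rule, the 0.5 threshold, and the use of summed word frequency as a coverage score.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy input: (word, POS tag) pairs as a tagger might emit them.
TAGGED = (
    [("hotel", "NN")] * 3 + [("hostel", "NN")] * 2 + [("room", "NN")] * 2
    + [("price", "NN")] * 2 + [("book", "VB")] * 2 + [("cheap", "JJ")] * 2
    + [("city", "NN")]
)

# Hypothetical hand-written scores standing in for a real semantic similarity.
TOY_SIM = {
    frozenset(("hotel", "hostel")): 0.9,
    frozenset(("hotel", "room")): 0.6,
    frozenset(("hostel", "room")): 0.6,
    frozenset(("price", "room")): 0.2,
}


def similarity(a, b):
    """Look up the toy similarity; unknown pairs count as unrelated."""
    return 1.0 if a == b else TOY_SIM.get(frozenset((a, b)), 0.0)


def frequent_nouns(tagged, min_count=2, keep_tags=("NN", "NNS")):
    """Keep words with an accepted POS tag that occur at least min_count times."""
    counts = Counter(w.lower() for w, tag in tagged if tag in keep_tags)
    return {w: c for w, c in counts.items() if c >= min_count}


def agglomerate(words, threshold=0.5):
    """Average-linkage agglomerative clustering; stop below the threshold."""
    clusters = [[w] for w in sorted(words)]
    while len(clusters) > 1:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            pairs = [(a, b) for a in clusters[i] for b in clusters[j]]
            avg = sum(similarity(a, b) for a, b in pairs) / len(pairs)
            if best is None or avg > best[0]:
                best = (avg, i, j)
        if best[0] < threshold:
            break
        _, i, j = best  # i < j, so deleting j leaves index i valid
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


def rank_clusters(clusters, counts):
    """Order clusters by a simple coverage proxy: total frequency of members."""
    return sorted(clusters, key=lambda c: sum(counts[w] for w in c), reverse=True)


counts = frequent_nouns(TAGGED)
ranked = rank_clusters(agglomerate(counts), counts)
print(ranked)  # [['hostel', 'hotel', 'room'], ['price']]
```

In this toy run the verb, the adjective, and the single-occurrence noun are filtered out before clustering, and the highest-ranked cluster is the one whose members account for the most occurrences on the page. A real system would replace `TOY_SIM` with an actual semantic similarity measure and the hand-tagged tokens with the output of a POS tagger.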