Scientific document retrieval using structure encoded string with trie indexing.
- Resource Type
- Article
- Authors
- Dhar, Sourish; Roy, Sudipta; Paul, Arnab
- Source
- Information Services & Use; 2022, Vol. 42 Issue 3/4, p241-259, 19p, 2 Diagrams, 8 Charts, 9 Graphs
- Subject
- Information retrieval
- Indexing
- Mathematical formulas
- Encoding
- Language
- ISSN
- 01675265
Retrieving mathematical expressions from scientific documents is challenging because formulae differ substantially from ordinary text. Mathematical expressions are highly symbolic and complex, and the structure of a formula carries semantic meaning that cannot be overlooked. This paper proposes a scientific document retrieval system driven by mathematical formula queries. It explores the concept of the Structure Encoded String (SES), which is used to encode mathematical expressions so that the relations among formula structures are captured. A pattern-based trie indexing scheme is proposed for faster retrieval, and Jaro-Winkler similarity is adopted for matching and ranking. Experiments are conducted, and the results are reported using standard evaluation measures and compared with similar existing systems. [ABSTRACT FROM AUTHOR]
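The abstract names two building blocks, a trie index over encoded formula strings and Jaro-Winkler similarity for ranking. The Python sketch below illustrates how the two could fit together. It is not the authors' implementation: the record does not describe the SES encoding or the pattern-based indexing scheme, so the keys are plain placeholder strings, and the names `Trie`, `with_prefix`, `search`, and the `prefix_len` parameter are assumptions made for illustration. The Jaro-Winkler function follows the standard definition with scaling factor 0.1 and a common prefix capped at four characters.

```python
class TrieNode:
    def __init__(self):
        self.children = {}    # next character -> TrieNode
        self.doc_ids = set()  # documents whose encoded formula ends here


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, key, doc_id):
        node = self.root
        for ch in key:
            node = node.children.setdefault(ch, TrieNode())
        node.doc_ids.add(doc_id)

    def with_prefix(self, prefix):
        """Return (key, doc_ids) for every stored key starting with prefix."""
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        results, stack = [], [(node, prefix)]
        while stack:
            node, key = stack.pop()
            if node.doc_ids:
                results.append((key, node.doc_ids))
            for ch, child in node.children.items():
                stack.append((child, key + ch))
        return results


def jaro(s1, s2):
    if s1 == s2:
        return 1.0
    if not s1 or not s2:
        return 0.0
    window = max(max(len(s1), len(s2)) // 2 - 1, 0)
    hit1, hit2 = [False] * len(s1), [False] * len(s2)
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len(s2), i + window + 1)):
            if not hit2[j] and s2[j] == ch:
                hit1[i] = hit2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = out_of_order = 0
    for i in range(len(s1)):
        if hit1[i]:
            while not hit2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order // 2
    return (matches / len(s1) + matches / len(s2) + (matches - t) / matches) / 3


def jaro_winkler(s1, s2, p=0.1):
    j = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return j + prefix * p * (1 - j)


def search(trie, query, prefix_len=2):
    """Hypothetical pipeline: trie narrows candidates, Jaro-Winkler ranks them."""
    candidates = trie.with_prefix(query[:prefix_len])
    scored = [(jaro_winkler(query, key), key, sorted(ids)) for key, ids in candidates]
    return sorted(scored, key=lambda item: item[0], reverse=True)


# Placeholder strings, not real SES encodings.
trie = Trie()
trie.insert("x^2+y", "doc1")
trie.insert("x^2-y", "doc2")
trie.insert("a/b", "doc3")
for score, key, docs in search(trie, "x^2+y"):
    print(f"{score:.3f}  {key}  {docs}")
```

In this sketch the trie only filters by a shared prefix before scoring; a real system would presumably choose the lookup pattern to suit its encoding, which this record does not specify.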