An Efficient Approach for Super and Nested Term Indexing and Retrieval
This work addresses a domain-specific problem in information retrieval for handling complex term structures, presenting an incremental improvement over existing methods.
The paper tackles the problem of efficiently indexing and retrieving nested and super terms by proposing Terminological Bucket Indexing (TBI), a hybrid data structure that achieves comparable performance to Trie-based methods for nested terms and far superior performance for super terms, while reducing indexing time by 80% compared to traditional hash tables.
This paper describes a new approach, called Terminological Bucket Indexing (TBI), for efficient indexing and retrieval of both nested and super terms using a single method. We propose a hybrid data structure that speeds up index building. We evaluate our approach against widely used existing approaches on several publicly available datasets. Compared to Trie-based approaches, TBI provides comparable performance on nested term retrieval and far superior performance on super term retrieval. Compared to a traditional hash table, TBI needs 80\% less time for indexing.
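
To make the two retrieval tasks concrete, the following is a minimal sketch of a naive baseline, not of TBI itself, whose internal structure is not described in this abstract. It assumes that a nested term is an indexed term occurring as a contiguous word sequence inside the query term, and that a super term is a longer indexed term containing the query term. The class name `NaiveTermIndex` and its methods are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


class NaiveTermIndex:
    """Toy baseline (not TBI): a hash set of terms plus a token-to-terms map."""

    def __init__(self, terms):
        # Each term is stored as a tuple of lowercased, whitespace-split tokens.
        self.terms = {tuple(t.lower().split()) for t in terms}
        # Inverted index: token -> set of terms that contain that token.
        self.by_token = defaultdict(set)
        for term in self.terms:
            for token in term:
                self.by_token[token].add(term)

    @staticmethod
    def _contains(longer, shorter):
        """Return True if `shorter` occurs contiguously inside `longer`."""
        n = len(shorter)
        return any(longer[i:i + n] == shorter
                   for i in range(len(longer) - n + 1))

    def nested_terms(self, query):
        """Indexed terms that occur as proper contiguous sub-sequences of `query`."""
        q = tuple(query.lower().split())
        found = set()
        for size in range(1, len(q)):  # strictly shorter than the query
            for start in range(len(q) - size + 1):
                candidate = q[start:start + size]
                if candidate in self.terms:
                    found.add(" ".join(candidate))
        return sorted(found)

    def super_terms(self, query):
        """Indexed terms that are longer than `query` and contain it contiguously."""
        q = tuple(query.lower().split())
        if not q:
            return []
        # Only terms sharing every query token can contain the query.
        candidates = set.intersection(
            *(self.by_token.get(tok, set()) for tok in q)
        )
        return sorted(" ".join(t) for t in candidates
                      if len(t) > len(q) and self._contains(t, q))


# Made-up example vocabulary, for illustration only.
index = NaiveTermIndex(["cell", "t cell", "t cell receptor", "receptor", "b cell"])
print(index.nested_terms("t cell receptor"))
print(index.super_terms("cell"))
```

In this baseline, nested term lookup enumerates every word sub-sequence of the query, while super term lookup has to intersect posting sets and then verify contiguity. A single structure serving both lookups, as the abstract describes for TBI, would avoid maintaining two separate mechanisms.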