IR · Jul 30, 2015

Generalized Ensemble Model for Document Ranking in Information Retrieval

Yanshan Wang, In-Chan Choi, Hongfang Liu

arXiv:1507.08586v3 · 13 citations

Originality: Incremental advance

AI Analysis

This work addresses document ranking for information retrieval applications, presenting an incremental improvement through ensemble methods.

The paper tackles the problem of document ranking in information retrieval by proposing a generalized ensemble model (gEnM) that linearly combines basis retrieval models to optimize mean average precision, with experimental verification on benchmark datasets showing its effectiveness.

A generalized ensemble model (gEnM) for document ranking is proposed in this paper. The gEnM linearly combines basis document retrieval models and aims to place relevant documents at high positions in the ranking. To obtain the optimal linear combination of multiple document retrieval models, or rankers, an optimization program is formulated that directly maximizes the mean average precision. Both supervised and unsupervised learning algorithms are presented to solve this program. For the supervised scheme, two approaches are considered depending on the data setting, namely the batch and online settings. In the batch setting, we propose a revised Newton's algorithm, gEnM.BAT, which approximates the derivative and Hessian matrix. In the online setting, we advocate a stochastic gradient descent (SGD) based algorithm, gEnM.ON. For the unsupervised scheme, we present an unsupervised ensemble model (UnsEnM) that iteratively co-learns from each constituent ranker. Experiments on benchmark data sets verify the effectiveness of the proposed algorithms. Therefore, with appropriate algorithms, the gEnM is a viable option in diverse practical information retrieval applications.
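
To make the objective concrete, here is a minimal Python sketch of the two ingredients the abstract names: scoring documents with a linear combination of basis rankers, and evaluating a weight vector by mean average precision (MAP). It is an illustration under assumptions, not the authors' implementation. The function names, the toy scores, and the use of NumPy are all hypothetical choices made here, and the optimization algorithms themselves (gEnM.BAT, gEnM.ON, UnsEnM) are not reproduced.

```python
import numpy as np


def ensemble_scores(basis_scores, weights):
    """Linearly combine basis ranker scores (hypothetical helper).

    basis_scores: array of shape (n_models, n_docs)
    weights:      array of shape (n_models,)
    """
    return np.asarray(weights, dtype=float) @ np.asarray(basis_scores, dtype=float)


def average_precision(scores, relevant):
    """Average precision of the ranking induced by `scores` for one query."""
    order = np.argsort(-np.asarray(scores, dtype=float), kind="stable")
    rel = np.asarray(relevant, dtype=bool)[order]
    if not rel.any():
        return 0.0  # convention assumed here for queries with no relevant docs
    hits = np.cumsum(rel)                  # relevant docs seen up to each rank
    ranks = np.arange(1, len(rel) + 1)
    return float((hits[rel] / ranks[rel]).mean())


def mean_average_precision(queries, weights):
    """MAP over (basis_scores, relevant) pairs, one pair per query."""
    aps = [average_precision(ensemble_scores(s, weights), r) for s, r in queries]
    return float(np.mean(aps))


# Toy data (made up for illustration): two basis rankers, four documents,
# documents 0 and 1 are relevant.
toy_scores = np.array([
    [0.9, 0.1, 0.8, 0.2],   # basis ranker A
    [0.2, 0.9, 0.1, 0.3],   # basis ranker B
])
toy_relevant = np.array([True, True, False, False])
toy_queries = [(toy_scores, toy_relevant)]

for w in ([1.0, 0.0], [0.0, 1.0], [0.5, 0.5]):
    print(w, mean_average_precision(toy_queries, w))
```

On this toy query each basis ranker alone buries one relevant document, while the equal-weight blend ranks both relevant documents first. MAP is a piecewise-constant function of the weights, which is presumably why the abstract speaks of approximating the derivative and Hessian rather than computing them directly.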
