Local Term Weight Models from Power Transformations: Development of BM25IR: A Best Match Model based on Inverse Regression
This work is an incremental contribution on term weighting in information retrieval, aimed at researchers and practitioners.
The paper addresses the derivation of local term weights for information retrieval. It shows that power transformations provide a common framework in which BM25 and inverse regression produce equivalent results under some parametric conditions, and that in simulations BM25IR works fairly well across different BM25 parametric conditions and document lengths.
In this article we show how power transformations can be used as a common framework for the derivation of local term weights. We find that, under some parametric conditions, BM25 and inverse regression produce equivalent results. For a special case of inverse regression, we show that the largest increment in term weight occurs when a term is mentioned for the second time. A model based on inverse regression (BM25IR) is presented. Simulations suggest that BM25IR works fairly well for different BM25 parametric conditions and document lengths.
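As a rough illustration of how a reciprocal (power -1) transformation relates to BM25, the Python sketch below uses the widely cited BM25 term-frequency component. This is an assumption, since the abstract does not state its formulas, and it is not the BM25IR model itself. The function names and the default values (k1 = 1.2, b = 0.75, document length equal to the average) are hypothetical choices made for the example. Under these assumptions, the reciprocal of the weight is a straight line in the reciprocal of the term frequency, and among successive mentions the step from the first to the second is the largest.

```python
def bm25_local_weight(tf, k1=1.2, b=0.75, dl=100.0, avdl=100.0):
    """Commonly cited BM25 term-frequency component, without the IDF factor.

    Assumed form: w = tf * (k1 + 1) / (tf + K), K = k1 * (1 - b + b * dl / avdl).
    """
    K = k1 * (1.0 - b + b * dl / avdl)
    return tf * (k1 + 1.0) / (tf + K)


def reciprocal_line(tf, k1=1.2, b=0.75, dl=100.0, avdl=100.0):
    """Straight line in 1/tf that equals 1/w for the assumed BM25 form.

    1/w = 1/(k1 + 1) + (K / (k1 + 1)) * (1/tf)
    """
    K = k1 * (1.0 - b + b * dl / avdl)
    intercept = 1.0 / (k1 + 1.0)
    slope = K / (k1 + 1.0)
    return intercept + slope / tf


# Weights for one to five mentions of a term (illustrative range).
tfs = list(range(1, 6))
weights = [bm25_local_weight(tf) for tf in tfs]

# Gain in weight from each additional mention after the first.
increments = [later - earlier for earlier, later in zip(weights, weights[1:])]

if __name__ == "__main__":
    for tf, w in zip(tfs, weights):
        print(f"tf={tf}  w={w:.4f}  1/w={1.0 / w:.4f}  line={reciprocal_line(tf):.4f}")
    print("increments:", [round(d, 4) for d in increments])
```

Running the script prints the weight next to its reciprocal and the fitted line, so the agreement between the two columns can be checked by eye. Changing `k1`, `b`, or `dl` alters the slope and intercept of the line but not its linearity in `1/tf`.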