On event space and rank equivalence between probabilistic retrieval models

Wing Pong Robert Luk

Research output: Journal article publication; Journal article; Academic research; peer-review

13 Citations (Scopus)

Abstract

This paper discusses several issues concerning the rank equivalence, due to Lafferty and Zhai, between the log-odds ratio and the query likelihood of probabilistic retrieval models. It highlights that Robertson's concerns about this equivalence may arise when multiple probability distributions are assumed to be uniform, after assuming that the marginal probability logically follows from Kolmogorov's probability axioms. It also clarifies that there are two types of rank equivalence relation between probabilistic models: strict and weak rank equivalence. This paper focuses on strict rank equivalence, which requires the event spaces of the participating probabilistic models to be identical. Two probabilistic models can be strict rank equivalent even when they use different probability estimation methods. This paper shows that the query likelihood, p(q|d, r), is strict rank equivalent to p(q|d) of the language model of Ponte and Croft by applying assumptions 1 and 2 of Lafferty and Zhai. In addition, some statistical component language models may be strict rank equivalent to the log-odds ratio, and some statistical component models using the log-odds ratio may be strict rank equivalent to the query likelihood. Finally, we suggest adding a random variable for the user information need to the probabilistic retrieval models for clarification when these models deal with multiple requests.
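As a rough illustration of the rank equivalence discussed in the abstract, the sketch below builds a toy joint distribution over documents, queries and relevance, then checks that ranking by the log-odds ratio matches ranking by the query likelihood p(q|d, r). The abstract does not spell out assumptions 1 and 2, so the code takes them to mean, respectively, that the query is independent of the document given non-relevance and that the document is independent of relevance; this reading is an assumption here. All names and probability values are hypothetical and are not taken from the paper.

```python
import math

# Hypothetical toy model; every number below is made up for illustration.
P_REL = 0.2                                   # p(r), prior probability of relevance
P_DOC = {"d1": 0.5, "d2": 0.3, "d3": 0.2}     # p(d), deliberately non-uniform
P_Q_GIVEN_D_REL = {                           # p(q | d, r), each row sums to 1
    "d1": {"q1": 0.3, "q2": 0.7},
    "d2": {"q1": 0.6, "q2": 0.4},
    "d3": {"q1": 0.1, "q2": 0.9},
}
P_Q_GIVEN_NONREL = {"q1": 0.5, "q2": 0.5}     # p(q | not r), no dependence on d


def joint(d, q, relevant):
    """p(d, q, r) on a single event space, built under the two assumed
    independence conditions: p(d | r) = p(d) and p(q | d, not r) = p(q | not r)."""
    if relevant:
        return P_REL * P_DOC[d] * P_Q_GIVEN_D_REL[d][q]
    return (1.0 - P_REL) * P_DOC[d] * P_Q_GIVEN_NONREL[q]


def log_odds(d, q):
    """log [p(r | d, q) / p(not r | d, q)], computed from the joint."""
    return math.log(joint(d, q, True)) - math.log(joint(d, q, False))


def query_likelihood(d, q):
    """p(q | d, r), read directly from the table."""
    return P_Q_GIVEN_D_REL[d][q]


def rank_documents(score, q):
    """Documents sorted by descending score for a fixed query q."""
    return sorted(P_DOC, key=lambda d: score(d, q), reverse=True)


ranking_by_log_odds = rank_documents(log_odds, "q1")
ranking_by_likelihood = rank_documents(query_likelihood, "q1")
print(ranking_by_log_odds, ranking_by_likelihood)
```

In this toy model, for a fixed query the log-odds score differs from log p(q|d, r) only by terms that do not depend on the document, so the two orderings coincide even though p(d) is not uniform. The sketch says nothing about the paper's distinction between strict and weak rank equivalence beyond using one shared event space for both scores.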
Original language: English
Pages (from-to): 539-561
Number of pages: 23
Journal: Information Retrieval
Volume: 11
Issue number: 6
Publication status: Published - 1 Dec 2008

Keywords

  • Event space
  • Information retrieval
  • Probabilistic models

ASJC Scopus subject areas

  • Information Systems
  • Library and Information Sciences
