Binary Independence Language Model in a Relevance Feedback Environment

H. C. Wu, R. W.P. Luk, K. F. Wong, J. Y. Nie

Research output: Journal article publication, Journal article, Academic research, peer-review

Abstract

Model construction is a kind of knowledge engineering, and building retrieval models is critical to the success of search engines. This article proposes a new retrieval language model, called the binary independence language model (BILM). It combines two document-context-based language models into one through the log-odds ratio, where both models describe the document contexts of query terms. One model is based on relevance information, while the other is based on non-relevance information. Each model incorporates link dependencies and multiple query term dependencies. The probabilities are interpolated between the relative frequency and the background probabilities. In a simulated relevance feedback environment of the top 20 judged documents, our BILM performed statistically significantly better in mean average precision than the other highly effective retrieval models at the 95% confidence level across four TREC collections, using fixed parameter values. For the less stable performance measure (i.e., precision at the top 10), no statistically significant difference is shown between the models on the individual test collections, although our BILM is numerically better than two other models at the 95% confidence level based on a paired sign test across the test collections of both the relevance feedback and retrospective experiments.
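
The scoring idea described in the abstract (a log-odds ratio between a relevance model and a non-relevance model, each interpolated between relative frequency and a background probability) can be illustrated with a minimal sketch. This is not the authors' formulation: it assumes a plain unigram model without the link or query term dependencies mentioned above, and the function names, the interpolation weight `lam`, and the toy data are all hypothetical.

```python
import math
from collections import Counter


def interpolated_prob(term, counts, total, background, lam=0.5):
    """Mix the relative frequency of `term` with its background probability.

    `lam` is a hypothetical interpolation weight, not a value from the article.
    """
    rel_freq = counts[term] / total if total else 0.0
    return lam * rel_freq + (1.0 - lam) * background[term]


def log_odds_score(context, rel_contexts, nonrel_contexts, background, lam=0.5):
    """Sum, over the terms of one document context, the log of the ratio
    between the relevance-model and non-relevance-model probabilities.

    Simplifying assumption: terms are treated as independent unigrams.
    """
    rel_counts = Counter(t for ctx in rel_contexts for t in ctx)
    non_counts = Counter(t for ctx in nonrel_contexts for t in ctx)
    rel_total = sum(rel_counts.values())
    non_total = sum(non_counts.values())

    score = 0.0
    for term in context:
        p_rel = interpolated_prob(term, rel_counts, rel_total, background, lam)
        p_non = interpolated_prob(term, non_counts, non_total, background, lam)
        score += math.log(p_rel / p_non)
    return score


# Toy contexts around the query term "bank" (illustrative data only).
rel_contexts = [["bank", "river", "water"]]
nonrel_contexts = [["bank", "money", "loan"]]
vocabulary = ["bank", "river", "water", "money", "loan"]
# Uniform background model, so every term has a non-zero probability.
background = {t: 1.0 / len(vocabulary) for t in vocabulary}

river_score = log_odds_score(["river", "water"], rel_contexts, nonrel_contexts, background)
money_score = log_odds_score(["money", "loan"], rel_contexts, nonrel_contexts, background)
bank_score = log_odds_score(["bank"], rel_contexts, nonrel_contexts, background)
```

In this toy setting, a context resembling the judged-relevant contexts receives a positive score, one resembling the non-relevant contexts receives a negative score, and a term equally frequent in both contributes nothing. The background term keeps both probabilities non-zero, so the ratio is always defined.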

Original language: English
Pages (from-to): 873-895
Number of pages: 23
Journal: International Journal of Software Engineering and Knowledge Engineering
Volume: 29
Issue number: 6
Publication status: Published - 1 Jun 2019

Keywords

  • Information retrieval
  • Language model
  • Proximity matching

ASJC Scopus subject areas

  • Software
  • Computer Networks and Communications
  • Computer Graphics and Computer-Aided Design
  • Artificial Intelligence
