Reducing uncertainty of probabilistic top-k ranking via pairwise crowdsourcing

Xin Lin, Jianliang Xu, Haibo Hu, Zhe Fan

Research output: Journal article publication, Journal article, Academic research, peer-review

4 Citations (Scopus)

Abstract

Probabilistic top-k ranking is an important and well-studied query operator in uncertain databases. However, the quality of top-k results might be heavily affected by the ambiguity and uncertainty of the underlying data. Uncertainty reduction techniques have been proposed to improve the quality of top-k results by cleaning the original data. Unfortunately, most data cleaning models aim to probe the exact values of the objects individually and therefore do not work well for subjective data types, such as user ratings, which are inherently probabilistic. In this paper, we propose a novel pairwise crowdsourcing model to reduce the uncertainty of top-k ranking using a crowd of domain experts. Given a crowdsourcing task of limited budget, we propose efficient algorithms to select the best object pairs for crowdsourcing that will bring in the highest quality improvement. Extensive experiments show that our proposed solutions outperform a random selection method by up to 30 times in terms of quality improvement of probabilistic top-k ranking queries. In terms of efficiency, our proposed solutions can reduce the elapsed time of a brute-force algorithm from several days to one minute.
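To make the idea of selecting object pairs concrete, the following is a minimal brute-force sketch, not the algorithms proposed in the paper. It rests on several assumptions that the abstract does not state: each object has an independent discrete score distribution, the quality of a top-k result is measured by the Shannon entropy of the distribution over ordered top-k lists, and the crowd always answers a pairwise comparison correctly. All function names are hypothetical. The sketch enumerates every possible world, so it only illustrates the kind of exhaustive search that an efficient algorithm would need to avoid.

```python
import itertools
import math
from collections import defaultdict


def possible_worlds(objects):
    """Yield (scores, probability) for each joint assignment of independent objects.

    `objects` maps an object name to a dict {score: probability} (assumed model).
    """
    names = list(objects)
    for combo in itertools.product(*(objects[n].items() for n in names)):
        scores = {n: s for n, (s, _) in zip(names, combo)}
        prob = math.prod(p for _, p in combo)
        yield scores, prob


def topk_entropy(worlds, k):
    """Shannon entropy (in bits) of the distribution over ordered top-k lists.

    Probabilities are renormalised, so `worlds` may be a conditioned subset.
    """
    dist = defaultdict(float)
    total = 0.0
    for scores, prob in worlds:
        # Rank by descending score; break ties by name for determinism.
        ranking = tuple(sorted(scores, key=lambda n: (-scores[n], n))[:k])
        dist[ranking] += prob
        total += prob
    return -sum((p / total) * math.log2(p / total) for p in dist.values() if p > 0)


def expected_entropy_after(worlds, pair, k):
    """Expected top-k entropy after the crowd answers "is a ranked above b?".

    Assumes a noise-free answer, so each answer simply keeps the consistent worlds.
    """
    a, b = pair
    yes = [(s, p) for s, p in worlds if s[a] > s[b]]
    no = [(s, p) for s, p in worlds if s[a] <= s[b]]
    expected = 0.0
    for part in (yes, no):
        mass = sum(p for _, p in part)
        if mass > 0:
            expected += mass * topk_entropy(part, k)
    return expected


def best_pair(objects, k):
    """Return the single pair whose comparison minimises expected top-k entropy."""
    worlds = list(possible_worlds(objects))
    return min(
        itertools.combinations(objects, 2),
        key=lambda pair: expected_entropy_after(worlds, pair, k),
    )
```

Under a budget of several questions, one plausible extension of this sketch would call `best_pair` repeatedly, conditioning the worlds on each answer; whether such a greedy loop matches the selection strategy of the paper is not something the abstract confirms.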
Original language: English
Article number: 7954652
Pages (from-to): 2290-2303
Number of pages: 14
Journal: IEEE Transactions on Knowledge and Data Engineering
Volume: 29
Issue number: 10
Publication status: Published - 1 Oct 2017

Keywords

  • Crowdsourcing
  • Top-k ranking
  • Uncertain data management

ASJC Scopus subject areas

  • Information Systems
  • Computer Science Applications
  • Computational Theory and Mathematics
