Optimizing quality for probabilistic skyline computation and probabilistic similarity search (Extended Abstract)

Xiaoye Miao, Yunjun Gao, Linlin Zhou, Wei Wang, Qing Li

Research output: Chapter in book / Conference proceeding › Conference article published in proceeding or book › Academic research › peer-review

Abstract

Probabilistic queries often return noisy result sets because of data uncertainty. In this paper, we propose an efficient optimization framework, termed QueryClean, for both probabilistic skyline computation and probabilistic similarity search. Its goal is to optimize query quality by selecting a group of uncertain objects to clean under a limited resource budget, where quality is measured by an entropy-based function. We develop an efficient index that organizes the possible result sets of probabilistic queries, which helps avoid repeated probabilistic query evaluations over a large number of possible worlds during quality computation. Moreover, using two newly presented heuristics, we present exact and approximate algorithms for the optimization problem. Extensive experiments on both real and synthetic data sets demonstrate the efficiency and scalability of QueryClean.
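To make the idea of an entropy-based quality function concrete, the following is a minimal sketch and not the QueryClean implementation. It assumes that uncertain objects are independent, that each object is a small list of (value, probability) instances, that quality is the negated Shannon entropy of the distribution over possible result sets, and that cleaning an object reveals exactly one of its instances. The function names and the toy nearest-neighbor query are hypothetical. The sketch enumerates possible worlds by brute force, which is the cost the abstract says the index is meant to avoid.

```python
import math
from itertools import product


def result_set_distribution(objects, query):
    """Enumerate all possible worlds (assumed independent objects) and
    accumulate the probability of each distinct query result set."""
    names = list(objects)
    dist = {}
    for combo in product(*(objects[n] for n in names)):
        prob = 1.0
        world = {}
        for name, (value, p) in zip(names, combo):
            prob *= p
            world[name] = value
        key = frozenset(query(world))
        dist[key] = dist.get(key, 0.0) + prob
    return dist


def quality(dist):
    """Negated Shannon entropy in bits (an assumed definition):
    0 means a single certain result set; lower values mean more noise."""
    return sum(p * math.log2(p) for p in dist.values() if p > 0)


def expected_quality_after_cleaning(objects, name, query):
    """Expected quality if `name` is cleaned, i.e. collapses to one of
    its instances with that instance's probability (an assumption)."""
    total = 0.0
    for value, p in objects[name]:
        cleaned = dict(objects)
        cleaned[name] = [(value, 1.0)]
        total += p * quality(result_set_distribution(cleaned, query))
    return total


def nearest(world):
    """Toy similarity query (hypothetical): the object closest to 0."""
    return [min(world, key=lambda n: abs(world[n]))]


# Hypothetical data: object "a" has two equally likely instances.
objects = {"a": [(1.0, 0.5), (3.0, 0.5)], "b": [(2.0, 1.0)]}

before = quality(result_set_distribution(objects, nearest))
best = max(objects, key=lambda n: expected_quality_after_cleaning(objects, n, nearest))
```

With this toy data, the two possible worlds produce different result sets, and `best` names the object whose cleaning is expected to raise quality the most. A budgeted version would repeat this choice over groups of objects, which is where exact and approximate algorithms such as those mentioned in the abstract come in.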

Original language: English
Title of host publication: Proceedings - 2019 IEEE 35th International Conference on Data Engineering, ICDE 2019
Publisher: IEEE Computer Society
Pages: 2129-2130
Number of pages: 2
ISBN (Electronic): 9781538674741
Publication status: Published - Apr 2019
Event: 35th IEEE International Conference on Data Engineering, ICDE 2019 - Macau, China
Duration: 8 Apr 2019 to 11 Apr 2019

Publication series

Name: Proceedings - International Conference on Data Engineering
Volume: 2019-April
ISSN (Print): 1084-4627

Conference

Conference: 35th IEEE International Conference on Data Engineering, ICDE 2019
Country/Territory: China
City: Macau
Period: 8/04/19 to 11/04/19

Keywords

  • Optimization algorithms
  • Probabilistic similarity query
  • Probabilistic skyline query
  • Query quality

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Information Systems
