Multi-dimensional top-k dominating queries

Man Lung Yiu, Nikos Mamoulis

Research output: Journal article publication, Journal article, Academic research, peer-review

80 Citations (Scopus)

Abstract

The top-k dominating query returns the k data objects that dominate the highest number of objects in a dataset. This query is an important tool for decision support since it gives data analysts an intuitive way to find significant objects. In addition, it combines the advantages of top-k and skyline queries without sharing their disadvantages: (i) the output size can be controlled, (ii) no ranking functions need to be specified by users, and (iii) the result is independent of the scales of the different dimensions. Despite their importance, top-k dominating queries have not received adequate attention from the research community. This paper is an extensive study of the evaluation of top-k dominating queries. First, we propose a set of algorithms that apply to indexed multi-dimensional data. Second, we investigate query evaluation on data that are not indexed. Finally, we study a relaxed variant of the query which considers dominance in dimensional subspaces. Experiments using synthetic and real datasets demonstrate that our algorithms significantly outperform a previous skyline-based approach. We also illustrate the applicability of this multi-dimensional analysis query by studying the meaningfulness of its results on real data.
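To make the query definition in the abstract concrete, the following is a minimal brute-force sketch, not one of the algorithms proposed in the paper. It assumes that smaller values are preferred in every dimension and that ties in score are broken by input order; the names `dominates` and `top_k_dominating` are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a dominates b.

    Assumption: smaller is better. a dominates b when a is no worse
    than b in every dimension and strictly better in at least one.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def top_k_dominating(points: List[Point], k: int) -> List[Tuple[Point, int]]:
    """Naive O(n^2) evaluation: score each point by how many points it dominates."""
    scored = [(p, sum(dominates(p, q) for q in points)) for p in points]
    # Python's sort is stable, so equal scores keep their input order.
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    data = [(1, 1), (2, 2), (3, 1), (1, 3), (3, 3)]
    for point, score in top_k_dominating(data, 2):
        print(point, score)
```

Because the score depends only on pairwise comparisons within each dimension, rescaling one dimension by a positive constant leaves the result unchanged, which is the scale-independence property listed as (iii). The quadratic cost of this sketch is what motivates index-based evaluation on larger datasets.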
Original language: English
Pages (from-to): 695-718
Number of pages: 24
Journal: VLDB Journal
Volume: 18
Issue number: 3
DOIs
Publication status: Published - 1 Jun 2009
Externally published: Yes

Keywords

  • Preference dominance
  • Score counting
  • Top-k retrieval

ASJC Scopus subject areas

  • Information Systems
  • Hardware and Architecture
