TY - GEN
T1 - Optimizing quality for probabilistic skyline computation and probabilistic similarity search (Extended Abstract)
AU - Miao, Xiaoye
AU - Gao, Yunjun
AU - Zhou, Linlin
AU - Wang, Wei
AU - Li, Qing
PY - 2019/4
Y1 - 2019/4
AB - Probabilistic queries usually suffer from noisy result sets due to data uncertainty. In this paper, we propose an efficient optimization framework, termed QueryClean, for both probabilistic skyline computation and probabilistic similarity search. Its goal is to optimize query quality by selecting a group of uncertain objects to clean under limited available resources, using an entropy-based quality function. We develop an efficient index to organize the possible result sets of probabilistic queries, which helps avoid repeated probabilistic query evaluations over a large number of possible worlds when computing quality. Moreover, using two newly presented heuristics, we present exact and approximate algorithms for the optimization problem. Extensive experiments on both real and synthetic data sets demonstrate the efficiency and scalability of QueryClean.
KW - Optimization algorithms
KW - Probabilistic similarity query
KW - Probabilistic skyline query
KW - Query quality
UR - http://www.scopus.com/inward/record.url?scp=85067926574&partnerID=8YFLogxK
U2 - 10.1109/ICDE.2019.00259
DO - 10.1109/ICDE.2019.00259
M3 - Conference article published in proceeding or book
AN - SCOPUS:85067926574
T3 - Proceedings - International Conference on Data Engineering
SP - 2129
EP - 2130
BT - Proceedings - 2019 IEEE 35th International Conference on Data Engineering, ICDE 2019
PB - IEEE Computer Society
T2 - 35th IEEE International Conference on Data Engineering, ICDE 2019
Y2 - 8 April 2019 through 11 April 2019
ER -