TY - JOUR
T1 - PriPL-Tree: Accurate Range Query for Arbitrary Distribution under Local Differential Privacy
AU - Wang, Leixia
AU - Ye, Qingqing
AU - Hu, Haibo
AU - Meng, Xiaofeng
N1 - Publisher Copyright: © 2024, VLDB Endowment. All rights reserved.
PY - 2024/8
Y1 - 2024/8
N2 - Answering range queries in the context of Local Differential Privacy (LDP) is a widely studied problem in Online Analytical Processing (OLAP). Existing LDP solutions all assume a uniform data distribution within each domain partition, which may not align with real-world scenarios where data distribution is varied, resulting in inaccurate estimates. To address this problem, we introduce PriPL-Tree, a novel data structure that combines hierarchical tree structures with piecewise linear (PL) functions to answer range queries for arbitrary distributions. PriPL-Tree precisely models the underlying data distribution with a few line segments, leading to more accurate results for range queries. Furthermore, we extend it to multi-dimensional cases with novel data-aware adaptive grids. These grids leverage the insights from marginal distributions obtained through PriPL-Trees to partition the grids adaptively, adapting the density of underlying distributions. Our extensive experiments on both real and synthetic datasets demonstrate the effectiveness and superiority of PriPL-Tree over state-of-the-art solutions in answering range queries across arbitrary data distributions.
AB - Answering range queries in the context of Local Differential Privacy (LDP) is a widely studied problem in Online Analytical Processing (OLAP). Existing LDP solutions all assume a uniform data distribution within each domain partition, which may not align with real-world scenarios where data distribution is varied, resulting in inaccurate estimates. To address this problem, we introduce PriPL-Tree, a novel data structure that combines hierarchical tree structures with piecewise linear (PL) functions to answer range queries for arbitrary distributions. PriPL-Tree precisely models the underlying data distribution with a few line segments, leading to more accurate results for range queries. Furthermore, we extend it to multi-dimensional cases with novel data-aware adaptive grids. These grids leverage the insights from marginal distributions obtained through PriPL-Trees to partition the grids adaptively, adapting the density of underlying distributions. Our extensive experiments on both real and synthetic datasets demonstrate the effectiveness and superiority of PriPL-Tree over state-of-the-art solutions in answering range queries across arbitrary data distributions.
UR - https://www.scopus.com/pages/publications/85205394550
U2 - 10.14778/3681954.3681981
DO - 10.14778/3681954.3681981
M3 - Conference article
AN - SCOPUS:85205394550
SN - 2150-8097
VL - 17
SP - 3031
EP - 3044
JO - Proceedings of the VLDB Endowment
JF - Proceedings of the VLDB Endowment
IS - 11
ER -