Abstract
Privacy-preserving clustering has received increasing research attention in recent years. Local differential privacy (LDP) is a privacy model without relying on trusted third parties. It plays a crucial role in distributed privacy-preserving clustering. Most previous methods focus on partition-based clustering (e.g., k-means), which necessitates many iterative interactions. Massive iterations cause a division of the privacy budget and increase the amount of individual noise, affecting the clustering utility. To address this issue, we turn to grid-based clustering and design the GC-LDP algorithm to balance the privacy and clustering utility with only three rounds of interactions. In GC-LDP, our core contribution is to develop a non-uniform grid division method via the coefficient of variation (CV), which can generate a grid structure approximating the global data distribution within two rounds of interactions. Besides, we designed a new perturbation mechanism to reduce the amount of individual noise injection. Further, a cell aggregation method is developed by exploring the relative density difference among grids to achieve multi-density clustering. Theoretical analysis and experiments on real-world datasets show that GC-LDP can obtain high-quality clustering results while satisfying local differential privacy.
| Original language | English |
|---|---|
| Pages (from-to) | 402-420 |
| Number of pages | 19 |
| Journal | Information Sciences |
| Volume | 623 |
| DOIs | |
| Publication status | Published - Apr 2023 |
Keywords
- Grid-based clustering
- Iteration rounds
- Local differential privacy
ASJC Scopus subject areas
- Software
- Control and Systems Engineering
- Theoretical Computer Science
- Computer Science Applications
- Information Systems and Management
- Artificial Intelligence