Multidimensional grid-based clustering with local differential privacy

Nan Fu, Weiwei Ni, Haibo Hu, Sen Zhang

Research output: Journal article publicationJournal articleAcademic researchpeer-review

13 Citations (Scopus)

Abstract

Privacy-preserving clustering has received increasing research attention in recent years. Local differential privacy (LDP) is a privacy model without relying on trusted third parties. It plays a crucial role in distributed privacy-preserving clustering. Most previous methods focus on partition-based clustering (e.g., k-means), which necessitates many iterative interactions. Massive iterations cause a division of the privacy budget and increase the amount of individual noise, affecting the clustering utility. To address this issue, we turn to grid-based clustering and design the GC-LDP algorithm to balance the privacy and clustering utility with only three rounds of interactions. In GC-LDP, our core contribution is to develop a non-uniform grid division method via the coefficient of variation (CV), which can generate a grid structure approximating the global data distribution within two rounds of interactions. Besides, we designed a new perturbation mechanism to reduce the amount of individual noise injection. Further, a cell aggregation method is developed by exploring the relative density difference among grids to achieve multi-density clustering. Theoretical analysis and experiments on real-world datasets show that GC-LDP can obtain high-quality clustering results while satisfying local differential privacy.

Original languageEnglish
Pages (from-to)402-420
Number of pages19
JournalInformation Sciences
Volume623
DOIs
Publication statusPublished - Apr 2023

Keywords

  • Grid-based clustering
  • Iteration rounds
  • Local differential privacy

ASJC Scopus subject areas

  • Software
  • Control and Systems Engineering
  • Theoretical Computer Science
  • Computer Science Applications
  • Information Systems and Management
  • Artificial Intelligence

Fingerprint

Dive into the research topics of 'Multidimensional grid-based clustering with local differential privacy'. Together they form a unique fingerprint.

Cite this