Preserving User Privacy For Machine Learning: Local Differential Privacy or Federated Machine Learning

Huadi Zheng, Haibo Hu, Ziyang Han

Research output: Journal article publication › Journal article › Academic research › peer-review

1 Citation (Scopus)

Abstract

The growing number of mobile and IoT devices has nourished many intelligent applications. To produce high-quality machine learning models, these applications constantly access and collect rich personal data such as photos, browsing history and text messages. However, direct access to personal data has raised increasing public concerns about privacy risks and security breaches. To address these concerns, two solutions to privacy-preserving machine learning have emerged, namely local differential privacy and federated machine learning. The former is a distributed data collection strategy in which each client perturbs its data locally before submitting it to the server, whereas the latter is a distributed machine learning strategy that trains models locally on mobile devices and merges their outputs (e.g., parameter updates of a model) through a control protocol. In this paper, we conduct a comparative study on the efficiency and privacy of both solutions. Our results show that in a standard population and domain setting, both can achieve an optimal misclassification rate lower than 20%, and federated machine learning generally performs better at the cost of higher client CPU usage. Nonetheless, local differential privacy benefits more from a larger client population (> 1k). As for privacy guarantees, local differential privacy also offers more flexible control over data leakage.
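To make the contrast in the abstract concrete, the two strategies can be sketched in a few lines of Python. This is an illustrative sketch only, not the paper's implementation: the LDP side uses classic randomized response on a single bit (one standard local perturbation mechanism), and the FL side shows a FedAvg-style weighted merge of client parameter vectors; all function names and parameters here are illustrative assumptions.

```python
import math
import random

def randomized_response(bit, epsilon):
    """LDP sketch: each client perturbs its own bit before upload.

    Reports the true bit with probability e^eps / (e^eps + 1),
    otherwise the flipped bit; this satisfies epsilon-LDP.
    """
    p = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    return bit if random.random() < p else 1 - bit

def estimate_frequency(reports, epsilon):
    """Server side: unbiased estimate of the true fraction of 1s
    from the perturbed reports (debiasing the flip probability)."""
    p = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    observed = sum(reports) / len(reports)
    return (observed - (1.0 - p)) / (2.0 * p - 1.0)

def federated_average(client_updates, client_sizes):
    """FL sketch: the server never sees raw data, only each client's
    locally trained parameter vector; it merges them by a weighted
    average proportional to each client's local dataset size."""
    total = sum(client_sizes)
    dim = len(client_updates[0])
    return [
        sum(u[i] * n for u, n in zip(client_updates, client_sizes)) / total
        for i in range(dim)
    ]
```

The key difference the paper compares is visible here: under LDP the server aggregates noisy *data* and pays an accuracy cost that shrinks as the client population grows, whereas under FL the clients pay in local CPU to train a model and only exchange *parameters*.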

Original language: English
Journal: IEEE Intelligent Systems
DOIs
Publication status: Accepted/In press - Aug 2020

Keywords

  • Data models
  • Distributed databases
  • Federated Machine Learning
  • Local Differential Privacy
  • Machine learning
  • Privacy
  • Servers

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Artificial Intelligence
