TY - GEN
T1 - Federated Learning with GAN-Based Data Synthesis for Non-IID Clients
AU - Li, Zijian
AU - Shao, Jiawei
AU - Mao, Yuyi
AU - Wang, Jessie Hui
AU - Zhang, Jun
N1 - Publisher Copyright: © 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.
PY - 2023
Y1 - 2023
N2 - Federated learning (FL) has recently emerged as a popular privacy-preserving collaborative learning paradigm. However, it suffers from the non-independent and identically distributed (non-IID) data among clients. In this chapter, we propose a novel framework, named Synthetic Data Aided Federated Learning (SDA-FL), to resolve this non-IID challenge by sharing synthetic data. Specifically, each client pretrains a local generative adversarial network (GAN) to generate differentially private synthetic data, which are uploaded to the parameter server (PS) to construct a global shared synthetic dataset. To generate confident pseudo labels for the synthetic dataset, we also propose an iterative pseudo labeling mechanism performed by the PS. The assistance of the synthetic dataset with confident pseudo labels significantly alleviates the data heterogeneity among clients, which improves the consistency among local updates and benefits the global aggregation. Extensive experiments evidence that the proposed framework outperforms the baseline methods by a large margin in several benchmark datasets under both the supervised and semi-supervised settings.
KW - Federated Learning
KW - Generative Adversarial Network (GAN)
KW - Non-Independent and Identically Distributed (non-IID) Problem
UR - http://www.scopus.com/inward/record.url?scp=85152552368&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-28996-5_2
DO - 10.1007/978-3-031-28996-5_2
M3 - Conference article published in proceeding or book
AN - SCOPUS:85152552368
SN - 9783031289958
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 17
EP - 32
BT - Trustworthy Federated Learning - First International Workshop, FL 2022, Held in Conjunction with IJCAI 2022, Revised Selected Papers
A2 - Goebel, Randy
A2 - Yu, Han
A2 - Faltings, Boi
A2 - Fan, Lixin
A2 - Xiong, Zehui
PB - Springer Science and Business Media Deutschland GmbH
T2 - 1st International Workshop on Trustworthy Federated Learning in Conjunction with International Joint Conference on AI, FL-IJCAI 2022
Y2 - 23 July 2022 through 23 July 2022
ER -