TY - JOUR
T1 - Protecting Decision Boundary of Machine Learning Model With Differentially Private Perturbation
AU - Zheng, Huadi
AU - Ye, Qingqing
AU - Hu, Haibo
AU - Fang, Chengfang
AU - Shi, Jie
N1 - This work was supported by the National Natural Science Foundation of China (Grant No. U1636205 and 61572413), the Research Grants Council, Hong Kong SAR, China (Grant No. 15238116, 15222118, 15218919, and C1008-16G), and a research project from Huawei.
Publisher Copyright: © 2004-2012 IEEE.
PY - 2022/5
Y1 - 2022/5
N2 - Machine learning service APIs allow model owners to monetize proprietary models by offering prediction services to third-party users. However, existing literature shows that model parameters are vulnerable to extraction attacks, which accumulate prediction queries and their responses to train a replica model. As a countermeasure, researchers have proposed reducing the richness of the API output, for example by hiding precise confidence values. Nonetheless, even when the response is only one bit, an adversary can still exploit fine-tuned queries with a differential property to infer the decision boundary of the underlying model. In this article, we propose boundary differential privacy (BDP) against such attacks by obfuscating the prediction responses with noise. BDP guarantees that an adversary cannot learn the decision boundary between any two classes to a predefined precision, no matter how many queries are issued to the prediction API. We first design a perturbation algorithm called boundary randomized response for a binary model and prove that it satisfies ϵ-BDP, and then generalize this algorithm to a multiclass model. Finally, we generalize the hard boundary to a soft boundary and design an adaptive perturbation algorithm that still works in the latter case. The effectiveness and high utility of our solution are verified by extensive experiments on both linear and non-linear models.
AB - Machine learning service APIs allow model owners to monetize proprietary models by offering prediction services to third-party users. However, existing literature shows that model parameters are vulnerable to extraction attacks, which accumulate prediction queries and their responses to train a replica model. As a countermeasure, researchers have proposed reducing the richness of the API output, for example by hiding precise confidence values. Nonetheless, even when the response is only one bit, an adversary can still exploit fine-tuned queries with a differential property to infer the decision boundary of the underlying model. In this article, we propose boundary differential privacy (BDP) against such attacks by obfuscating the prediction responses with noise. BDP guarantees that an adversary cannot learn the decision boundary between any two classes to a predefined precision, no matter how many queries are issued to the prediction API. We first design a perturbation algorithm called boundary randomized response for a binary model and prove that it satisfies ϵ-BDP, and then generalize this algorithm to a multiclass model. Finally, we generalize the hard boundary to a soft boundary and design an adaptive perturbation algorithm that still works in the latter case. The effectiveness and high utility of our solution are verified by extensive experiments on both linear and non-linear models.
KW - Adversarial machine learning
KW - Boundary differential privacy
KW - Model defense
KW - Model extraction
UR - http://www.scopus.com/inward/record.url?scp=85097959198&partnerID=8YFLogxK
U2 - 10.1109/TDSC.2020.3043382
DO - 10.1109/TDSC.2020.3043382
M3 - Journal article
AN - SCOPUS:85097959198
SN - 1545-5971
VL - 19
SP - 2007
EP - 2022
JO - IEEE Transactions on Dependable and Secure Computing
JF - IEEE Transactions on Dependable and Secure Computing
IS - 3
ER -