TY - GEN
T1 - TrojanZoo: Towards Unified, Holistic, and Practical Evaluation of Neural Backdoors
AU - Pang, Ren
AU - Zhang, Zheng
AU - Gao, Xiangshan
AU - Xi, Zhaohan
AU - Ji, Shouling
AU - Cheng, Peng
AU - Luo, Xiapu
AU - Wang, Ting
N1 - Funding Information:
We thank the anonymous reviewers and our shepherd for their valuable feedback. This work is supported by the National Science Foundation under Grant Nos. 1951729, 1953813, and 1953893. Any opinions, findings, and conclusions or recommendations are those of the authors and do not necessarily reflect the views of the National Science Foundation. X. Luo is partly supported by Hong Kong RGC Project (No. PolyU15222320).
Publisher Copyright:
© 2022 IEEE.
PY - 2022/6/23
Y1 - 2022/6/23
AB - Neural backdoors represent one primary threat to the security of deep learning systems. Intensive research has produced a plethora of backdoor attacks/defenses, resulting in a constant arms race. However, due to the lack of evaluation benchmarks, many critical questions remain under-explored: (i) what are the strengths and limitations of different attacks/defenses? (ii) what are the best practices to operate them? and (iii) how can the existing attacks/defenses be further improved? To bridge this gap, we design and implement TROJANZOO, the first open-source platform for evaluating neural backdoor attacks/defenses in a unified, holistic, and practical manner. Thus far, focusing on the computer vision domain, it incorporates 8 representative attacks, 14 state-of-the-art defenses, 6 attack performance metrics, and 10 defense utility metrics, as well as rich tools for in-depth analysis of the attack-defense interactions. Leveraging TROJANZOO, we conduct a systematic study of the existing attacks/defenses, unveiling their complex design spectrum: both manifest intricate trade-offs among multiple desiderata (e.g., the effectiveness, evasiveness, and transferability of attacks). We further explore improving the existing attacks/defenses, leading to a number of interesting findings: (i) one-pixel triggers often suffice; (ii) training from scratch often outperforms perturbing benign models to craft trojan models; (iii) optimizing triggers and trojan models jointly greatly improves both attack effectiveness and evasiveness; (iv) individual defenses can often be evaded by adaptive attacks; and (v) exploiting model interpretability significantly improves defense robustness. We envision that TROJANZOO will serve as a valuable platform to facilitate future research on neural backdoors.
KW - backdoor attack
KW - backdoor defense
KW - benchmark platform
KW - deep learning security
UR - http://www.scopus.com/inward/record.url?scp=85134050326&partnerID=8YFLogxK
U2 - 10.1109/EuroSP53844.2022.00048
DO - 10.1109/EuroSP53844.2022.00048
M3 - Conference article published in proceeding or book
T3 - Proceedings - 7th IEEE European Symposium on Security and Privacy, EuroS&P 2022
SP - 684
EP - 702
BT - Proceedings - 7th IEEE European Symposium on Security and Privacy, EuroS&P 2022
PB - IEEE
ER -