Understanding and Tackling Label Errors in Deep Learning-Based Vulnerability Detection (Experience Paper)

Xu Nie, Ningke Li, Kailong Wang, Shangguang Wang, Xiapu Luo, Haoyu Wang

Research output: Chapter in book / Conference proceeding; Conference article published in proceeding or book; Academic research; peer-review

10 Citations (Scopus)

Abstract

The complexity of software systems and the diversity of security vulnerabilities are plausible sources of the persistent challenges in software vulnerability research. Applying deep learning methods to automatic vulnerability detection has proven to be an effective complement to traditional detection approaches. Unfortunately, the lack of high-quality benchmark datasets can critically restrict the effectiveness of deep learning-based vulnerability detection techniques. Specifically, the long-standing presence of erroneous labels in existing vulnerability datasets may lead to inaccurate, biased, and even flawed results.

In this paper, we aim to obtain an in-depth understanding and explanation of the causes of label errors. To this end, we systematically analyze the diverse datasets used by state-of-the-art learning-based vulnerability detection approaches and examine their techniques for collecting vulnerable source code datasets. We find that label errors heavily impact mainstream vulnerability detection models, with a worst-case average F1 drop of 20.7%. As mitigation, we introduce two approaches to dataset denoising, which enhance model performance by an average of 10.4%. Leveraging these dataset denoising methods, we provide a feasible solution for obtaining high-quality labeled datasets.
Original language: English
Title of host publication: Proceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis
Pages: 52–63
Publication status: Published - Jul 2023
