TY - GEN
T1 - Automated Third-Party Library Detection for Android Applications: Are We There Yet?
AU - Zhan, Xian
AU - Fan, Lingling
AU - Liu, Tianming
AU - Chen, Sen
AU - Li, Li
AU - Wang, Haoyu
AU - Xu, Yifei
AU - Luo, Xiapu
AU - Liu, Yang
N1 - Funding Information:
We thank the anonymous reviewers for their helpful comments. This work is partly supported by the National Natural Science Foundation of China (No. 61702045), the Australian Research Council (ARC) under projects DE200100016 and DP200100020, the Hong Kong RGC Projects (No. 152223/17E, 152239/18E), the Hong Kong PhD Fellowship Scheme and the Singapore National Research Foundation under NCR Award Number NRF2018NCR-NSOE004-0001.
Publisher Copyright:
© 2020 ACM.
PY - 2020/9
Y1 - 2020/9
N2 - Third-party libraries (TPLs) have become a significant part of the Android ecosystem. Developers can employ various TPLs with different functionalities to facilitate their app development. Unfortunately, the popularity of TPLs also brings new challenges and even threats. TPLs may carry malicious or vulnerable code, which can infect popular apps to pose threats to mobile users. Besides, the code of third-party libraries could constitute noises in some downstream tasks (e.g., malware and repackaged app detection). Thus, researchers have developed various tools to identify TPLs. However, no existing work has studied these TPL detection tools in detail; different tools focus on different applications with performance differences, but little is known about them. To better understand existing TPL detection tools and dissect TPL detection techniques, we conduct a comprehensive empirical study to fill the gap by evaluating and comparing all publicly available TPL detection tools based on four criteria: effectiveness, efficiency, code obfuscation-resilience capability, and ease of use. We reveal their advantages and disadvantages based on a systematic and thorough empirical study. Furthermore, we also conduct a user study to evaluate the usability of each tool. The results showthat LibScout outperforms others regarding effectiveness, LibRadar takes less time than others and is also regarded as the most easy-to-use one, and LibPecker performs the best in defending against code obfuscation techniques. We further summarize the lessons learned from different perspectives, including users, tool implementation, and researchers. Besides, we enhance these open-sourced tools by fixing their limitations to improve their detection ability. We also build an extensible framework that integrates all existing available TPL detection tools, providing online service for the research community. We make publicly available the evaluation dataset and enhanced tools. We believe our work provides a clear picture of existing TPL detection techniques and also give a road-map for future directions.
AB - Third-party libraries (TPLs) have become a significant part of the Android ecosystem. Developers can employ various TPLs with different functionalities to facilitate their app development. Unfortunately, the popularity of TPLs also brings new challenges and even threats. TPLs may carry malicious or vulnerable code, which can infect popular apps to pose threats to mobile users. Besides, the code of third-party libraries could constitute noises in some downstream tasks (e.g., malware and repackaged app detection). Thus, researchers have developed various tools to identify TPLs. However, no existing work has studied these TPL detection tools in detail; different tools focus on different applications with performance differences, but little is known about them. To better understand existing TPL detection tools and dissect TPL detection techniques, we conduct a comprehensive empirical study to fill the gap by evaluating and comparing all publicly available TPL detection tools based on four criteria: effectiveness, efficiency, code obfuscation-resilience capability, and ease of use. We reveal their advantages and disadvantages based on a systematic and thorough empirical study. Furthermore, we also conduct a user study to evaluate the usability of each tool. The results showthat LibScout outperforms others regarding effectiveness, LibRadar takes less time than others and is also regarded as the most easy-to-use one, and LibPecker performs the best in defending against code obfuscation techniques. We further summarize the lessons learned from different perspectives, including users, tool implementation, and researchers. Besides, we enhance these open-sourced tools by fixing their limitations to improve their detection ability. We also build an extensible framework that integrates all existing available TPL detection tools, providing online service for the research community. We make publicly available the evaluation dataset and enhanced tools. We believe our work provides a clear picture of existing TPL detection techniques and also give a road-map for future directions.
KW - Android
KW - Empirical study
KW - Library detection
KW - Third-party library
UR - http://www.scopus.com/inward/record.url?scp=85099211873&partnerID=8YFLogxK
U2 - 10.1145/3324884.3416582
DO - 10.1145/3324884.3416582
M3 - Conference article published in proceeding or book
AN - SCOPUS:85099211873
T3 - Proceedings - 2020 35th IEEE/ACM International Conference on Automated Software Engineering, ASE 2020
SP - 919
EP - 930
BT - Proceedings - 2020 35th IEEE/ACM International Conference on Automated Software Engineering, ASE 2020
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 35th IEEE/ACM International Conference on Automated Software Engineering, ASE 2020
Y2 - 22 September 2020 through 25 September 2020
ER -