TY - GEN
T1 - Examiner: Automatically Locating Inconsistent Instructions between Real Devices and CPU Emulators for ARM
AU - Jiang, Muhui
AU - Xu, Tianyi
AU - Zhou, Yajin
AU - Hu, Yufeng
AU - Zhong, Ming
AU - Wu, Lei
AU - Luo, Xiapu
AU - Ren, Kui
N1 - Funding Information:
We would like to thank the anonymous reviewers for their comments that greatly helped improve the presentation of this paper. We also want to thank Dr. Manuel Rigger for shepherding our paper. This work was partially supported by the National Natural Science Foundation of China (NSFC) under Grant 61872438, the Leading Innovative and Entrepreneur Team Introduction Program of Zhejiang (2018R01005), the Fundamental Research Funds for the Central Universities (Zhejiang University NGICS Platform), and HK RGC Project (No. PolyU 152239/18E).
Publisher Copyright:
© 2022 ACM.
PY - 2022/2/22
Y1 - 2022/2/22
N2 - Emulators are widely used to build dynamic analysis frameworks due to their fine-grained tracing capability, full system monitoring functionality, and scalability of running on different operating systems and architectures. However, whether emulators are consistent with real devices is unknown. To understand this problem, we aim to automatically locate inconsistent instructions, which behave differently between emulators and real devices. We target the ARM architecture, which provides machine-readable specifications. Based on the specification, we propose a sufficient test case generator by designing and implementing the first symbolic execution engine for the ARM architecture specification language (ASL). We generate 2,774,649 representative instruction streams and conduct differential testing between four real ARM devices of different architecture versions (i.e., ARMv5, ARMv6, ARMv7, and ARMv8) and three state-of-the-art emulators (i.e., QEMU, Unicorn, and Angr). We locate a large number of inconsistent instruction streams (171,858 for QEMU, 223,264 for Unicorn, and 120,169 for Angr). We find that undefined implementation in the ARM manual and bugs in emulators are the major causes of inconsistencies. Furthermore, we discover 12 bugs, which influence commonly used instructions (e.g., BLX). With the inconsistent instructions, we build three security applications and demonstrate the capability of these instructions for detecting emulators, anti-emulation, and anti-fuzzing.
AB - Emulators are widely used to build dynamic analysis frameworks due to their fine-grained tracing capability, full system monitoring functionality, and scalability of running on different operating systems and architectures. However, whether emulators are consistent with real devices is unknown. To understand this problem, we aim to automatically locate inconsistent instructions, which behave differently between emulators and real devices. We target the ARM architecture, which provides machine-readable specifications. Based on the specification, we propose a sufficient test case generator by designing and implementing the first symbolic execution engine for the ARM architecture specification language (ASL). We generate 2,774,649 representative instruction streams and conduct differential testing between four real ARM devices of different architecture versions (i.e., ARMv5, ARMv6, ARMv7, and ARMv8) and three state-of-the-art emulators (i.e., QEMU, Unicorn, and Angr). We locate a large number of inconsistent instruction streams (171,858 for QEMU, 223,264 for Unicorn, and 120,169 for Angr). We find that undefined implementation in the ARM manual and bugs in emulators are the major causes of inconsistencies. Furthermore, we discover 12 bugs, which influence commonly used instructions (e.g., BLX). With the inconsistent instructions, we build three security applications and demonstrate the capability of these instructions for detecting emulators, anti-emulation, and anti-fuzzing.
KW - Differential Testing
KW - Emulator
KW - Inconsistent Instructions
UR - http://www.scopus.com/inward/record.url?scp=85126393364&partnerID=8YFLogxK
U2 - 10.1145/3503222.3507736
DO - 10.1145/3503222.3507736
M3 - Conference article published in proceeding or book
AN - SCOPUS:85126393364
T3 - International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS
SP - 846
EP - 858
BT - ASPLOS 2022 - Proceedings of the 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems
A2 - Falsafi, Babak
A2 - Ferdman, Michael
A2 - Lu, Shan
A2 - Wenisch, Thomas F.
PB - Association for Computing Machinery
T2 - 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS 2022
Y2 - 28 February 2022 through 4 March 2022
ER -