TY - GEN
T1 - Efficiently Answering k-hop Reachability Queries in Large Dynamic Graphs for Fraud Feature Extraction
AU - Xu, Zequan
AU - Luo, Siqiang
AU - Shi, Jieming
AU - Li, Hui
AU - Lin, Chen
AU - Sun, Qihang
AU - Hu, Shaofeng
N1 - Funding Information:
This work was supported by the National Natural Science Foundation of China (No. 62002303), the Natural Science Foundation of Fujian Province of China (No. 2020J05001), the China Fundamental Research Funds for the Central Universities (No. 20720210098), and 2021 Tencent Wechat Rhino-Bird Focused Research Program.
Publisher Copyright:
© 2022 IEEE.
PY - 2022/8
Y1 - 2022/8
N2 - Instant messaging client (IMC) is now an essential tool for mobile users. In the representative IMC We Chat, cybercriminals deceive frauds, causing financial loss to normal users. Through statistical analysis, we find that certain fraud interactions commonly occur among WeChat users who are not k-hop neighbors. Therefore, efficiently answering whether the distance between two vertices is not longer than k at a certain time point (i.e., k-hop reachability queries) over the dynamic social graph of WeChat becomes a crucial task for fraud feature extraction in the detection system: it can help human experts quickly identify suspicious user interactions and the query results can be further used as the input feature to the downstream machine learning based detection methods. In this paper, we illustrate Bidirectional k-hop Reachability Query Processing over a Dynamic Graph (BREAD) that is used in WeChat for extracting the k-hop reachability feature for fraud detection. BREAD adopts the idea of estimating Personalized PageRank value. It first conducts the backward search from the destination vertex to construct an intermediate vertex set. Then, it performs a certain amount of random walks from the start vertex to see whether they can hit the intermediate vertex set, and the results are returned to answer k-hop reachability queries. We further propose text{BREAD}++ that leverages the massive parallel processing power of GPU to achieve a considerable performance gain. Experiments on several large-scale dynamic graph benchmarks and the social graph of WeChat have demonstrated that text{BREAD}/text{BREAD}++ is superior than existing index-free competitors: our methods provide not only fast but also accurate responses and they are of practical value to k-hop reachability feature extraction in the fraud detection system of WeChat. Our implementation is available at https://github.com/XMUDM/BREAD.
AB - Instant messaging client (IMC) is now an essential tool for mobile users. In the representative IMC We Chat, cybercriminals deceive frauds, causing financial loss to normal users. Through statistical analysis, we find that certain fraud interactions commonly occur among WeChat users who are not k-hop neighbors. Therefore, efficiently answering whether the distance between two vertices is not longer than k at a certain time point (i.e., k-hop reachability queries) over the dynamic social graph of WeChat becomes a crucial task for fraud feature extraction in the detection system: it can help human experts quickly identify suspicious user interactions and the query results can be further used as the input feature to the downstream machine learning based detection methods. In this paper, we illustrate Bidirectional k-hop Reachability Query Processing over a Dynamic Graph (BREAD) that is used in WeChat for extracting the k-hop reachability feature for fraud detection. BREAD adopts the idea of estimating Personalized PageRank value. It first conducts the backward search from the destination vertex to construct an intermediate vertex set. Then, it performs a certain amount of random walks from the start vertex to see whether they can hit the intermediate vertex set, and the results are returned to answer k-hop reachability queries. We further propose text{BREAD}++ that leverages the massive parallel processing power of GPU to achieve a considerable performance gain. Experiments on several large-scale dynamic graph benchmarks and the social graph of WeChat have demonstrated that text{BREAD}/text{BREAD}++ is superior than existing index-free competitors: our methods provide not only fast but also accurate responses and they are of practical value to k-hop reachability feature extraction in the fraud detection system of WeChat. Our implementation is available at https://github.com/XMUDM/BREAD.
KW - fraud detection
KW - k-hop reachability
KW - personalized pagerank
UR - http://www.scopus.com/inward/record.url?scp=85137560279&partnerID=8YFLogxK
U2 - 10.1109/MDM55031.2022.00053
DO - 10.1109/MDM55031.2022.00053
M3 - Conference article published in proceeding or book
AN - SCOPUS:85137560279
T3 - Proceedings - IEEE International Conference on Mobile Data Management
SP - 238
EP - 245
BT - Proceedings - 2022 23rd IEEE International Conference on Mobile Data Management, MDM 2022
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 23rd IEEE International Conference on Mobile Data Management, MDM 2022
Y2 - 6 June 2022 through 9 June 2022
ER -