TY - JOUR
T1 - A Feature Engineering and Ensemble Learning Based Approach for Repeated Buyers Prediction
AU - Zhang, M.
AU - Lu, J.
AU - Ma, N.
AU - Cheng, T. C. E.
AU - Hua, G.
N1 - Funding Information:
Supported by the Natural Science Foundation of China (71901027); Beijing Forestry University 2021 Course Ideological and Political Teaching, Research and Teaching Reform Project, "Management Model and Basic Decision-making" project (2021KCSZXY010).
Publisher Copyright:
© 2021 by the authors. Licensee Agora University, Oradea, Romania. This is an open access article distributed under the terms and conditions of the Creative Commons Attribution-NonCommercial 4.0 International License.
PY - 2022
Y1 - 2022
N2 - The global e-commerce market is growing at a rapid pace, but the percentage of repeat buyers is low. According to Tmall, the repurchase rate is only 6.1%, while research shows that a 5% increase in the repurchase rate can lead to a 25% to 95% increase in profit. To increase the repurchase rate, merchants need to identify potential repeat buyers and convert them into repurchasers. In this paper, we build a prediction model for repeat purchasers using Tmall’s dataset. First, we engineer high-quality features for e-commerce scenarios through manual construction and algorithmic selection. We apply the synthetic minority oversampling technique (SMOTE) to address the data imbalance problem and improve prediction performance. Then we train classical classifiers, including factorization machines and logistic regression, and ensemble learning classifiers, including extreme gradient boosting and light gradient boosting machine. Finally, we construct a two-layer fusion model based on the Stacking algorithm to further enhance prediction performance. The results show that, through a series of innovations such as data imbalance processing, feature engineering, and fusion models, the model's area under the curve (AUC) value improves by 0.01161. Our findings provide important implications for managing e-commerce platforms and their merchants.
AB - The global e-commerce market is growing at a rapid pace, but the percentage of repeat buyers is low. According to Tmall, the repurchase rate is only 6.1%, while research shows that a 5% increase in the repurchase rate can lead to a 25% to 95% increase in profit. To increase the repurchase rate, merchants need to identify potential repeat buyers and convert them into repurchasers. In this paper, we build a prediction model for repeat purchasers using Tmall’s dataset. First, we engineer high-quality features for e-commerce scenarios through manual construction and algorithmic selection. We apply the synthetic minority oversampling technique (SMOTE) to address the data imbalance problem and improve prediction performance. Then we train classical classifiers, including factorization machines and logistic regression, and ensemble learning classifiers, including extreme gradient boosting and light gradient boosting machine. Finally, we construct a two-layer fusion model based on the Stacking algorithm to further enhance prediction performance. The results show that, through a series of innovations such as data imbalance processing, feature engineering, and fusion models, the model's area under the curve (AUC) value improves by 0.01161. Our findings provide important implications for managing e-commerce platforms and their merchants.
KW - Ensemble learning
KW - Feature engineering
KW - Fusion model
KW - Repeat buyer prediction
UR - http://www.scopus.com/inward/record.url?scp=85144952380&partnerID=8YFLogxK
U2 - 10.15837/ijccc.2022.6.4988
DO - 10.15837/ijccc.2022.6.4988
M3 - Journal article
AN - SCOPUS:85144952380
SN - 1841-9836
VL - 17
JO - International Journal of Computers, Communications and Control
JF - International Journal of Computers, Communications and Control
IS - 6
M1 - 4988
ER -