TY - GEN
T1 - A user segmentation approach for UGC platform based on a new lead user identification index system and K-means clustering
AU - Chang, D.
AU - Zhao, J.
AU - Zou, F.
AU - Xu, G.
N1 - Funding Information:
This study was supported by the National Natural Science Foundation of China (Grant No. 71802132).
Publisher Copyright:
© 2020 IEEE.
PY - 2020/12/14
Y1 - 2020/12/14
AB - User-generated content (UGC) has become an important part of Internet user data. This study aims to develop an innovative user identification approach for UGC platforms. To achieve this objective, the research proposes i) a web mining process to crawl UGC data; ii) a lead user identification index system for evaluating the innovation capability of users; and iii) a user classification process based on K-means clustering according to users' UGC performance. In particular, the complete user performance data of more than 100 users on Douban (one of the biggest UGC platforms in China) were collected, and web mining, factor analysis, and a clustering algorithm were integrated to process the data and classify user groups according to their UGC performance. The classification results were verified by incorporating expertise, which showed that the classification can accurately recognize users with proper lead userness. This research is expected to help small and medium enterprises without strong big data capabilities identify innovative users and valuable UGC data more efficiently, and to facilitate further product improvement.
KW - Factor analysis
KW - Innovative users
KW - K-means clustering
KW - Lead user identification
KW - UGC
UR - http://www.scopus.com/inward/record.url?scp=85099761354&partnerID=8YFLogxK
U2 - 10.1109/IEEM45057.2020.9309940
DO - 10.1109/IEEM45057.2020.9309940
M3 - Conference article published in proceeding or book
AN - SCOPUS:85099761354
T3 - IEEE International Conference on Industrial Engineering and Engineering Management
SP - 954
EP - 958
BT - 2020 IEEE International Conference on Industrial Engineering and Engineering Management, IEEM 2020
PB - IEEE Computer Society
T2 - 2020 IEEE International Conference on Industrial Engineering and Engineering Management, IEEM 2020
Y2 - 14 December 2020 through 17 December 2020
ER -