TY - JOUR
T1 - Analyzing imbalanced online consumer review data in product design using geometric semantic genetic programming
AU - Chan, Kit Yan
AU - Kwong, C. K.
AU - Jiang, Huimin
N1 - Funding Information:
The work described in this paper was supported by a grant from The Hong Kong Polytechnic University (Project No. G-UADL). This work was also partially supported by a grant from the National Natural Science Foundation of China (grant number 71901149).
Publisher Copyright:
© 2021 Elsevier Ltd
PY - 2021/10
Y1 - 2021/10
N2 - To develop a successful product, understanding the relationship between customer satisfaction (CS) and the design attributes of a new product is essential. IoT technologies are now used to collect online review data from social media, and more representative CS models are developed from this data. However, online review data is imbalanced, since popular products receive more online consumer reviews and unpopular products receive fewer. When imbalanced data is used, CS models learn the characteristics of the majority data while rarely learning the minority data. The resulting analysis for product development is misleading, since the CS model is biased toward popular products. This paper proposes an approach to generate nondominated CS models which learn equally from the imbalanced data of popular and unpopular products. A multi-objective optimization problem is formulated to learn equally from imbalanced data. This problem is solved by geometric semantic genetic programming (GSGP), which generates a Pareto set of nondominated CS models. Product designers select the most preferred models in the Pareto set. The preferred nondominated CS model attempts to trade off unpopular and popular products in order to determine optimal design attributes and maximize the CS. The case study shows that the proposed GSGP is able to generate CS models with more accurate CS predictions than the commonly used methods. The proposed GSGP also generates a Pareto set of nondominated CS models which learn equally from consumer reviews for the dryers in the case study. Based on the Pareto set, the design team selects the most preferred CS model.
AB - To develop a successful product, understanding the relationship between customer satisfaction (CS) and the design attributes of a new product is essential. IoT technologies are now used to collect online review data from social media, and more representative CS models are developed from this data. However, online review data is imbalanced, since popular products receive more online consumer reviews and unpopular products receive fewer. When imbalanced data is used, CS models learn the characteristics of the majority data while rarely learning the minority data. The resulting analysis for product development is misleading, since the CS model is biased toward popular products. This paper proposes an approach to generate nondominated CS models which learn equally from the imbalanced data of popular and unpopular products. A multi-objective optimization problem is formulated to learn equally from imbalanced data. This problem is solved by geometric semantic genetic programming (GSGP), which generates a Pareto set of nondominated CS models. Product designers select the most preferred models in the Pareto set. The preferred nondominated CS model attempts to trade off unpopular and popular products in order to determine optimal design attributes and maximize the CS. The case study shows that the proposed GSGP is able to generate CS models with more accurate CS predictions than the commonly used methods. The proposed GSGP also generates a Pareto set of nondominated CS models which learn equally from consumer reviews for the dryers in the case study. Based on the Pareto set, the design team selects the most preferred CS model.
KW - Genetic programming
KW - Imbalanced data mining
KW - Multi-objective optimization
KW - New product development
KW - Online customer reviews
KW - Social media
UR - http://www.scopus.com/inward/record.url?scp=85114009289&partnerID=8YFLogxK
U2 - 10.1016/j.engappai.2021.104442
DO - 10.1016/j.engappai.2021.104442
M3 - Journal article
AN - SCOPUS:85114009289
SN - 0952-1976
VL - 105
JO - Engineering Applications of Artificial Intelligence
JF - Engineering Applications of Artificial Intelligence
M1 - 104442
ER -