Analyzing imbalanced online consumer review data in product design using geometric semantic genetic programming

Kit Yan Chan, C. K. Kwong, Huimin Jiang

Research output: Journal article publication › Journal article › Academic research › peer-review

6 Citations (Scopus)


To develop a successful product, it is essential to understand the relationship between customer satisfaction (CS) and the design attributes of a new product. IoT technologies are now used to collect online review data from social media, and more representative CS models can be developed from this data. However, online review data is imbalanced: popular products receive more online consumer reviews and unpopular products receive fewer. When imbalanced data is used, CS models learn the characteristics of the majority data while rarely learning those of the minority data. Because the resulting CS model is biased toward popular products, the analysis for product development can be misleading. This paper proposes an approach to generate nondominated CS models which learn equally from the imbalanced data of popular and unpopular products. Learning equally from imbalanced data is formulated as a multi-objective optimization problem, which is solved by geometric semantic genetic programming (GSGP); the GSGP generates a Pareto set of nondominated CS models. Product designers select the most preferred models from the Pareto set. The preferred nondominated CS model attempts to trade off unpopular and popular products in order to determine optimal design attributes and maximize the CS. The case study shows that the proposed GSGP is able to generate CS models with more accurate CS predictions than the commonly used methods. The proposed GSGP also generates a Pareto set of nondominated CS models which learn equally from the consumer reviews of the dryers considered in the case study. Based on the Pareto set, the design team selects the most preferred CS model.
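As a rough illustration of what a Pareto set of nondominated CS models means, the sketch below filters a handful of candidate models by two objectives: prediction error on popular (majority) products and prediction error on unpopular (minority) products. It is not the paper's GSGP. The function names, the two-objective error representation, and all error values are assumptions made up for illustration, and the final selection rule is only one hypothetical way a design team might pick a preferred model.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a dominates b under minimization: a is no worse
    in every objective and strictly better in at least one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(errors: List[Tuple[float, float]]) -> List[int]:
    """Return the indices of the nondominated entries in `errors`."""
    front = []
    for i, candidate in enumerate(errors):
        dominated = any(
            dominates(other, candidate)
            for j, other in enumerate(errors)
            if j != i
        )
        if not dominated:
            front.append(i)
    return front


# Hypothetical (error on popular products, error on unpopular products)
# for six candidate CS models. These numbers are invented, not results
# from the paper.
candidate_errors = [
    (0.10, 0.40),
    (0.15, 0.25),
    (0.20, 0.20),
    (0.25, 0.30),
    (0.30, 0.12),
    (0.12, 0.45),
]

front = pareto_front(candidate_errors)

# One hypothetical selection rule: prefer the nondominated model whose
# worse of the two errors is smallest. A design team may choose differently.
preferred = min(front, key=lambda i: max(candidate_errors[i]))

print("Nondominated models:", front)
print("Preferred under the min-max rule:", preferred)
```

Every model left in `front` represents a different trade-off between the two product groups, which is why the final choice is left to the designers rather than made automatically.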

Original language: English
Article number: 104442
Journal: Engineering Applications of Artificial Intelligence
Publication status: Published - Oct 2021


  • Genetic programming
  • Imbalanced data mining
  • Multi-objective optimization
  • New product development
  • Online customer reviews
  • Social media

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Artificial Intelligence
  • Electrical and Electronic Engineering

