Sentiment classification of online Cantonese reviews by supervised machine learning approaches

Ziqiong Zhang, Qiang Ye, Yijun Li, Chun Hung Roberts Law

Research output: Journal article publication > Journal article > Academic research > peer-review

12 Citations (Scopus)

Abstract

Cantonese is an important Chinese dialect spoken in parts of Southern China. Local online users often express their opinions and experiences in written Cantonese on the web. Using two supervised machine learning approaches, this paper conducts a series of experiments to explore appropriate methods for automatic sentiment classification in the very noisy domain of online reviews written in Cantonese. The findings indicate that a support vector machine classifier based on a Mandarin Chinese word segmentation tool performs surprisingly well: accuracy, and precision and recall for both positive and negative reviews, all exceed 85% when the training corpus contains 5,000 or more reviews.
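The abstract reports accuracy together with precision and recall computed separately for the positive and the negative class. As a minimal sketch of how such per-class figures are typically derived from gold and predicted labels (this is not the authors' code; the function name `per_class_metrics`, the label strings `"pos"`/`"neg"`, and the toy data are all assumptions made for illustration):

```python
from collections import Counter


def per_class_metrics(gold, predicted, labels=("pos", "neg")):
    """Return (accuracy, {label: {"precision": p, "recall": r}})."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("at least one labelled review is required")

    # Count every (gold label, predicted label) pair once.
    pairs = Counter(zip(gold, predicted))

    # Accuracy: share of reviews whose predicted label matches the gold label.
    accuracy = sum(pairs[(label, label)] for label in labels) / len(gold)

    report = {}
    for label in labels:
        true_positive = pairs[(label, label)]
        # Everything the classifier assigned to this label.
        predicted_as_label = sum(pairs[(g, label)] for g in labels)
        # Everything that truly carries this label.
        actually_label = sum(pairs[(label, p)] for p in labels)
        report[label] = {
            "precision": true_positive / predicted_as_label if predicted_as_label else 0.0,
            "recall": true_positive / actually_label if actually_label else 0.0,
        }
    return accuracy, report


# Toy labels, invented purely for illustration (not data from the paper).
gold = ["pos", "pos", "pos", "neg", "neg", "neg", "neg", "pos"]
pred = ["pos", "pos", "neg", "neg", "neg", "pos", "neg", "pos"]
accuracy, report = per_class_metrics(gold, pred)
```

Reporting precision and recall per class, rather than a single pooled score, makes it visible when a classifier does well on one polarity at the expense of the other.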
Original language: English
Pages (from-to): 382-397
Number of pages: 16
Journal: International Journal of Web Engineering and Technology
Volume: 5
Issue number: 4
Publication status: Published - 1 Mar 2009

Keywords

  • Cantonese
  • Online reviews
  • Sentiment classification
  • Text mining

ASJC Scopus subject areas

  • Information Systems
  • Hardware and Architecture
  • Computer Networks and Communications
