Online reviews written in Cantonese style are widely used by native Cantonese speakers, and a large number of Cantonese reviews are available on the Internet. However, only a few studies on Cantonese sentiment analysis have been reported, because there is a serious lack of resources, including annotated corpora and adequate lexical collections. In this work, we present a novel approach to sentiment analysis of Cantonese-style text that incorporates sentiment knowledge into the attention mechanism of a state-of-the-art deep learning based Long Short-Term Memory network, referred to as the LSTM with sentiment augmented attention (LSTM-SAT). A restaurant review dataset is first collected from Openrice, a popular restaurant review website whose reviews are mostly written in Cantonese style and carry naturally annotated rating labels. We then extract a Cantonese sentiment lexicon using an automatic construction method that obtains both the sentiment terms and their polarities from the sentiment scores of the review text. The automatically obtained terms can then be used to augment a small, manually compiled Cantonese sentiment lexicon. Furthermore, we propose a novel method that incorporates lexical knowledge from the sentiment lexicon into the attention layer of an LSTM model as prior knowledge, to further highlight the importance of sentiment words. Experimental results show that our automatically constructed Cantonese sentiment lexicon helps improve coverage and that this type of sentiment knowledge provides semantically meaningful information to deep learning models. This information is indeed effective: our proposed LSTM-SAT shows a significant improvement in sentiment classification performance.
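The abstract does not give the exact formulation, so the following is only a minimal sketch of one plausible way that lexicon knowledge could act as a prior in an attention layer. It assumes the prior is an additive bias on the raw attention scores before the softmax. The function name `sentiment_augmented_attention`, the `prior_weight` parameter, the toy lexicon entry, and the zero base scores are all hypothetical illustrations and are not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sentiment_augmented_attention(tokens, base_scores, lexicon, prior_weight=1.0):
    """Return attention weights biased toward lexicon words.

    ASSUMPTION: the lexicon prior is added to the raw attention score of
    each token before normalisation. The paper may use a different scheme.
    """
    if len(tokens) != len(base_scores):
        raise ValueError("tokens and base_scores must have the same length")
    biased = [
        # abs() so that strongly negative words are highlighted as much as
        # strongly positive ones; tokens outside the lexicon get no bias.
        score + prior_weight * abs(lexicon.get(token, 0.0))
        for token, score in zip(tokens, base_scores)
    ]
    return softmax(biased)


# Toy input: a segmented Cantonese review, "this restaurant is really great".
tokens = ["呢間", "餐廳", "好", "正"]
# Hypothetical lexicon entry with an illustrative polarity score.
lexicon = {"正": 1.0}
# Stand-in for the scores an LSTM attention layer would produce.
base_scores = [0.0, 0.0, 0.0, 0.0]

weights = sentiment_augmented_attention(tokens, base_scores, lexicon)
uniform = sentiment_augmented_attention(tokens, base_scores, lexicon, prior_weight=0.0)
```

With `prior_weight=0.0` the weights fall back to ordinary attention, which makes it easy to compare a model with and without the lexicon prior under this assumed scheme.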
|Title of host publication||Proceedings of the 8th KDD Workshop on Issues of Sentiment Discovery and Opinion Mining (WISDOM)|
|Place of Publication||Anchorage, Alaska|
|Number of pages||9|
|Publication status||Published - 4 Aug 2019|