TY - GEN
T1 - Influence of noise on transfer learning in Chinese sentiment classification using GRU
AU - Dai, Mingjun
AU - Huang, Shansong
AU - Zhong, Junpei
AU - Yang, Chenguang
AU - Yang, Shiwei
N1 - Funding Information:
MJ and SS were supported by a research grant from the Natural Science Foundation of China (61301182), by the Specialized Research Fund for the Doctoral Program of Higher Education from the Ministry of Education (20134408120004), by the Key Project of the Department of Education of Guangdong Province (2015KTSCX121), by the Foundation of Shenzhen City (KQCX20140509172609163), and by the Natural Science Foundation of Shenzhen University (00002501). SW was partially supported by a research grant from the Natural Science Foundation of China (71673062).
Publisher Copyright:
© 2017 IEEE.
PY - 2018/6/21
Y1 - 2018/6/21
N2 - Sentiment classification of product reviews is of great significance as business feedback for manufacturers, sellers, and users. However, since a large amount of training data for a specific product domain is not always available, transfer learning is often used in sentiment analysis applications. Specifically, after pre-training on a large Chinese corpus with a word-embedding method, a larger training set from a specific domain was used to train a Gated Recurrent Unit; the trained model was then tested on sentiment classification of a smaller set of product reviews. The performance of this transfer learning method was also examined, particularly to identify the factors affecting it. The experimental results showed that differences in wording across review domains (which we call 'noise') have a considerable impact on transfer learning. We also calculated the wording difference to verify our hypothesis. These results clarify the impact of dataset wording on Chinese text sentiment classification and shed light on optimizing the effect of transfer learning in general.
AB - Sentiment classification of product reviews is of great significance as business feedback for manufacturers, sellers, and users. However, since a large amount of training data for a specific product domain is not always available, transfer learning is often used in sentiment analysis applications. Specifically, after pre-training on a large Chinese corpus with a word-embedding method, a larger training set from a specific domain was used to train a Gated Recurrent Unit; the trained model was then tested on sentiment classification of a smaller set of product reviews. The performance of this transfer learning method was also examined, particularly to identify the factors affecting it. The experimental results showed that differences in wording across review domains (which we call 'noise') have a considerable impact on transfer learning. We also calculated the wording difference to verify our hypothesis. These results clarify the impact of dataset wording on Chinese text sentiment classification and shed light on optimizing the effect of transfer learning in general.
KW - Gated Recurrent Unit
KW - neural network
KW - sentiment classification
KW - transfer learning
UR - http://www.scopus.com/inward/record.url?scp=85050194629&partnerID=8YFLogxK
U2 - 10.1109/FSKD.2017.8393047
DO - 10.1109/FSKD.2017.8393047
M3 - Conference article published in proceeding or book
AN - SCOPUS:85050194629
T3 - ICNC-FSKD 2017 - 13th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery
SP - 1844
EP - 1849
BT - ICNC-FSKD 2017 - 13th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery
A2 - Zhao, Liang
A2 - Wang, Lipo
A2 - Cai, Guoyong
A2 - Li, Kenli
A2 - Liu, Yong
A2 - Xiao, Guoqing
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 13th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery, ICNC-FSKD 2017
Y2 - 29 July 2017 through 31 July 2017
ER -