TY - GEN
T1 - Morpheme inversion in disyllabic compounds—cases in Chinese diachronic corpora
AU - Xiong, Dan
AU - Lu, Qin
AU - Lo, Fengju
AU - Shi, Dingxu
AU - Chiu, Tin Shing
PY - 2015/1/1
Y1 - 2015/1/1
N2 - Morpheme inversion is a significant lexical phenomenon in the evolution of Chinese words, and it poses additional difficulties for Chinese word segmentation, especially in the computer processing of Chinese classics. This paper reports a study of disyllabic morpheme-inverted compounds in Chinese diachronic corpora from the perspective of natural language processing. The corpora include two pre-Qin classics and four notable novels written in the Ming and Qing dynasties, in which words are segmented and proper nouns are tagged. Based on full statistics and analysis, a comparative study is carried out on the use of disyllabic morpheme-inverted compounds in the two types of Chinese text, that is, historical works and fiction. Results show that there are many more morpheme-inverted compounds in the Ming-Qing novels than in the pre-Qin classics in terms of both lexical items and frequency. The morpheme-inverted compounds in the Ming-Qing novels are also closer to their modern counterparts.
KW - Chinese diachronic corpora
KW - Chinese segmentation
KW - Disyllabic morpheme-inverted compounds
UR - http://www.scopus.com/inward/record.url?scp=84956991461&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-27194-1_51
DO - 10.1007/978-3-319-27194-1_51
M3 - Conference article published in proceeding or book
SN - 9783319271934
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 504
EP - 515
BT - Chinese Lexical Semantics - 16th Workshop, CLSW 2015, Revised Selected Papers
PB - Springer Verlag
T2 - 16th Chinese Lexical Semantics Workshop, CLSW 2015
Y2 - 9 May 2015 through 11 May 2015
ER -