Morpheme inversion in disyllabic compounds—cases in Chinese diachronic corpora

Dan Xiong, Qin Lu, Fengju Lo, Dingxu Shi, Tin Shing Chiu

Research output: Chapter in book / Conference proceeding; Conference article published in proceeding or book; Academic research; peer-review

Abstract

Morpheme inversion is a significant lexical phenomenon in the evolution of Chinese words, and it poses additional difficulties in Chinese word segmentation, especially in computer processing of Chinese classics. This paper reports a study on the disyllabic morpheme-inverted compounds in the Chinese diachronic corpora from the perspective of natural language processing. The corpora include two pre-Qin classics and four notable novels created in the Ming and Qing dynasties, in which words are segmented and proper nouns are tagged. Based on the full statistics and analysis, a comparative study is carried out on the use of disyllabic morpheme-inverted compounds in the two types of Chinese text, that is, historical works and fiction. Results show that there are many more morpheme-inverted compounds in the Ming-Qing novels than in the pre-Qin classics in terms of both lexical items and frequency. The morpheme-inverted compounds in the Ming-Qing novels are also closer to their modern counterparts.
Original language: English
Title of host publication: Chinese Lexical Semantics - 16th Workshop, CLSW 2015, Revised Selected Papers
Publisher: Springer Verlag
Pages: 504-515
Number of pages: 12
ISBN (Print): 9783319271934
Publication status: Published - 1 Jan 2015
Event: 16th Workshop on Chinese Lexical Semantics Workshop, CLSW 2015 - Beijing, China
Duration: 9 May 2015 to 11 May 2015

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 9332
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 16th Workshop on Chinese Lexical Semantics Workshop, CLSW 2015
Country: China
City: Beijing
Period: 9/05/15 to 11/05/15

Keywords

  • Chinese diachronic corpora
  • Chinese segmentation
  • Disyllabic morpheme-inverted compounds

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)
