An algorithm combining statistics-based and rules-based for chunk identification of Chinese sentences

Wang Rongbo, Zheru Chi

Research output: Chapter in book / Conference proceeding; Conference article published in proceeding or book; Academic research; peer-review

Abstract

Natural language processing (NLP) is a very active research domain. One important branch is sentence analysis, including Chinese sentence analysis. However, no mature theories or techniques for deep analysis are currently available. An alternative that is popular in the field is to perform shallow parsing on sentences. Chunk identification is a fundamental task in shallow parsing. This paper describes a chunk boundary parsing algorithm that combines a statistical method with adjustment rules and serves as a supplement to traditional statistics-based parsing methods. The experimental results show that the model works well on a small dataset. It will contribute to subsequent processes, such as chunk tagging and chunk collocation extraction, in other research topics.
Original language: English
Title of host publication: PACLIC 20 - Proceedings of the 20th Pacific Asia Conference on Language, Information and Computation
Pages: 446-451
Number of pages: 6
Publication status: Published - 1 Dec 2006
Event: 20th Pacific Asia Conference on Language, Information and Computation, PACLIC 20 - Wuhan, China
Duration: 1 Nov 2006 - 3 Nov 2006

Conference

Conference: 20th Pacific Asia Conference on Language, Information and Computation, PACLIC 20
Country: China
City: Wuhan
Period: 1/11/06 - 3/11/06

ASJC Scopus subject areas

  • Language and Linguistics
  • Computer Science (miscellaneous)
