Building a Chinese shallow parsed treebank for collocation extraction

Baoli Li, Lu Qin, Li Yin

Research output: Chapter in book / Conference proceeding; Conference article published in proceeding or book; Academic research; peer-review

2 Citations (Scopus)

Abstract

To automatically extract Chinese collocations and build a large-scale collocation bank, we are developing a one-million-word Chinese shallow parsed treebank. The treebank can be used not only as a training set for our shallow parser, but also as processed data from which collocations are extracted. This paper presents several issues related to this on-going project, such as our definition of shallow parsing used in Chinese collocation extraction, guideline preparation, and quality control.
Original language: English
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Publisher: Springer Verlag
Pages: 402-405
Number of pages: 4
ISBN (Print): 3540005323
Publication status: Published - 1 Jan 2003
Event: 4th International Conference on Intelligent Text Processing and Computational Linguistics, CICLing 2003 - Mexico City, Mexico
Duration: 16 Feb 2003 to 22 Feb 2003

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 2588
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 4th International Conference on Intelligent Text Processing and Computational Linguistics, CICLing 2003
Country/Territory: Mexico
City: Mexico City
Period: 16/02/03 to 22/02/03

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)
