Abstract
The work presented in this paper is motivated by the practical need for content extraction and by the data source and evaluation benchmark available from the ACE program. The Chinese entity detection and tracking task is of particular interest to us. A novel solution is proposed to address both the language-independent and the language-dependent problems specific to this task. Mention detection takes advantage of machine learning approaches and character-based models. It handles the different types of entities being mentioned and the different constituent units (i.e., extents and heads) separately. Mentions referring to the same entity are linked together by integrating most-specific-first and closest-first rule-based pairwise clustering algorithms. Types of mentions and entities are determined by head-driven classification approaches. The implemented system achieves an ACE value of 66.1, which ranks among the top-tier results.
Original language | English |
---|---|
Pages (from-to) | 219-236 |
Number of pages | 18 |
Journal | International Journal of Computer Processing of Languages |
Volume | 20 |
Issue number | 4 |
Publication status | Published - 2007 |
Keywords
- Chinese entity
- Entity detection
- Mention categorization
- Mention clustering