Abstract
As an important linguistic category, terms of address not only carry particular information but also express feelings and emotions. They are therefore widely used in literary works. In natural language processing (NLP) and its applications, terms of address are one of the key elements in named entity recognition and can affect the overall performance of an NLP system. Based on the analysis of a manually annotated corpus of four classical Chinese novels of the Ming and Qing dynasties, this paper presents a classification and annotation system for personal names and terms of address from the perspective of named entity recognition and information extraction in NLP. Personal names and terms of address are categorized into simple types and compound types, and the compound type is further categorized into four subtypes: fixed expressions, appositive constructions, subordinate constructions of affiliation, and other subordinate constructions.
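As a rough illustration only, the taxonomy described in the abstract (a simple type plus four compound subtypes) could be represented as a small label set for annotation tooling. The following is a minimal sketch: the class, field, and function names are hypothetical and are not taken from the paper's annotation scheme.

```python
from dataclasses import dataclass
from enum import Enum


class AddressType(Enum):
    """Categories named in the abstract; identifiers are illustrative."""
    SIMPLE = "simple"
    FIXED_EXPRESSION = "compound/fixed expression"
    APPOSITIVE = "compound/appositive construction"
    SUBORDINATE_AFFILIATION = "compound/subordinate construction of affiliation"
    SUBORDINATE_OTHER = "compound/other subordinate construction"

    @property
    def is_compound(self) -> bool:
        # Everything except SIMPLE belongs to the compound type.
        return self is not AddressType.SIMPLE


@dataclass(frozen=True)
class Annotation:
    """One annotated span (hypothetical layout: character offsets plus a label)."""
    start: int
    end: int
    label: AddressType


def compound_subtypes() -> list[AddressType]:
    """Return the four compound subtypes in declaration order."""
    return [t for t in AddressType if t.is_compound]
```

An annotation tool built this way would attach one `AddressType` label to each marked span, so that simple and compound mentions can be counted or filtered separately.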
Original language | English |
---|---|
Pages (from-to) | 10-20 |
Number of pages | 11 |
Journal | International journal of knowledge and language processing |
Volume | 4 |
Issue number | 4 |
Publication status | Published - 2013 |
Keywords
- Terms of address
- NLP
- Novels in the Ming and Qing dynasties
- Corpus
- Named entity recognition