Automatic discovery of named entity variants – Grammar-driven approaches to non-alphabetical transliterations

Chu Ren Huang, Petr Šimon, Shu Kai Hsieh

Research output: Journal article publication, Conference article, Academic research, peer-review

1 Citation (Scopus)

Abstract

Identification of transliterated names is a particularly difficult task in Named Entity Recognition (NER), especially in the Chinese context. Of all possible variations of transliterated named entities, the difference between the PRC and Taiwan is the most prevalent and the most challenging. In this paper, we introduce a novel approach to the automatic extraction of diverging transliterations of foreign named entities by bootstrapping co-occurrence statistics from a tagged and segmented Chinese corpus. A preliminary experiment yields promising results and shows the approach's potential in NLP applications.

Original language: English
Pages (from-to): 153-156
Number of pages: 4
Journal: Proceedings of the Annual Meeting of the Association for Computational Linguistics
Publication status: Published - Jun 2007
Externally published: Yes
Event: 45th Annual Meeting of the Association for Computational Linguistics, ACL 2007 - Prague, Czech Republic
Duration: 25 Jun 2007 to 27 Jun 2007

ASJC Scopus subject areas

  • Computer Science Applications
  • Linguistics and Language
  • Language and Linguistics
