Abstract
We introduce a conservative error-correcting model, Stacked TBL, that is designed to improve the performance of even high-performing models like boosting, with little risk of accidentally degrading performance. Stacked TBL is particularly well suited for corpus-based natural language applications involving high-dimensional feature spaces, since it leverages the characteristics of the transformation-based learning (TBL) paradigm that we appropriate. We consider here the task of automatically annotating named entities in text corpora. The task does pose a number of challenges for TBL, to which there are some simple yet effective solutions. We discuss the empirical behavior of Stacked TBL, and consider evidence that, despite its simplicity, more complex and time-consuming variants are not generally required.
Original language | English |
---|---|
Title of host publication | Proceedings of the 4th International Conference on Language Resources and Evaluation, LREC 2004 |
Publisher | European Language Resources Association (ELRA) |
Pages | 21-24 |
Number of pages | 4 |
ISBN (Electronic) | 2951740816, 9782951740815 |
Publication status | Published - 1 Jan 2004 |
Event | 4th International Conference on Language Resources and Evaluation, LREC 2004 - Lisbon, Portugal; Duration: 26 May 2004 → 28 May 2004 |
Conference
Conference | 4th International Conference on Language Resources and Evaluation, LREC 2004 |
---|---|
Country/Territory | Portugal |
City | Lisbon |
Period | 26/05/04 → 28/05/04 |
ASJC Scopus subject areas
- Library and Information Sciences
- Education
- Language and Linguistics
- Linguistics and Language