Abstract
This article introduces the basic concepts of the modern linguistic corpus and of corpus linguistics. A corpus is defined as a collection of examples of language in use that are selected and compiled in a principled way, and corpus linguistics as the linguistic study of such corpora. We explain how corpora are classified and outline the basic procedures of data collection, corpus construction, and annotation. Representative research areas and applications in which corpora and corpus-based analysis play crucial roles are also introduced. Finally, trends and future directions in the development of corpus linguistics are discussed.
Original language | English |
---|---|
Title of host publication | International Encyclopedia of the Social & Behavioral Sciences: Second Edition |
Publisher | Elsevier Inc. |
Pages | 949-953 |
Number of pages | 5 |
ISBN (Electronic) | 9780080970875 |
ISBN (Print) | 9780080970868 |
Publication status | Published - 26 Mar 2015 |
Keywords
- Annotation
- Balanced corpus
- Comparable corpus
- Computational linguistics
- Corpus
- Crowdsourcing
- Language resources
- Language technology
- Natural language processing
- Parallel corpus
- Tagging
- Web as corpus
ASJC Scopus subject areas
- General Social Sciences