The creation of a prosodically transcribed intercultural corpus: The Hong Kong Corpus of Spoken English (prosodic)

Wai Lin Leung, C. Greaves, Martin John Warren

Research output: Journal article publication, Journal article, Academic research, peer-reviewed


This paper describes a new addition to the growing number of spoken corpora, the Hong Kong Corpus of Spoken English (prosodic), or HKCSE (prosodic), which has the relatively rare additional benefit of being both orthographically and prosodically transcribed. The corpus comprises approximately one million words spread evenly across four sub-corpora: academic discourses, business discourses, conversations, and public discourses. The corpus described in this paper consists of just over half of the full Hong Kong Corpus of Spoken English (orthographic), which is a two-million-word corpus of naturally occurring talk between Hong Kong Chinese and speakers of languages other than Cantonese. This paper describes the contents of the HKCSE (prosodic), the discourse intonation systems (Brazil 1997) used to denote speakers’ intonation choices, and the software specifically designed and implemented to interrogate the corpus, together with examples of some of the search functions available to the user.
Original language: English
Pages (from-to): 47-68
Number of pages: 22
Journal: ICAME Journal
Issue number: 29
Publication status: Published - 2005
