A quantitative model of talker normalization by native and non-native speakers

Si Chen, Caicai Zhang, Puiyin Lau, Yike Yang, Bei Li, Wing Kam Fung

Research output: Unpublished conference presentation (presented paper, abstract, poster) › Abstract › Academic research › peer-review


Talker variability affects native and non-native speakers in speech perception of segmentals (e.g., Bent et al., 2010) and suprasegmentals (e.g., Wong and Diehl, 2003). Zhang and Chen (2016) reported that gender-specific F0 range may contribute significantly to Cantonese tone perception. However, a full understanding of how the population F0 distribution affects tone identification is missing. This study aims to bridge this gap by modeling tone distributions and testing how deviations in the distribution parameters affect tone identification by native and non-native speakers. Statistical modeling of a Cantonese speech corpus of 68 speakers showed that F0 values of three Cantonese tones follow skew-normal distributions with three parameters: location, shape, and scale. We proceeded to conduct two experiments with 28 Cantonese and 28 Mandarin listeners, using both tones naturally produced by 34 Cantonese speakers and manipulated tones with F0 values generated from simulated distributions. A multinomial mixed effects model revealed significant main effects of the location and shape parameters. Locally weighted scatterplot smoothing curves also differed dramatically between native and non-native listeners, indicating an effect of long-term F0 distribution representations on tonal identification. The results thus offer useful insights into how parametric representations of phonetic distributions are used in tone identification.
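To make the three-parameter description concrete, the following is a minimal sketch of a skew-normal density and sampler in plain Python, using the standard textbook parameterization (location, scale, shape). It is not the authors' modeling code: the function names are hypothetical, and the parameter values (200 Hz and 220 Hz location, 20 Hz scale, shape 2) are invented for illustration rather than taken from the study. It shows how F0 values could be drawn from a baseline distribution and from one whose location parameter has been shifted.

```python
import math
import random


def skewnorm_pdf(x, location, scale, shape):
    """Skew-normal density at x: (2 / scale) * phi(z) * Phi(shape * z)."""
    z = (x - location) / scale
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)   # standard normal pdf at z
    Phi = 0.5 * (1.0 + math.erf(shape * z / math.sqrt(2.0)))  # standard normal cdf at shape * z
    return 2.0 / scale * phi * Phi


def skewnorm_mean(location, scale, shape):
    """Theoretical mean: location + scale * delta * sqrt(2 / pi)."""
    delta = shape / math.sqrt(1.0 + shape * shape)
    return location + scale * delta * math.sqrt(2.0 / math.pi)


def skewnorm_sample(location, scale, shape, rng):
    """Draw one value using z = delta * |u0| + sqrt(1 - delta^2) * v."""
    delta = shape / math.sqrt(1.0 + shape * shape)
    u0 = rng.gauss(0.0, 1.0)
    v = rng.gauss(0.0, 1.0)
    z = delta * abs(u0) + math.sqrt(1.0 - delta * delta) * v
    return location + scale * z


# Hypothetical F0 parameters in Hz, chosen only for illustration.
rng = random.Random(0)
baseline = [skewnorm_sample(200.0, 20.0, 2.0, rng) for _ in range(20000)]
shifted = [skewnorm_sample(220.0, 20.0, 2.0, rng) for _ in range(20000)]

print("baseline mean:", sum(baseline) / len(baseline))
print("shifted mean: ", sum(shifted) / len(shifted))
```

With shape set to 0 the density reduces to an ordinary normal distribution, so the shape parameter alone controls the asymmetry, while location and scale shift and stretch the curve.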
Original language: English
Publication status: Published - Nov 2019
Event: The 178th Meeting of the Acoustical Society of America - San Diego, United States
Duration: 1 Dec 2019 → …


Conference: The 178th Meeting of the Acoustical Society of America
Country/Territory: United States
City: San Diego
Period: 1/12/19 → …

