Abstract
In recent years, the phonetic sciences have hosted several debates about the best way to analyze data statistically. The main discussion has concerned the move away from analyses of variance (ANOVAs) toward linear mixed effects models. Mixed models have two advantages: they can include every data point produced by a participant (instead of a single mean per participant), and they account for both by-participant and by-item variance. However, the plotting of data has not always followed this trend. Researchers often plot participant means and standard errors (based on the number of participants), which, while potentially representative of the data used for an ANOVA, do not match the data used for a mixed effects model. The present paper discusses the shortcomings of traditional data visualization practices, solutions to these shortcomings that have been proposed in recent years, and the special challenges that arise when extending these solutions to phonetic data with crossed (within-participant and within-item) designs. For each of the problems discussed, we provide examples with simulated data to demonstrate how different plotting techniques can correctly, or incorrectly, represent the underlying structure of the data. Ultimately, we conclude that no single type of plot can show everything one needs to know about this type of data, and we advocate an approach that uses different types of plots throughout data analysis and makes the data publicly available.
Original language | English |
---|---|
Pages (from-to) | 56-69 |
Number of pages | 14 |
Journal | Journal of Phonetics |
Volume | 70 |
Publication status | Published - 1 Sept 2018 |
Keywords
- Data visualization
- Linear mixed effects models
- Random effects
- Repeated measures
ASJC Scopus subject areas
- Language and Linguistics
- Linguistics and Language
- Speech and Hearing