Abstract
In recent years, voice-AI systems have improved significantly in intelligibility and naturalness, but the human experience of talking to a machine is still remarkably different from that of talking to a fellow human. In this paper, we explore one dimension of this difference: the occurrence of disfluency in machine speech and how it may affect human listeners’ processing and memory of linguistic information. We conducted a human-machine conversation task in Mandarin Chinese using a humanoid social robot (Furhat), with different types of machine speech (pre-recorded natural speech vs. synthesized speech, fluent vs. disfluent). During the task, the human interlocutors were tested on how well they remembered the information presented by the robot. The results showed that disfluent speech (surrounded by “um”/“uh”) boosted memory retention in a retelling task only for pre-recorded speech, not for synthesized speech. We discuss the implications of the current findings and possible directions for future work.
Original language | English |
---|---|
Title of host publication | Proceedings of the Conference: Human Perspectives on Spoken Human-Machine Interaction |
Editors | Sarah Warchhold, Daniel Duran, Iona Gessinger, Eran Raveh |
Pages | 52-57 |
Publication status | Published - Nov 2021 |
Event | FRIAS Junior Researcher Conference on Human Perspectives on Spoken Human-Machine Interaction (SpoHuMa21) - Duration: 15 Nov 2021 → 17 Nov 2021 https://freidok.uni-freiburg.de/data/223814 |
Conference
Conference | FRIAS Junior Researcher Conference on Human Perspectives on Spoken Human-Machine Interaction (SpoHuMa21) |
---|---|
Period | 15/11/21 → 17/11/21 |