TY - GEN
T1 - Theophany: Multimodal Speech Augmentation in Instantaneous Privacy Channels
AU - Kumar, Abhishek
AU - Braud, Tristan
AU - Lee, Lik Hang
AU - Hui, Pan
N1 - Publisher Copyright: © 2021 ACM.
PY - 2021/10/17
Y1 - 2021/10/17
N2 - Many factors affect speech intelligibility in face-to-face conversations. These factors lead conversation participants to speak louder and more distinctively, exposing the content to potential eavesdroppers. To address these issues, we introduce Theophany, a privacy-preserving framework for augmenting speech. Theophany establishes ad-hoc social networks between conversation participants to exchange contextual information, improving speech intelligibility in real time. At the core of Theophany, we develop the first privacy perception model that assesses the privacy risk of a face-to-face conversation based on its topic, location, and participants. This framework enables the development of any privacy-preserving application for face-to-face conversation. We implement the framework within a prototype system that augments the speaker's speech with real-life subtitles to overcome the loss of contextual cues brought by mask-wearing and social distancing during the COVID-19 pandemic. We evaluate Theophany through a user survey and a user study with 53 and 17 participants, respectively. Theophany's privacy predictions match the participants' privacy preferences with an accuracy of 71.26%. Users considered Theophany to be useful for protecting their privacy (3.88/5), easy to use (4.71/5), and enjoyable to use (4.24/5). We also raise the question of demographic and individual differences in the design of privacy-preserving solutions.
AB - Many factors affect speech intelligibility in face-to-face conversations. These factors lead conversation participants to speak louder and more distinctively, exposing the content to potential eavesdroppers. To address these issues, we introduce Theophany, a privacy-preserving framework for augmenting speech. Theophany establishes ad-hoc social networks between conversation participants to exchange contextual information, improving speech intelligibility in real time. At the core of Theophany, we develop the first privacy perception model that assesses the privacy risk of a face-to-face conversation based on its topic, location, and participants. This framework enables the development of any privacy-preserving application for face-to-face conversation. We implement the framework within a prototype system that augments the speaker's speech with real-life subtitles to overcome the loss of contextual cues brought by mask-wearing and social distancing during the COVID-19 pandemic. We evaluate Theophany through a user survey and a user study with 53 and 17 participants, respectively. Theophany's privacy predictions match the participants' privacy preferences with an accuracy of 71.26%. Users considered Theophany to be useful for protecting their privacy (3.88/5), easy to use (4.71/5), and enjoyable to use (4.24/5). We also raise the question of demographic and individual differences in the design of privacy-preserving solutions.
KW - assistive technology
KW - augmented reality
KW - human augmentation
KW - multi-modal speech augmentation
KW - speech intelligibility
KW - user privacy
UR - http://www.scopus.com/inward/record.url?scp=85119021820&partnerID=8YFLogxK
U2 - 10.1145/3474085.3475507
DO - 10.1145/3474085.3475507
M3 - Conference article published in proceeding or book
AN - SCOPUS:85119021820
T3 - MM 2021 - Proceedings of the 29th ACM International Conference on Multimedia
SP - 2056
EP - 2064
BT - MM 2021 - Proceedings of the 29th ACM International Conference on Multimedia
PB - Association for Computing Machinery, Inc
T2 - 29th ACM International Conference on Multimedia, MM 2021
Y2 - 20 October 2021 through 24 October 2021
ER -