Abstract
Recent studies have demonstrated the effectiveness of speaker disentanglement in mitigating the interference caused by speaker features in speech-based depression detection. However, the inherent entanglement between depression features and speaker features makes such disentanglement difficult. In this study, we propose a mutual information-based speaker-invariant depression detector (MI-SIDD) that promotes independence between depression and speaker features to facilitate speaker disentanglement. Specifically, we disentangle the speaker features using a vanilla autoencoder with a well-tuned bottleneck layer and minimize the mutual information between depression and speaker features using a conditional mutual information constraint. Experimental results demonstrate effective speaker disentanglement and improved independence between depression and speaker features. Our MI-SIDD model achieves competitive performance compared to state-of-the-art methods on the DAIC-WOZ dataset.
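The abstract does not specify how the conditional mutual information constraint is estimated, so the following is only an illustrative toy sketch: under a joint-Gaussian assumption on 1-D features, mutual information reduces to a closed form in the Pearson correlation, which makes it easy to see why driving MI toward zero enforces independence between a "depression" feature and a "speaker" feature. All variable names and the synthetic data are hypothetical and not from the paper.

```python
import numpy as np

def gaussian_mi(x, y):
    """Estimate I(X; Y) in nats for 1-D samples, assuming X and Y are
    jointly Gaussian: I(X; Y) = -0.5 * log(1 - rho^2), with rho the
    Pearson correlation. This is a toy estimator, not the paper's
    conditional MI constraint."""
    rho = np.corrcoef(x, y)[0, 1]
    return -0.5 * np.log(1.0 - rho ** 2)

rng = np.random.default_rng(0)
n = 100_000
speaker = rng.standard_normal(n)            # stand-in "speaker" feature
noise = rng.standard_normal(n)
entangled = 0.8 * speaker + 0.6 * noise     # "depression" feature leaking speaker info
disentangled = noise                        # feature independent of the speaker

mi_entangled = gaussian_mi(speaker, entangled)      # clearly positive (~0.51 nats)
mi_disentangled = gaussian_mi(speaker, disentangled)  # near zero
```

In a training setup like MI-SIDD, such an MI estimate (in the paper, a conditional MI bound rather than this Gaussian toy) would be added to the loss so that gradient descent pushes the depression representation toward the low-MI, speaker-independent case.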
| Original language | English |
|---|---|
| Title of host publication | ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) |
| Publisher | IEEE |
| Pages | 10191-10195 |
| Number of pages | 5 |
| ISBN (Electronic) | 979-8-3503-4485-1 |
| ISBN (Print) | 979-8-3503-4486-8 |
| DOIs | |
| Publication status | Published - Apr 2024 |
| Title | Promoting Independence of Depression and Speaker Features for Speaker Disentanglement in Speech-Based Depression Detection |