Abstract
Dementia is a severe cognitive impairment that affects the health of older adults and creates a burden on their families and caretakers. This paper analyzes diverse hand-crafted features extracted from spoken language and selects the most discriminative ones for dementia detection. Recently, the performance of dementia detection has been significantly improved by Transformer-based models that automatically capture the structural and linguistic properties of spoken language. We investigate Transformer-based features and propose an end-to-end system for dementia detection. We also explore recent ASR and representation learning frameworks, such as wav2vec 2.0 and HuBERT, for transcribing a Cantonese corpus that contains recordings of older adults describing the rabbit story. We investigate using disfluency patterns (DP) in spontaneous speech to enhance the recognized word sequences for the Transformer-based feature extractor. Results show that fine-tuning the feature extractor on the enhanced word sequences can improve dementia detection performance.
Original language | English |
---|---|
Title of host publication | ICASSP |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 1-5 |
Number of pages | 5 |
Publication status | Published - Jun 2023 |