Integrated and Enhanced Pipeline System to Support Spoken Language Analytics for Screening Neurocognitive Disorders

Helen Meng, Brian Mak, Man Wai Mak, Helene Fung, Xianmin Gong, Timothy Kwok, Xunying Liu, Vincent Mok, Patrick Wong, Jean Woo, Xixin Wu, Ka Ho Wong, Sean Shensheng Xu, Naijun Zheng, Ranzo Huang, Jiawen Kang, Xiaoquan Ke, Junan Li, Jinchao Li, Yi Wang

Research output: Journal article publication › Conference article › Academic research › peer-review

2 Citations (Scopus)

Abstract

This paper presents an enhanced pipeline system for automated screening of neurocognitive disorders, e.g. Alzheimer's Disease (AD), using spoken language technologies. To ensure local relevance, the pipeline is applied to two-way interactions between clinical assessors and older adult participants in spoken Cantonese, the predominant language used in Hong Kong. The pipeline includes: (i) Speaker diarization using speaker-turn-aware scoring to capture the temporal structure of conversations. (ii) ASR using XLS-R wav2vec 2.0 models further pre-trained on Cantonese speech data and fine-tuned. (iii) Language modelling using RoBERTa with further fine-tuning. (iv) AD screening with neural network classification. A reference benchmark is obtained using the ADReSS corpus where no diarization is needed, and the partial pipeline attained a competitive detection accuracy of 87.5%.
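The four stages listed in the abstract can be read as a chain in which each stage consumes the previous stage's output. The Python sketch below illustrates only that data flow and is not the authors' implementation. Every name in it (Turn, ScreeningResult, screen, the callable stage types, the 0.5 threshold and the "unknown" speaker label) is a hypothetical placeholder, and the stages are passed in as plain functions so that no particular toolkit is implied. Leaving out the diarizer gives a rough analogue of the partial pipeline used for the ADReSS benchmark.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Turn:
    """One speaker turn; the label set is an assumption of this sketch."""
    speaker: str
    audio: bytes


@dataclass
class ScreeningResult:
    score: float      # classifier output, assumed to lie in [0, 1]
    flagged: bool     # True when score reaches the threshold
    transcript: str   # speaker-labelled transcript fed to the encoder


# Hypothetical stage signatures: each stage is just a callable.
Diarizer = Callable[[bytes], List[Turn]]       # (i) speaker diarization
Recognizer = Callable[[bytes], str]            # (ii) ASR
Encoder = Callable[[str], List[float]]         # (iii) language-model features
Classifier = Callable[[List[float]], float]    # (iv) screening classifier


def screen(
    audio: bytes,
    asr: Recognizer,
    encode: Encoder,
    classify: Classifier,
    diarize: Optional[Diarizer] = None,
    threshold: float = 0.5,  # assumed decision threshold, not from the paper
) -> ScreeningResult:
    """Run the stages in order; skip diarization when none is supplied."""
    if diarize is not None:
        turns = diarize(audio)
    else:
        # Partial pipeline: treat the whole recording as a single turn.
        turns = [Turn(speaker="unknown", audio=audio)]

    # Transcribe each turn and keep the speaker label next to its text.
    lines = [f"{turn.speaker}: {asr(turn.audio)}" for turn in turns]
    transcript = "\n".join(lines)

    score = classify(encode(transcript))
    return ScreeningResult(score=score, flagged=score >= threshold, transcript=transcript)
```

In practice each callable would wrap a trained model; here they are deliberately left abstract so that the sketch stays independent of any specific library.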

Original language: English
Pages (from-to): 1713-1717
Number of pages: 5
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Volume: 2023-August
Publication status: Published - Aug 2023
Event: 24th International Speech Communication Association, Interspeech 2023 - Dublin, Ireland
Duration: 20 Aug 2023 to 24 Aug 2023

Keywords

  • dementia
  • diarization
  • NCD detection
  • neurocognitive disorder
  • speech recognition

ASJC Scopus subject areas

  • Language and Linguistics
  • Human-Computer Interaction
  • Signal Processing
  • Software
  • Modelling and Simulation
