Abstract
With the growth of social media, domestic violence (DV) survivors increasingly seek information through online posts. Identifying these help-seeking behaviors is crucial for delivering timely and effective support. However, detecting and extracting DV survivors' information needs from Cantonese social media is challenging because Cantonese is a low-resource diaspora language and the DV topic receives limited attention on mainstream forums. Traditional methods of identifying survivors rely on human reviewers to filter posts and accurately discern survivors' needs, which is both labor-intensive and time-consuming. This research aims to develop an automated system for detecting Cantonese posts that reflect information-seeking behavior by DV survivors. To streamline the filtering process while protecting the privacy of sensitive data, In-Context Code-Mixing (ICM) and other instruction tuning strategies were applied to lightweight large language models. The proposed system filtered and synthesized posts from Cantonese websites and achieved an accuracy of 0.84 with in-context learning. The ICM results could exceed those obtained under monolingual instruction, and they also indicate that a higher code-mixing ratio could improve the detection of DV survivors in low-resource languages.
| Original language | English |
|---|---|
| Pages (from-to) | 1792-1793 |
| Number of pages | 2 |
| Journal | Studies in Health Technology and Informatics |
| Volume | 329 |
| Publication status | Published - 7 Aug 2025 |
Keywords
- Cantonese
- Domestic violence
- In-context code-mixing
- Large language models
ASJC Scopus subject areas
- Biomedical Engineering
- Health Informatics
- Health Information Management