Abstract
Aviation accident investigation is crucial for preventing future accidents, yet traditional investigations in general aviation (GA) are expert-dependent and time-consuming. This study explores the potential of large language models (LLMs) to expedite the process by inferring human errors from witness narratives. Despite their promise, LLMs still struggle with domain-specific reasoning. To address this, we propose a novel HFACS-CoT prompt that integrates the Human Factors Analysis and Classification System (HFACS) with Chain-of-Thought (CoT) reasoning, guiding LLMs to infer a pilot's unsafe acts and their preconditions through a multi-step, two-stage process. HFACS-CoT+ refines this prompt further by guiding LLMs through each step sequentially and by replacing textual instructions with programmatic logic statements. A new HFACS-labeled GA accident dataset was developed both to support GA safety research and to validate the proposed prompts. Using GPT-4o on this dataset, we found that HFACS-CoT significantly enhances LLMs' ability to infer human errors, outperforming basic zero-shot, basic few-shot, auto-CoT, and plan-and-solve prompts. HFACS-CoT+ further improves the inference of preconditions and addresses deficiencies in answering logic. Moreover, comparative evaluations indicate that LLMs surpass human experts in inferring certain human errors. This study highlights the benefits of integrating domain knowledge into prompt design and the potential of LLMs in GA accident investigations.
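The two-stage prompting idea described above can be sketched in code. This is a minimal, hypothetical illustration only: the category lists, prompt wording, and function names below are assumptions for demonstration, not the paper's actual HFACS-CoT prompt.

```python
# Hedged sketch of a two-stage, HFACS-guided chain-of-thought prompt.
# All wording and category subsets here are illustrative assumptions.

UNSAFE_ACTS = [
    "Decision errors", "Skill-based errors",
    "Perceptual errors", "Violations",
]
PRECONDITIONS = [
    "Adverse mental states", "Adverse physiological states",
    "Physical/mental limitations",
]

def build_stage1_prompt(narrative: str) -> str:
    """Stage 1: walk the model through each unsafe-act category in turn."""
    steps = "\n".join(
        f"Step {i}: Is there evidence of '{cat}' in the narrative? "
        "Answer yes or no, citing the supporting sentence."
        for i, cat in enumerate(UNSAFE_ACTS, start=1)
    )
    return (
        "You are an aviation accident analyst applying HFACS.\n"
        f"Witness narrative:\n{narrative}\n\n"
        f"Reason step by step:\n{steps}"
    )

def build_stage2_prompt(narrative: str, stage1_result: str) -> str:
    """Stage 2: infer preconditions, conditioned on the stage-1 findings.

    HFACS-CoT+ replaces textual instructions with programmatic logic
    statements, sketched here as simple IF/THEN rules.
    """
    rules = "\n".join(
        f"IF the narrative supports '{cat}' THEN include '{cat}'."
        for cat in PRECONDITIONS
    )
    return (
        f"Witness narrative:\n{narrative}\n"
        f"Stage-1 unsafe acts identified:\n{stage1_result}\n\n"
        f"Apply these rules in order:\n{rules}"
    )
```

In use, the stage-1 prompt would be sent to a model such as GPT-4o, and its answer fed into `build_stage2_prompt` to condition the precondition inference on the unsafe acts already identified.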
| Original language | English |
| --- | --- |
| Article number | 126422 |
| Journal | Expert Systems with Applications |
| Volume | 269 |
| DOIs | |
| Publication status | Published - 15 Apr 2025 |
Keywords
- Accident investigation
- Chain of thought
- General aviation
- HFACS
- Large language models
- Witness narratives
ASJC Scopus subject areas
- General Engineering
- Computer Science Applications
- Artificial Intelligence