ENHANCING VISUAL-LLM THROUGH PROMPT ENGINEERING AND HYBRID RETRIEVAL-AUGMENTED GENERATION FOR SITE SAFETY COMPLIANCE CHECKING

  • Koi Xiaowen Guo
  • Peter Kok Yiu Wong
  • Jack C.P. Cheng
  • Xingyu Tao
  • Pak Him Leung

Research output: Chapter in book / Conference proceeding › Conference article published in proceeding or book › Academic research › peer-review

Abstract

The increasing prevalence of safety incidents on construction sites underscores the urgent need for enhanced monitoring. This study proposes an innovative hybrid Retrieval-Augmented Generation (RAG) algorithm to improve compliance-checking accuracy for site images. By integrating a Visual Language Model (VLM), we developed an algorithm capable of mastering domain knowledge without fine-tuning and of addressing the limitation of RAG technology in interpreting visual information. A three-phase prompting framework was designed to enhance the VLM's compliance analysis abilities. Experiments based on an actual construction site in Hong Kong demonstrated a 21.98% increase in retrieval accuracy.

Original language: English
Title of host publication: Proceedings of the 2025 European Conference on Computing in Construction and 42nd International CIB W78 Conference on Information Technology in Construction, 2025
Editors: Ekaterina Petrova, Marijana Srećković, Pedro Meda, Ranjith K. Soman, Daniel Hall, Jakob Beetz, Jenn McArthur
Publisher: European Council on Computing in Construction (EC3)
ISBN (Print): 9789083451312
DOIs
Publication status: Published - Jul 2025
Event: European Conference on Computing in Construction, EC3 2025 and 42nd International CIB W78 Conference on IT in Construction, 2025 - Porto, Portugal
Duration: 14 Jul 2025 to 17 Jul 2025

Publication series

Name: Proceedings of the European Conference on Computing in Construction
ISSN (Electronic): 2684-1150

Conference

Conference: European Conference on Computing in Construction, EC3 2025 and 42nd International CIB W78 Conference on IT in Construction, 2025
Country/Territory: Portugal
City: Porto
Period: 14/07/25 to 17/07/25

Keywords

  • Construction Site Safety
  • Image-based Monitoring
  • Multimodal Large Language Models
  • Retrieval-Augmented Generation (RAG)

ASJC Scopus subject areas

  • Information Systems
  • Building and Construction
