Enhancing Human Detection in Occlusion-Heavy Disaster Scenarios: A Visibility-Enhanced DINO (VE-DINO) Model with Reassembled Occlusion Dataset

Zi An Zhao, Shidan Wang, Min Xin Chen, Ye Jiao Mao, Andy Chi Ho Chan, Derek Ka Hei Lai, Duo Wai Chi Wong, James Chung Wai Cheung

Research output: Journal article publication › Journal article › Academic research › peer-review

2 Citations (Scopus)

Abstract

Highlights

What are the main findings?

  • Visibility-Enhanced DINO (VE-DINO): a modified DINO model designed to identify partially obscured individuals in disaster scenes. VE-DINO improves identification by incorporating key point information from body parts, allowing more accurate recognition even when parts of the body are obscured. The model also introduces a specialized visibility-aware loss function that assigns weights to different body parts based on their visibility. VE-DINO demonstrated superior performance compared to the original DINO under challenging conditions.
  • Disaster Occlusion Detection Dataset (DODD): a newly assembled dataset of disaster scenes with occluded individuals, used to test the VE-DINO model's improved performance in detecting partially visible people.

What is the implication of the main finding?

  • Faster victim location: a VE-DINO model deployed on an unmanned aerial vehicle (UAV) can enable quicker and more accurate identification of obscured individuals in disaster scenes, accelerating rescue efforts.

Natural disasters create complex environments where effective human detection is both critical and challenging, especially when individuals are partially occluded. While recent advances in computer vision have improved detection capabilities, there remains a significant need for efficient solutions that can enhance search-and-rescue (SAR) operations in resource-constrained disaster scenarios. This study modified the original DINO (Detection Transformer with Improved Denoising Anchor Boxes) model and introduced the Visibility-Enhanced DINO (VE-DINO) model, designed for robust human detection in occlusion-heavy environments, with potential integration into SAR systems. VE-DINO improves detection accuracy by incorporating body-part key point information and employing a specialized visibility-aware loss function. The model was trained and validated on the COCO2017 dataset, with additional external testing conducted on the Disaster Occlusion Detection Dataset (DODD), which we developed by compiling relevant images from existing public datasets to represent occlusion scenarios in disaster contexts. VE-DINO achieved an average precision of 0.615 at IoU 0.50:0.90 on all bounding boxes, outperforming the original DINO model (0.491) on the testing set. External testing of VE-DINO achieved an average precision of 0.500. An ablation study demonstrated the robustness of the model when confronted with varying degrees of body occlusion. Furthermore, to illustrate practicality, we conducted a case study demonstrating the usability of the model when integrated into a UAV-based SAR system, showcasing its potential in real-world scenarios.
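To make the visibility-aware weighting idea concrete, below is a minimal, hypothetical PyTorch sketch of a loss in which each body-part keypoint contributes in proportion to its annotated visibility, so occluded parts are down-weighted. This is not the paper's implementation; the function and parameter names (visibility_weighted_l1, part_weights) are illustrative assumptions only.

```python
# Illustrative sketch (not the paper's code): an L1 keypoint regression loss
# in which each body-part keypoint is weighted by its annotated visibility
# and by a hypothetical per-part importance weight.
import torch


def visibility_weighted_l1(pred_points: torch.Tensor,
                           gt_points: torch.Tensor,
                           visibility: torch.Tensor,
                           part_weights: torch.Tensor) -> torch.Tensor:
    """L1 loss over keypoints, weighted by visibility and per-part weights.

    pred_points, gt_points: (N, K, 2) predicted / ground-truth keypoint coordinates.
    visibility: (N, K) in [0, 1]; 0 = fully occluded, 1 = fully visible.
    part_weights: (K,) prior importance of each body part (hypothetical).
    """
    per_point = (pred_points - gt_points).abs().sum(dim=-1)  # (N, K) L1 per keypoint
    weights = visibility * part_weights.unsqueeze(0)          # down-weight occluded parts
    denom = weights.sum().clamp(min=1e-6)                     # avoid division by zero
    return (per_point * weights).sum() / denom


# Toy usage: 2 people, 4 keypoints each (e.g., head, torso, left/right limbs).
pred = torch.rand(2, 4, 2)
gt = torch.rand(2, 4, 2)
vis = torch.tensor([[1.0, 1.0, 0.0, 0.0],   # first person's limbs occluded
                    [1.0, 0.0, 1.0, 1.0]])  # second person's torso occluded
w = torch.tensor([1.0, 1.0, 0.5, 0.5])      # hypothetical per-part weights
print(visibility_weighted_l1(pred, gt, vis, w))
```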

Original language: English
Article number: 12
Journal: Smart Cities
Volume: 8
Issue number: 1
DOIs
Publication status: Published - 16 Jan 2025

Keywords

  • deep learning
  • DINO
  • Disaster Occlusion Detection Dataset (DODD)
  • human detection
  • natural disasters
  • occlusion detection
  • resource-constrained environments
  • SAR operations
  • UAVs

ASJC Scopus subject areas

  • Urban Studies
  • Artificial Intelligence
  • Electrical and Electronic Engineering
