How Do We Team Up? Human-Machine Co-driving Style Assessment Through Visual Dynamic Analysis and Vision-Language Model

  • Zhuorui Zhang
  • Donglin Li
  • Yiteng Sun
  • Ching Hung Lee
  • Shanshan Feng
  • Fan Li

Research output: Chapter in book / Conference proceeding; Conference article published in proceeding or book; Academic research; peer-review

Abstract

As autonomous driving technology advances, understanding human-machine co-driving styles becomes increasingly crucial. The integration of autonomous systems influences traditional human driving styles, resulting in diverse co-driving dynamics. This paper focuses on generating comprehensive driver behavior reports using a Vision-Language Model (VLM) to assess human-machine co-driving styles. The assessment is challenged by complex driving scenarios, individual differences, and unclear style indicators. To address these challenges, we introduce the Adaptive Virtual Co-driving Assessing (AVCA) framework. This framework employs a virtual reality platform and combines deep learning models with VLMs. By simulating critical driving scenarios in virtual environments, the framework enables efficient and cost-effective collection of multimodal signals. Deep learning models process temporal and visual data to identify key factors impacting driving safety and efficiency, such as vehicle dynamics and visual attention distribution. To enhance interpretability, the framework integrates these feature abstractions with Co-driving Assessing Thoughts, allowing VLMs to generate driver behavior reports that are clear and actionable for human drivers. Testing results demonstrate the framework’s advanced capabilities in multimodal signal extraction and analysis across diverse participants. By integrating multimodal fusion with model collaboration, the AVCA framework provides personalized assessments and feedback, fostering safer and more efficient driving practices.

Original language: English
Title of host publication: Cross-Cultural Design - 17th International Conference, CCD 2025, Held as Part of the 27th HCI International Conference, HCII 2025, Proceedings
Editors: Pei-Luen Patrick Rau
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 287-304
Number of pages: 18
ISBN (Print): 9783031937323
Publication status: Published - May 2025
Event: 17th International Conference on Cross-Cultural Design, CCD 2025, held as part of the 27th HCI International Conference, HCII 2025 - Gothenburg, Sweden
Duration: 22 Jun 2025 to 27 Jun 2025

Publication series

Name: Lecture Notes in Computer Science
Volume: 15783 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 17th International Conference on Cross-Cultural Design, CCD 2025, held as part of the 27th HCI International Conference, HCII 2025
Country/Territory: Sweden
City: Gothenburg
Period: 22/06/25 to 27/06/25

Keywords

  • Driver Behavior Report
  • Human-Machine Co-driving
  • Vision-Language Model

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
