TY - GEN
T1 - How Do We Team Up? Human-Machine Co-driving Style Assessment Through Visual Dynamic Analysis and Vision-Language Model
AU - Zhang, Zhuorui
AU - Li, Donglin
AU - Sun, Yiteng
AU - Lee, Ching Hung
AU - Feng, Shanshan
AU - Li, Fan
N1 - Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.
PY - 2025/5
Y1 - 2025/5
AB - As autonomous driving technology advances, understanding human-machine co-driving styles becomes increasingly crucial. The integration of autonomous systems influences traditional human driving styles, resulting in diverse co-driving dynamics. This paper focuses on generating comprehensive driver behavior reports using a Vision-Language Model (VLM) to assess human-machine co-driving styles. The assessment is challenged by complex driving scenarios, individual differences, and unclear style indicators. To address these challenges, we introduce the Adaptive Virtual Co-driving Assessing (AVCA) framework. This framework employs a virtual reality platform and combines deep learning models with VLMs. By simulating critical driving scenarios in virtual environments, the framework enables efficient and cost-effective collection of multimodal signals. Deep learning models process temporal and visual data to identify key factors impacting driving safety and efficiency, such as vehicle dynamics and visual attention distribution. To enhance interpretability, the framework integrates these feature abstractions with Co-driving Assessing Thoughts, allowing VLMs to generate driver behavior reports that are clear and actionable for human drivers. Testing results demonstrate the framework’s advanced capabilities in multimodal signal extraction and analysis across diverse participants. By integrating multimodal fusion with model collaboration, the AVCA framework provides personalized assessments and feedback, fostering safer and more efficient driving practices.
KW - Driver Behavior Report
KW - Human-Machine Co-driving
KW - Vision-Language Model
UR - https://www.scopus.com/pages/publications/105009208516
U2 - 10.1007/978-3-031-93733-0_19
DO - 10.1007/978-3-031-93733-0_19
M3 - Conference article published in proceeding or book
AN - SCOPUS:105009208516
SN - 9783031937323
T3 - Lecture Notes in Computer Science
SP - 287
EP - 304
BT - Cross-Cultural Design - 17th International Conference, CCD 2025, Held as Part of the 27th HCI International Conference, HCII 2025, Proceedings
A2 - Rau, Pei-Luen Patrick
PB - Springer Science and Business Media Deutschland GmbH
T2 - 17th International Conference on Cross-Cultural Design, CCD 2025, held as part of the 27th HCI International Conference, HCII 2025
Y2 - 22 June 2025 through 27 June 2025
ER -