Human-Intention Prediction with Visual-Language Model

Yongshi Liang, Pai Zheng

Research output: Chapter in book / Conference proceeding > Conference article published in proceeding or book > Academic research > peer-review

Abstract

Human-intention prediction is an important part of human-machine interaction and is widely used in industrial intelligent systems. In recent years, large language models (LLMs) have been extended to image tasks with outstanding performance, drawing growing interest in applications of multimodal LLMs. However, the exploration of visual-language models for human-intention prediction is still limited. To address this gap, this paper investigates the effectiveness of visual-language models in predicting human intentions and transfers the knowledge in LLMs to downstream classification tasks. Finally, the paper takes traffic scenarios as an example to validate the feasibility of the Video-LLaMA model in predicting pedestrian behavior intentions.

Original language: English
Title of host publication: 2024 International Conference on Automation in Manufacturing, Transportation and Logistics, ICaMaL 2024
Publisher: Institute of Electrical and Electronics Engineers Inc.
Number of pages: 7
ISBN (Electronic): 9798350378658
ISBN (Print): 9798350378665
Publication status: Published - Aug 2024
Event: 2024 International Conference on Automation in Manufacturing, Transportation and Logistics, ICaMaL 2024 - Hong Kong, Hong Kong
Duration: 7 Aug 2024 to 9 Aug 2024

Publication series

Name: 2024 International Conference on Automation in Manufacturing, Transportation and Logistics, ICaMaL 2024

Conference

Conference: 2024 International Conference on Automation in Manufacturing, Transportation and Logistics, ICaMaL 2024
Country/Territory: Hong Kong
City: Hong Kong
Period: 7/08/24 to 9/08/24

Keywords

  • Human-intention prediction
  • Human-machine interaction
  • Pedestrian intention
  • Visual-language model

ASJC Scopus subject areas

  • Strategy and Management
  • Artificial Intelligence
  • Computer Vision and Pattern Recognition
  • Information Systems and Management
  • Management Science and Operations Research
  • Control and Optimization
  • Modelling and Simulation
  • Transportation
