Vision-Language Model-Based Human-Guided Mobile Robot Navigation in an Unstructured Environment for Human-Centric Smart Manufacturing

  • Tian Wang
  • Junming Fan (Corresponding Author)
  • Pai Zheng (Corresponding Author)
  • Ruqiang Yan
  • Lihui Wang

Research output: Journal article publication, Journal article, Academic research, peer-review

Abstract

In smart manufacturing, autonomous mobile robots play an indispensable role in conducting inspection and material handling operations, yet they face significant limitations regarding adaptability and resilience within unstructured environments. Vision and language navigation (VLN), a human-guided navigation paradigm, emerges as a compelling solution to these challenges. Nevertheless, VLN’s practical implementation is constrained by limited task generalization capabilities, inadequate response to diverse linguistic commands, and insufficient consideration of sensor-induced noise in environmental perception. This research addresses these limitations by introducing an innovative vision-language model (VLM)-based human-guided mobile robot navigation approach in an unstructured environment for human-centric smart manufacturing (HSM). This approach encompasses robust three-dimensional (3D) scene reconstruction through advanced point cloud techniques, zero-shot semantic segmentation via a VLM, and natural language processing through a large language model (LLM) to interpret instructions and generate control code for navigation. The system’s efficacy is validated through extensive experiments in an unstructured manufacturing setup.

Original language: English
Pages (from-to): ecopy
Number of pages: 11
Journal: Engineering
DOIs
Publication status: Accepted/In press - 2025

Keywords

  • Human-centric smart manufacturing
  • Human-robot interaction
  • Large language model
  • Mobile robot navigation
  • Vision-language model

ASJC Scopus subject areas

  • Environmental Engineering
  • General Computer Science
  • Materials Science (miscellaneous)
  • General Chemical Engineering
  • Energy Engineering and Power Technology
  • General Engineering
