A vision-language-guided robotic action planning approach for ambiguity mitigation in human–robot collaborative manufacturing

Junming Fan, Pai Zheng

Research output: Journal article publication › Journal article › Academic research › peer-review

12 Citations (Scopus)

Abstract

Human–robot collaboration (HRC) has been recognized as a potent pathway towards mass personalization in the manufacturing sector, by leveraging the synergy of human creativity and robotic precision. Previous approaches rely heavily on visual perception to autonomously comprehend the HRC environment. However, the inherent ambiguity in human–robot communication cannot be consistently neutralized by relying solely on visual cues. With the recently soaring popularity of large language models (LLMs), the consideration of language data as a complementary information source has increasingly drawn research attention, while the application of such large models, particularly within the context of HRC in manufacturing, remains largely under-explored. In response to this gap, a vision-language reasoning approach is proposed to mitigate the communication ambiguity prevalent in human–robot collaborative manufacturing scenarios. A referred object retrieval model is first designed to alleviate the object–reference ambiguity in the human language command. This model is then integrated into an LLM-based robotic action planner to achieve improved HRC performance. The effectiveness of the proposed approach is demonstrated empirically through a series of experiments conducted on the object retrieval model and its application in a human–robot collaborative assembly case.

Original language: English
Pages (from-to): 1009-1018
Number of pages: 10
Journal: Journal of Manufacturing Systems
Volume: 74
Publication status: Published - Jun 2024

Keywords

  • Computer vision
  • Deep learning
  • Human–robot collaboration
  • Smart manufacturing
  • Vision-language reasoning

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Software
  • Hardware and Architecture
  • Industrial and Manufacturing Engineering
