Toward Proactive Human–Robot Collaborative Assembly: A Multimodal Transfer-Learning-Enabled Action Prediction Approach

Shufei Li, Pai Zheng, Junming Fan, Lihui Wang

Research output: Journal article publication › Journal article › Academic research › peer-review

56 Citations (Scopus)


Human-robot collaborative assembly (HRCA) is vital for achieving high-level flexible automation for mass personalization in today's smart factories. However, existing works in both industry and academia mainly focus on adaptive robot planning and seldom consider the human operator's intentions in advance, which hinders the transition of HRCA toward a proactive manner. To overcome this bottleneck, this article proposes a multimodal transfer-learning-enabled action prediction approach, serving as a prerequisite for proactive HRCA. First, a multimodal intelligence-based action recognition approach is proposed to predict ongoing human actions by leveraging the visual stream and skeleton stream with short-time input frames. Second, a transfer-learning-enabled model is adapted to rapidly transfer knowledge learnt from daily activities to industrial assembly operations for online operator intention analysis. Third, a dynamic decision-making mechanism, including robotic decision and motion control, is described to allow mobile robots to assist operators in a proactive manner. Finally, an aircraft bracket assembly task is demonstrated in a laboratory environment, and the comparative study shows that the proposed approach outperforms other state-of-the-art ones for efficient action prediction.

Original language: English
Pages (from-to): 8579-8588
Number of pages: 10
Journal: IEEE Transactions on Industrial Electronics
Issue number: 8
Publication status: Published - 1 Aug 2022


Keywords

  • Action recognition
  • human-robot collaboration
  • multimodal intelligence
  • transfer learning

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Electrical and Electronic Engineering

