
SDR: Stackelberg-based deep reinforcement learning for multi-skill spatiotemporal task allocation in AIoT systems

  • Yu Li
  • Fengya Yin
  • Yihao Zheng
  • Wenjian Xu (Corresponding Author)
  • Jung Yoon Kim (Corresponding Author)
  • Zhe Peng (Corresponding Author)

Research output: Journal article publication › Journal article › Academic research › peer-review

Abstract

In AIoT-based multi-skill environments, task allocation is a complex process that involves multiple constraints and worker acceptance rates. However, existing studies often overlook worker acceptance rates and fail to properly balance the interests of both workers and requesters. To address this, we propose SDR, a system based on a dual Dueling DQN model in deep reinforcement learning, designed to maximize the long-term utility of all participants while considering user acceptance rates and demand constraints. SDR introduces targeted enhancements in state, action, and reward design to balance acceptance rates with spatiotemporal and skill constraints, optimizing both immediate and long-term task allocation performance. To resolve conflicts of interest, we integrate Pareto optimization into the Q-value computation and action selection. For scenarios where interests align, we adopt Stackelberg game theory to refine the reward mechanism. Extensive simulations on both synthetic and real-world datasets validate the effectiveness of our approach in improving task allocation and pricing strategies.
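
The abstract does not give the exact formulation, so the following is only a minimal sketch of how Pareto filtering could sit on top of two dueling Q-value heads, one per party. The function names, the two-objective (worker, requester) setup, and the summed-Q tie-break are all assumptions made for illustration and are not taken from the paper; only the dueling aggregation Q(s, a) = V(s) + A(s, a) - mean(A) is the standard Dueling DQN form.

```python
from typing import List, Sequence


def dueling_q(value: float, advantages: Sequence[float]) -> List[float]:
    """Standard dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean(A(s, .))."""
    mean_adv = sum(advantages) / len(advantages)
    return [value + a - mean_adv for a in advantages]


def pareto_front(q_worker: Sequence[float], q_requester: Sequence[float]) -> List[int]:
    """Return indices of actions whose (worker, requester) Q-pair is not dominated."""
    n = len(q_worker)
    front = []
    for i in range(n):
        # Action j dominates i if it is at least as good for both parties
        # and strictly better for at least one of them.
        dominated = any(
            q_worker[j] >= q_worker[i]
            and q_requester[j] >= q_requester[i]
            and (q_worker[j] > q_worker[i] or q_requester[j] > q_requester[i])
            for j in range(n)
        )
        if not dominated:
            front.append(i)
    return front


def select_action(q_worker: Sequence[float], q_requester: Sequence[float]) -> int:
    """Pick one Pareto-optimal action; the summed-Q rule is a hypothetical tie-break."""
    front = pareto_front(q_worker, q_requester)
    return max(front, key=lambda i: q_worker[i] + q_requester[i])


# Toy values standing in for the outputs of two trained networks (assumed, not real data).
q_w = dueling_q(1.0, [0.0, 1.0, 2.0])    # worker-side Q-values
q_r = dueling_q(2.0, [1.0, 1.5, -2.5])   # requester-side Q-values
chosen = select_action(q_w, q_r)
```

In this toy case the first action is dominated by the second, so only the second and third survive the filter before the tie-break is applied. A real system would replace the scalar inputs with network outputs and would likely restrict the candidate set to actions that satisfy the spatiotemporal and skill constraints first.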

Original language: English
Article number: 108283
Number of pages: 14
Journal: Computer Communications
Volume: 242
Publication status: Published - 1 Oct 2025

Keywords

  • AIoT
  • Deep Q-network
  • Incentive mechanism
  • Stackelberg game
  • Task allocation

ASJC Scopus subject areas

  • Computer Networks and Communications
