Quantization-aware distributed deep reinforcement learning for dynamic multi-robot scheduling

Peng Song, Yichen Xiao, Kaixin Cui, Junzheng Wang, Dawei Shi*

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-review

Abstract

In intelligent port logistics, container stevedoring operations face growing challenges in orchestrating fleets of robots: real-time task scheduling must handle high-dimensional state spaces and dynamically evolving environments under stringent computational constraints. Traditional approaches, categorized as exact methods and approximate metaheuristics, struggle to balance solution quality and real-time responsiveness as task complexity grows exponentially. Recent deep reinforcement learning (DRL) methods improve adaptability in dynamic settings, but they suffer from high computational overhead and deployment latency, which limits their practicality in time-sensitive port operations. To address these limitations, this work proposes a distributed deep reinforcement learning (DDRL) framework. The framework exploits the independence between ports to perform action selection and decision-making in parallel, thereby alleviating computational pressure and improving operational efficiency. It is further enhanced with a teammate collaboration model and a greedy MaxNextQ policy, which enables the network to identify and approach promising actions associated with increasing Q-values. To improve deployment efficiency, a quantization-aware training (QAT) method is introduced: pseudo-quantization nodes are added during training, which reduces quantization-induced errors. The effectiveness of the proposed DDRL algorithm is validated through simulations under three distinct workload scenarios, obtained by varying the robot-to-port ratio. The simulation results demonstrate that, compared with centralized DRL approaches, the proposed approach achieves deployment rate improvements of 22.95%, 15.09%, and 23.37%, while simultaneously improving objective scores by 5.75%, 6.32%, and 7.05%.
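The abstract does not specify how the pseudo-quantization nodes are built, so the following is only a minimal sketch of the general idea, assuming uniform affine quantization to 8 bits. The function name `fake_quantize` and its parameters are hypothetical and are not taken from the paper. A pseudo-quantization node rounds values onto the integer grid and immediately maps them back to floating point, so the network sees the rounding error during training.

```python
def fake_quantize(values, num_bits=8):
    """Quantize-dequantize round trip (assumed uniform affine scheme).

    Hypothetical helper for illustration; the paper's actual scheme
    is not described in the abstract.
    """
    qmin, qmax = 0, 2 ** num_bits - 1
    # Extend the range to include 0.0 so that zero is exactly representable.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    if hi == lo:
        return list(values)  # all values are zero; nothing to quantize

    scale = (hi - lo) / (qmax - qmin)
    zero_point = round(qmin - lo / scale)
    zero_point = max(qmin, min(qmax, zero_point))

    result = []
    for v in values:
        q = round(v / scale) + zero_point   # map to the integer grid
        q = max(qmin, min(qmax, q))         # clamp to the representable range
        result.append((q - zero_point) * scale)  # map back to float
    return result


weights = [-1.0, -0.25, 0.0, 0.5, 1.0]
simulated = fake_quantize(weights)
errors = [abs(w, ) if False else abs(w - s) for w, s in zip(weights, simulated)]
print(max(errors))
```

In a training framework, such a node would typically sit in the forward pass while gradients are passed through the rounding step unchanged (a straight-through estimator); that part is omitted here.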

Original language: English
Article number: 129027
Journal: Expert Systems with Applications
Volume: 296
Publication status: Published - 15 Jan 2026
Externally published: Yes

Keywords

  • Distributed deep reinforcement learning
  • Dynamic task scheduling
  • Multi-robot system
  • Quantization-aware training
