Articles | Volume 17, issue 1
https://doi.org/10.5194/ms-17-671-2026
Research article | 25 Jun 2026

Curriculum-learning-driven hierarchical multi-agent deep reinforcement learning for collaborative scheduling in complex supply chain networks

Jingya Dong, Han Zhao, Suyi Zhao, Yijie Wang, Mengfan Guo, Chunhe Song, and Mingliang Xu

Cited articles

Amirteimoori, A., Tirkolaee, E. B., Simic, V., and Weber, G.-W.: A parallel heuristic for hybrid job shop scheduling problem considering conflict-free AGV routing, Swarm Evol. Comput., 79, 101312, https://doi.org/10.1016/j.swevo.2023.101312, 2023.
Atasagun, G. C. and Karaoğlan, İ.: Integrated production and outbound distribution scheduling problem with multiple facilities/vehicles and perishable items, Appl. Soft Comput., 166, 112144, https://doi.org/10.1016/j.asoc.2024.112144, 2024.
Chang, X., Jia, X., and Hu, H.: Energy-efficient and self-adaptive AGV scheduling approach based on hierarchical reinforcement learning for flexible shop floor, Comput. Ind. Eng., 205, 111140, https://doi.org/10.1016/j.cie.2025.111140, 2025.
Chen, J., Zhang, Y., Xu, Y., Ma, H., Yang, H., Song, J., Wang, Y., and Wu, Y.: Variational automatic curriculum learning for sparse-reward cooperative multi-agent problems, in: Advances in Neural Information Processing Systems (NeurIPS), 34, 2021.
Di, Y., Deng, L., and Zhang, L.: A collaborative-learning multi-agent reinforcement learning method for distributed hybrid flow shop scheduling problem, Swarm Evol. Comput., 91, 101764, https://doi.org/10.1016/j.swevo.2024.101764, 2024.
Short summary
Modern supply chains must coordinate customer orders, factory choices, and delivery routes while conditions change quickly. This study developed a learning-based scheduling method that breaks the whole task into connected decisions and trains them step by step. Tests in simulated networks showed faster learning, shorter completion times, and better performance in unfamiliar settings. The results suggest a practical way to improve coordination in production and delivery systems.