A deep reinforcement learning system for the allocation of epidemic prevention materials based on DDPG

Kotcharat Kitchat, Meng Hong Lin, Hao Sheng Chen, Min Te Sun, Kazuya Sakai, Wei Shinn Ku, Thattapon Surasak

Research output: Contribution to journal › Article › peer-review

2 Scopus citations

Abstract

COVID-19 has spread rapidly around the world since the end of 2019. Consequently, the demand for epidemic prevention materials (e.g., medical-grade masks) has increased drastically, and medical-grade masks have become a necessary item for everyone. In response to the shortage of medical-grade masks in Taiwan, the government collected and managed them at the early stage and sold them at a fixed price. However, the government must distribute masks around the country to prevent the movement of people who could spread COVID-19. Moreover, the demand for medical supplies during the pandemic has become more complex and dynamic. This study proposes a robust system for allocating epidemic prevention materials. The proposed approach adopts the reinforcement learning framework: the daily supply and demand for masks form the environment, the Deep Deterministic Policy Gradient (DDPG) algorithm updates the agent, and the daily shortage serves as the reward and punishment signal. The proposed system is compared experimentally with the traditional machine learning approach used for supply chain demand forecasting. The results indicate that the proposed method is superior in terms of the number of pharmacies with over-stocked masks and the rewards obtained. Moreover, the proposed system performs consistently under different total numbers of masks.
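The formulation described in the abstract (daily supply and demand as the environment, daily shortage as the reward signal) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the class name `MaskAllocationEnv`, the uniform demand model, the normalization of actions into allocation shares, and the exact reward definition are all hypothetical. The DDPG agent itself is not implemented here; in a full system, its actor network would produce the continuous `action` vector passed to `step`.

```python
import random


class MaskAllocationEnv:
    """Toy allocation environment (hypothetical; not the paper's implementation)."""

    def __init__(self, n_pharmacies=4, daily_supply=1000, seed=0):
        self.n = n_pharmacies
        self.daily_supply = daily_supply
        self.rng = random.Random(seed)
        self.demand = self._sample_demand()

    def _sample_demand(self):
        # Assumed demand model: independent uniform daily demand per pharmacy.
        return [self.rng.randint(100, 400) for _ in range(self.n)]

    def reset(self):
        # Start a new episode and return the observed demand as the state.
        self.demand = self._sample_demand()
        return list(self.demand)

    def step(self, action):
        # action: one non-negative weight per pharmacy, normalized to shares
        # of the daily supply (a continuous action, as DDPG expects).
        total = sum(action)
        if total <= 0:
            shares = [1.0 / self.n] * self.n
        else:
            shares = [a / total for a in action]
        allocation = [self.daily_supply * s for s in shares]

        # Shortage is unmet demand; surplus marks an over-stocked pharmacy.
        shortage = [max(d - a, 0.0) for d, a in zip(self.demand, allocation)]
        surplus = [max(a - d, 0.0) for d, a in zip(self.demand, allocation)]

        # Assumed reward: the negative total shortage for the day.
        reward = -sum(shortage)
        info = {
            "shortage": shortage,
            "overstocked": sum(1 for s in surplus if s > 0),
        }

        # Draw the next day's demand and return it as the next state.
        self.demand = self._sample_demand()
        return list(self.demand), reward, info


if __name__ == "__main__":
    env = MaskAllocationEnv()
    state = env.reset()
    # Placeholder policy: allocate in proportion to observed demand.
    next_state, reward, info = env.step(state)
    print(reward, info["overstocked"])
```

In this sketch the over-stocked count is reported only as diagnostic information, mirroring the two evaluation measures named in the abstract; whether over-stocking also enters the paper's reward is not stated there.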

Original language: English
Article number: 122763
Journal: Expert Systems with Applications
Volume: 242
State: Published - 15 May 2024

Keywords

  • Deep deterministic policy gradient
  • Deep learning
  • Medical-grade masks
  • Reinforcement learning
  • Supply chain management
