Recurrent Learning on PM2.5 Prediction Based on Clustered Airbox Dataset

Chia Yu Lo, Wen Hsing Huang, Ming Feng Ho, Min Te Sun, Ling Jyh Chen, Kazuya Sakai, Wei Shinn Ku

Research output: Contribution to journal › Article › peer-review

6 Scopus citations


The progress of industrial development naturally increases the demand for electrical power. Because of concerns about the safety of nuclear power plants, many countries rely on thermal power plants, which emit more air pollutants as they burn coal. Together with rising vehicle emissions, this has become a primary cause of serious air pollution. Inhaling too much particulate matter, especially PM2.5, may lead to respiratory diseases and even death. By predicting air pollutant concentrations, people can take precautions to avoid overexposure, so accurate PM2.5 prediction is increasingly important. In this study, we propose a PM2.5 prediction system that uses datasets from EdiGreen Airbox and the Taiwan EPA. An autoencoder and linear interpolation are adopted to handle missing values. Spearman's correlation coefficient is used to identify the features most relevant to PM2.5. Two prediction models (i.e., LSTM and LSTM based on K-means) are implemented to predict the PM2.5 value for each Airbox device. To assess prediction performance, the daily average error and the hourly average accuracy over a one-week period are calculated. The experimental results show that LSTM based on K-means performs best among all methods. Therefore, LSTM based on K-means is chosen to provide real-time PM2.5 prediction through the Linebot.
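Two of the preprocessing steps named in the abstract, linear interpolation of missing values and feature ranking by Spearman's correlation coefficient, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the edge-gap handling, the zero-variance convention, and the sample readings are all hypothetical assumptions chosen for illustration, and the autoencoder, K-means, and LSTM stages are not shown.

```python
from statistics import mean


def interpolate_linear(series):
    """Fill None gaps linearly between the nearest observed neighbours.

    Assumption: leading and trailing gaps copy the nearest observed value.
    """
    known = [i for i, v in enumerate(series) if v is not None]
    if not known:
        raise ValueError("series has no observed values")
    filled = list(series)
    for i in range(known[0]):  # leading gap
        filled[i] = series[known[0]]
    for i in range(known[-1] + 1, len(series)):  # trailing gap
        filled[i] = series[known[-1]]
    for left, right in zip(known, known[1:]):  # interior gaps
        span = right - left
        for i in range(left + 1, right):
            frac = (i - left) / span
            filled[i] = series[left] + frac * (series[right] - series[left])
    return filled


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: treat a constant series as uncorrelated
    return cov / (var_x * var_y) ** 0.5


# Hypothetical hourly readings from one device (None marks a missing value).
pm25 = [12.0, None, 18.0, 21.0, None, None, 30.0]
features = {
    "humidity": [80, 75, 70, 66, 60, 55, 50],
    "temperature": [25, 27, 24, 28, 26, 29, 23],
}

pm25_filled = interpolate_linear(pm25)
# Rank candidate features by the strength of their monotonic relation to PM2.5.
ranked = sorted(
    features,
    key=lambda name: abs(spearman(features[name], pm25_filled)),
    reverse=True,
)
print(pm25_filled)
print(ranked)
```

Spearman's coefficient works on ranks rather than raw values, so it captures any monotonic relationship and is less sensitive to outliers than Pearson's coefficient, which is one plausible reason to prefer it for noisy sensor data.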

Original language: English
Pages (from-to): 4994-5008
Number of pages: 15
Journal: IEEE Transactions on Knowledge and Data Engineering
Issue number: 10
State: Published - 1 Oct 2022


  • Air quality
  • clustering
  • prediction
  • recurrent neural network

