Creating a spatially continuous air temperature dataset for Taiwan using thermal remote-sensing data and machine learning algorithms

Duy Phien Tran, Yuei An Liou

Research output: Contribution to journal › Article › peer-review

7 Scopus citations

Abstract

Weather stations provide accurate air temperature (Ta) measurements at high temporal resolution, but their sparse distribution limits spatial coverage. Satellites, in contrast, provide land surface temperature (LST) observations with high spatial coverage, and LST is strongly related to Ta, which makes it well suited to improving Ta estimation. This study uses satellite-derived and auxiliary data to create a monthly mean Ta dataset at 1 km resolution over Taiwan from 2003 to 2020. We tested three machine learning (ML) algorithms and seven datasets comprising 12 explanatory variables, with LST obtained from MODIS, to find the optimal combination of algorithm and dataset for Ta estimation in Taiwan. We applied recursive feature elimination (RFE) to reduce model complexity and mitigate overfitting. We evaluated the ML models with five-fold cross-validation, using the coefficient of determination (R²), mean absolute error (MAE), and root mean square error (RMSE) as indicators. The XGB regressor achieved the highest accuracy of the three models. RFE with the XGB model selected eight variables: nighttime LST, daytime LST, elevation, longitude, latitude, distance to the sea, month, and year. Based on the variable importance analysis, nighttime LST was the most important variable, followed by daytime LST and month. The final monthly Ta dataset produced with the XGB model had excellent five-fold cross-validated performance (R² = 0.986, MAE = 0.477 °C, and RMSE = 0.639 °C). Furthermore, the XGB model performed well in all four seasons and had high, consistent accuracy across months, years, and subsets, indicating its potential for accurately estimating Ta over Taiwan's complex topography and varying climate conditions. The resulting monthly Ta dataset can serve as an essential input for environmental studies.
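
The evaluation protocol named in the abstract (five-fold cross-validation scored with R², MAE, and RMSE) can be illustrated with a minimal, standard-library sketch. This is not the authors' code. It makes several assumptions: a one-variable least-squares line stands in for the XGB regressor, the data are synthetic values invented for illustration, the metrics are computed on pooled out-of-fold predictions, and all function names (`fit_line`, `regression_metrics`, `k_fold_indices`, `cross_validate`) are hypothetical.

```python
import math
import random


def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x (a simple stand-in for the XGB regressor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def regression_metrics(y_true, y_pred):
    """Return R2, MAE and RMSE for paired observed and predicted values."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "mae": sum(abs(e) for e in errors) / n,
        "rmse": math.sqrt(ss_res / n),
    }


def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle sample indices and deal them into k disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(xs, ys, k=5, seed=0):
    """Score pooled out-of-fold predictions from k-fold cross-validation."""
    y_true, y_pred = [], []
    for test_idx in k_fold_indices(len(xs), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(xs)) if i not in held_out]
        # Fit only on the training folds, then predict the held-out fold.
        a, b = fit_line([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        for i in test_idx:
            y_true.append(ys[i])
            y_pred.append(a + b * xs[i])
    return regression_metrics(y_true, y_pred)


# Synthetic example only: fake nighttime LST values and a noisy linear "Ta".
rng = random.Random(42)
lst_night = [rng.uniform(5.0, 28.0) for _ in range(200)]
ta = [1.5 + 0.9 * x + rng.gauss(0.0, 0.5) for x in lst_night]
scores = cross_validate(lst_night, ta)
print(scores)
```

Each sample is predicted exactly once by a model that never saw it during fitting, which is what makes the cross-validated scores an estimate of out-of-sample accuracy. Averaging per-fold metrics instead of pooling predictions is an equally common convention; the abstract does not say which one the study used.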

Original language: English
Article number: 111469
Journal: Ecological Indicators
Volume: 158
State: Published - Jan 2024

Keywords

  • Air temperature
  • Land surface temperature
  • MODIS
  • Machine learning
  • XGB
