TY - JOUR
T1 - Extrapolative Machine Learning for Accurate Efficiency Prediction in Non-Fullerene Ternary Organic Solar Cells
T2 - Leveraging Computable Molecular Descriptors in High-Throughput Virtual Screening
AU - Liao, Jian Ming
AU - Tsai, Hui Hsu Gavin
N1 - Publisher Copyright: © 2024 Wiley-VCH GmbH.
PY - 2024/7
Y1 - 2024/7
AB - Adding a third component to binary organic solar cells (OSCs) enhances ternary OSCs, boosting power conversion efficiency (PCE). However, developing and optimizing appropriate donors, acceptors, and ternary materials remains a complex and demanding task. This study presents four machine-learning (ML) predictive models using XGBoost and ANN approaches, utilizing both experimental and DFT-derived HOMO and LUMO levels for efficient high-throughput virtual screening (HTVS) of top candidates based on PCE. Two distinct latent databases were employed for HTVS: one consisting of 429 413 uniquely recombined ternary OSC systems from experimentally available data, and another comprising ≈2.3 million unique donor molecules from the Harvard Clean Energy Project database (CEPDB). The four ML models demonstrated notable predictive accuracy for PCE on a test dataset containing molecules closely aligned with the training set (interpolation). However, the XGBoost model showed constrained extrapolative ability for molecules significantly divergent from those in the training dataset. In contrast, the ANN models displayed a robust extrapolative capacity in HTVS, successfully predicting new potential ternary OSC systems and leading donors with PCE values exceeding 20%. Our ML models use HOMO and LUMO inputs for donors, acceptors, and ternaries, facilitating efficient optimization via rapid HTVS of high-performance ternary materials.
KW - clean energy
KW - computable molecular descriptor
KW - high throughput virtual screening
KW - predictive machine learning model
KW - ternary organic solar cells
UR - http://www.scopus.com/inward/record.url?scp=85196166742&partnerID=8YFLogxK
U2 - 10.1002/solr.202400287
DO - 10.1002/solr.202400287
M3 - Journal article
AN - SCOPUS:85196166742
SN - 2367-198X
VL - 8
JO - Solar RRL
JF - Solar RRL
IS - 13
M1 - 2400287
ER -