TY - JOUR
T1 - SME default prediction framework with the effective use of external public credit data
AU - Luo, Zhichao
AU - Hsu, Pingyu
AU - Xu, Ni
N1 - Publisher Copyright: © 2020 by the authors.
PY - 2020/9
Y1 - 2020/9
N2 - Traditional default prediction models rely mainly on financial data. However, financial data on small and medium-sized enterprises (SMEs) are difficult to obtain, and even when they are available, their opacity may hinder analysis. Traditional prediction models therefore encounter serious problems when used to predict SME defaults. In this paper, a novel prediction framework that uses only external public credit data is proposed. The external public credit data include SMEs' basic information (BI), credit information from the government (CIG), and court verdict information (CVI), all of which can be collected from publicly accessible websites. Records on 15,605 sample companies were collected from approximately 300,000 companies. Among them, 8,183 had defaulted. The empirical data were used to construct prediction models with logistic regression, the classification and regression tree (CART) model, and LightGBM. The best results achieved an accuracy of 0.87 and an area under the receiver operating characteristic curve (AUC) of 0.92. The results show that a model using only external credit data has significant predictive ability, and that CIG variables offer the best predictive capacity.
KW - Credit risk
KW - Default prediction
KW - External credit data
KW - Small and medium-sized enterprises (SMEs)
UR - http://www.scopus.com/inward/record.url?scp=85091382650&partnerID=8YFLogxK
U2 - 10.3390/su12187575
DO - 10.3390/su12187575
M3 - Journal article
AN - SCOPUS:85091382650
SN - 2071-1050
VL - 12
JO - Sustainability (Switzerland)
JF - Sustainability (Switzerland)
IS - 18
M1 - 7575
ER -