The data sampling effect on financial distress prediction by single and ensemble learning techniques

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)

Abstract

Datasets for financial distress prediction are usually class-imbalanced. In the literature, data sampling is one of the most widely used solutions to the class imbalance problem. This article examines the effect of data sampling on financial distress prediction models built with single and ensemble learning techniques. The experiments use three bankruptcy prediction and credit scoring datasets, on which twelve different single classifiers and classifier ensembles are constructed. We find that although some prediction models trained on the original class-imbalanced datasets provide reasonable AUC, their type II errors are too high for practical use. However, when data sampling is applied to the datasets, all of the prediction models slightly increase their AUC and substantially reduce their type II errors. In particular, decision tree ensembles built by bagging and boosting are the better choices for financial distress prediction.

Original language: English
Pages (from-to): 4344-4355
Number of pages: 12
Journal: Communications in Statistics - Theory and Methods
Volume: 52
Issue number: 12
Publication status: Published - 2023
