Missing value imputation and the effect of feature normalisation on financial distress prediction

Research output: Contribution to journal › Article › peer-review

2 Scopus citations

Abstract

In this paper, we compare the imputation performance of different deep learning and machine learning techniques on nine related datasets with missing rates ranging from 10% to 50%. Moreover, since features such as the total liability/equity ratio and earnings per share have different value ranges, the effect of feature normalisation is also examined to see whether normalising the feature values after missing value imputation can improve the performance of the prediction model. The experimental results show that deep neural network techniques do not necessarily perform better than traditional machine learning techniques for missing value imputation. In particular, the random forest imputation model performs best, followed by the k-nearest neighbour method, in terms of AUC rates and type II errors. The improvement in prediction model performance after feature normalisation depends heavily on the chosen classification technique. Specifically, the normalisation step is unnecessary when the random forest classifier is used, whereas deep neural network and support vector machine classifiers with feature normalisation significantly outperform their counterparts without it.
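The two-step procedure described above (impute missing values first, then normalise the features) can be illustrated with a minimal sketch. It is not the authors' implementation. It uses k-nearest neighbour imputation, which the abstract reports as the second-best imputer, because it is compact enough to write with the standard library alone. The choice of min-max scaling is an assumption, since the abstract does not state which normalisation was used. The function names `knn_impute` and `min_max_normalise` and the sample values are hypothetical.

```python
import math


def knn_impute(rows, k=2):
    """Fill each None with the mean of that feature over the k nearest rows.

    Distance is the root mean squared difference over the features that
    both rows have observed, so rows with different gaps stay comparable.
    """
    imputed = [list(row) for row in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for m, other in enumerate(rows):
                # A donor row must be a different row with feature j observed.
                if m == i or other[j] is None:
                    continue
                shared = [(a, b) for a, b in zip(row, other)
                          if a is not None and b is not None]
                if not shared:
                    continue
                dist = math.sqrt(
                    sum((a - b) ** 2 for a, b in shared) / len(shared))
                candidates.append((dist, other[j]))
            candidates.sort(key=lambda pair: pair[0])
            nearest = [v for _, v in candidates[:k]]
            if nearest:
                imputed[i][j] = sum(nearest) / len(nearest)
    return imputed


def min_max_normalise(rows):
    """Rescale every feature (column) to the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


# Hypothetical sample: [total liability/equity ratio, earnings per share].
# None marks a missing value.
samples = [
    [0.5, 2.0],
    [0.6, None],
    [3.0, -1.0],
    [0.4, 3.0],
]

imputed = knn_impute(samples, k=2)       # step 1: fill the gaps
normalised = min_max_normalise(imputed)  # step 2: put features on one scale
```

Normalising after imputation, as in the abstract, means the scaling statistics are computed on complete columns. Whether the second step helps then depends on the downstream classifier, as the abstract notes for random forest versus deep neural network and support vector machine classifiers.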

Keywords

  • Machine learning
  • deep learning
  • feature normalisation
  • financial distress prediction
  • missing value imputation
