Feature selection is an important pre-processing task for high dimension, low sample size (HDLSS) data. In the literature, ensemble learning has been applied to improve single feature selection methods, yielding so-called ensemble feature selection techniques. The most widely used approach combines multiple feature selection methods and aggregates their selection results via some aggregation function in a parallel manner. An alternative ensemble strategy is the serial combination approach, in which the selection results of a first feature selection stage serve as the input to a second stage that produces the final output. The aim of this paper is to fully explore the performance of the parallel and serial combination approaches for ensemble feature selection over HDLSS data. In particular, we address two research questions: whether parallel- and serial-based ensemble feature selection can outperform single feature selection, and which combination approach is the better choice for ensemble feature selection. Experimental results comparing nine parallel and nine serial combinations against three single baseline feature selection methods, namely principal component analysis (PCA), the genetic algorithm (GA), and the C4.5 decision tree, show that ensemble feature selection achieves higher classification accuracy than single feature selection. However, there are no significant differences in performance between the best single baseline method (i.e., GA) and the top three parallel and serial combinations. On the other hand, the serial combination approach produces the largest feature reduction rate.
- Data mining
- Ensemble learning
- Feature selection
- High dimension low sample size
- Machine learning
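To illustrate the two combination strategies contrasted in the abstract, the following is a minimal, hypothetical sketch: it is not the paper's implementation, and the two ranking criteria (a variance score and an absolute correlation score) are illustrative stand-ins for the feature selection methods the paper actually evaluates. The parallel combination runs both selectors independently and aggregates their rankings; the serial combination feeds the first stage's surviving features into the second stage.

```python
# Illustrative sketch only: scoring functions and function names are
# assumptions, not the paper's methods or code.

def variance_score(column):
    # Spread of a feature's values; constant features score 0.
    mean = sum(column) / len(column)
    return sum((x - mean) ** 2 for x in column) / len(column)

def corr_score(column, labels):
    # Absolute Pearson correlation between a feature and the class label.
    n = len(column)
    mx, my = sum(column) / n, sum(labels) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(column, labels))
    vx = sum((x - mx) ** 2 for x in column) ** 0.5
    vy = sum((y - my) ** 2 for y in labels) ** 0.5
    return abs(cov / (vx * vy)) if vx and vy else 0.0

def rank_features(scores):
    # Convert scores to ranks: 0 = best (highest score).
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0] * len(scores)
    for r, i in enumerate(order):
        ranks[i] = r
    return ranks

def parallel_select(X, y, k):
    # Parallel combination: run both selectors on the full feature set,
    # then aggregate their rankings (here: mean rank) and keep the top k.
    cols = list(zip(*X))
    r1 = rank_features([variance_score(c) for c in cols])
    r2 = rank_features([corr_score(c, y) for c in cols])
    mean_rank = [(a + b) / 2 for a, b in zip(r1, r2)]
    return sorted(sorted(range(len(cols)), key=lambda i: mean_rank[i])[:k])

def serial_select(X, y, k1, k2):
    # Serial combination: stage 1 reduces the feature set to k1 features,
    # and stage 2 selects k2 features from that reduced set.
    cols = list(zip(*X))
    s1 = [variance_score(c) for c in cols]
    stage1 = sorted(range(len(cols)), key=lambda i: -s1[i])[:k1]
    s2 = [corr_score(cols[i], y) for i in stage1]
    stage2 = sorted(range(len(stage1)), key=lambda j: -s2[j])[:k2]
    return sorted(stage1[j] for j in stage2)
```

Because the serial pipeline applies its second cut only to the survivors of the first, it tends to discard more features overall, which is consistent with the abstract's observation that the serial approach yields the largest feature reduction rate.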