Towards high dimensional instance selection: An evolutionary approach

Chih Fong Tsai, Zong Yao Chen

Research output: Contribution to journal, Article, peer-reviewed

16 Citations (Scopus)


Data reduction is an important data pre-processing step in the knowledge discovery in databases (KDD) process. It can be performed by applying instance selection algorithms that filter out unrepresentative or noisy data from a given (training) dataset. However, the performance of instance selection over very high-dimensional data has not yet been fully examined. In this paper, we introduce a novel efficient genetic algorithm (EGA), which fits "biological evolution" into the evolutionary process. In other words, after long-term evolution, individuals find the most efficient way to allocate resources and evolve. The experimental study is based on four very high-dimensional datasets ranging from 200 to 18,236 dimensions. In addition, four state-of-the-art algorithms, namely IB3, DROP3, ICF, and GA, are compared with EGA. The experimental results show that, with EGA, the k-NN and SVM classifiers achieve classification performance closest to that of the baseline classifiers trained without instance selection. In particular, EGA outperforms the four algorithms in terms of average classification accuracy. Moreover, EGA produces the largest reduction rates (the same as GA) and requires relatively less computational time than the other four algorithms.
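
To make the idea of evolutionary instance selection concrete, the following is a minimal sketch of a generic genetic algorithm that evolves a keep/drop mask over the training instances and scores each mask with a nearest-neighbour classifier. It is not the EGA described in the abstract, whose details are not given here. The fitness weighting `alpha`, the truncation-plus-elitism scheme, the mutation rate, and the helper `make_toy_data` are all assumptions chosen for illustration.

```python
import random
from collections import Counter


def knn_predict(train, query, k=1):
    """Majority vote among the k training points nearest to `query`."""
    nearest = sorted(
        train, key=lambda p: sum((a - b) ** 2 for a, b in zip(p[0], query))
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def fitness(mask, train, valid, alpha=0.9):
    """Weighted sum of validation accuracy and reduction rate (alpha is assumed)."""
    subset = [p for p, keep in zip(train, mask) if keep]
    if not subset:
        return 0.0  # an empty selection cannot classify anything
    accuracy = sum(knn_predict(subset, x) == y for x, y in valid) / len(valid)
    reduction = 1 - len(subset) / len(train)
    return alpha * accuracy + (1 - alpha) * reduction


def select_instances(train, valid, pop_size=20, generations=30, p_mut=0.05, seed=0):
    """Evolve a boolean mask over `train`; True means the instance is kept."""
    rng = random.Random(seed)
    n = len(train)
    pop = [[rng.random() < 0.5 for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=lambda m: fitness(m, train, valid), reverse=True)
        elite = ranked[: pop_size // 2]  # the better half survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n)  # one-point crossover
            child = a[:cut] + b[cut:]
            # flip each gene with probability p_mut
            child = [(not g) if rng.random() < p_mut else g for g in child]
            children.append(child)
        pop = elite + children
    return max(pop, key=lambda m: fitness(m, train, valid))


def make_toy_data(n_per_class=15, seed=1):
    """Hypothetical two-class 2-D data, split into training and validation sets."""
    rng = random.Random(seed)
    points = []
    for label, centre in enumerate((0.0, 3.0)):
        for _ in range(n_per_class):
            points.append(((rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)), label))
    rng.shuffle(points)
    split = (2 * len(points)) // 3
    return points[:split], points[split:]


if __name__ == "__main__":
    train, valid = make_toy_data()
    mask = select_instances(train, valid)
    print("kept", sum(mask), "of", len(train), "training instances")
```

Because the better half of each generation is carried over unchanged, the best fitness in the population never decreases from one generation to the next. On data with thousands of dimensions the distance computation dominates the cost, so a practical implementation would likely cache fitness values or use a vectorised distance routine.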

Pages (from - to): 79-92
Journal: Decision Support Systems
Publication status: Published - May 2014

