Instance selection by genetic-based biological algorithm

Zong Yao Chen, Chih Fong Tsai, William Eberle, Wei Chao Lin, Shih Wen Ke

Research output: Contribution to journal › Article › peer-review

10 Scopus citations

Abstract

Instance selection is an important research problem of data pre-processing in the data mining field. The aim of instance selection is to reduce the data size by filtering out noisy data, which may degrade the mining performance, from a given dataset. Genetic algorithms have provided an effective instance selection approach for improving the performance of data mining algorithms. However, current approaches only pursue the simplest evolutionary process based on the most reasonable and simplest rules. In this paper, we introduce a novel instance selection algorithm, namely a genetic-based biological algorithm (GBA). GBA fits a “biological evolution” into the evolutionary process, where the most streamlined process also complies with the reasonable rules. In other words, after long-term evolution, organisms find the most efficient way to allocate resources and evolve. Consequently, we can closely simulate natural evolution in an algorithm, such that the algorithm will be both efficient and effective. Our experiments compare GBA with five state-of-the-art algorithms over 50 datasets from different domains in the UCI Machine Learning Repository. The experimental results demonstrate that GBA outperforms these baselines, providing the lowest classification error rate and the smallest storage requirement. Moreover, GBA is computationally efficient, requiring only a slightly larger computational cost than a genetic algorithm (GA).
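To make the setting concrete, the following is a minimal sketch of instance selection with a plain genetic algorithm, the kind of baseline the abstract compares against. It is not the authors' GBA, whose biological-evolution mechanisms are not described here. Everything in it is an assumption made for illustration: the function names (`nn_predict`, `fitness`, `select_instances`), the use of a 1-nearest-neighbour classifier, the equal weighting `alpha` between accuracy and reduction rate, and the GA settings (tournament selection, uniform crossover, bit-flip mutation, elitism).

```python
import random

def nn_predict(train, query):
    """Label of the training instance nearest to `query` (1-NN, squared Euclidean)."""
    nearest = min(train, key=lambda inst: sum((a - b) ** 2 for a, b in zip(inst[0], query)))
    return nearest[1]

def fitness(mask, train, valid, alpha=0.5):
    """Assumed fitness: weighted sum of validation accuracy and reduction rate."""
    subset = [inst for inst, keep in zip(train, mask) if keep]
    if not subset:
        return 0.0  # an empty selection cannot classify anything
    correct = sum(nn_predict(subset, x) == y for x, y in valid)
    accuracy = correct / len(valid)
    reduction = 1 - len(subset) / len(train)
    return alpha * accuracy + (1 - alpha) * reduction

def select_instances(train, valid, pop_size=20, generations=30, p_mut=0.05, seed=0):
    """Evolve a boolean mask over `train`; True means the instance is kept."""
    rng = random.Random(seed)
    n = len(train)

    def score(mask):
        return fitness(mask, train, valid)

    # Random initial population of candidate subsets.
    pop = [[rng.random() < 0.5 for _ in range(n)] for _ in range(pop_size)]
    best = max(pop, key=score)
    for _ in range(generations):
        nxt = [best[:]]  # elitism: carry the best mask forward unchanged
        while len(nxt) < pop_size:
            # Binary tournament selection of two parents.
            p1 = max(rng.sample(pop, 2), key=score)
            p2 = max(rng.sample(pop, 2), key=score)
            # Uniform crossover followed by bit-flip mutation.
            child = [a if rng.random() < 0.5 else b for a, b in zip(p1, p2)]
            child = [(not g) if rng.random() < p_mut else g for g in child]
            nxt.append(child)
        pop = nxt
        best = max(pop, key=score)
    return best

if __name__ == "__main__":
    # Toy data: (feature tuple, class label) pairs, invented for the example.
    train = [((0.0, 0.0), 0), ((0.1, 0.2), 0), ((1.0, 1.0), 1), ((0.9, 1.1), 1)]
    valid = [((0.05, 0.1), 0), ((0.95, 1.0), 1)]
    mask = select_instances(train, valid)
    print([inst for inst, keep in zip(train, mask) if keep])
```

The single fitness value combines the two criteria reported in the abstract, classification error and storage requirement. A real experiment would use a proper train/validation split and tuned parameters.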

Original language: English
Pages (from-to): 1269-1282
Number of pages: 14
Journal: Soft Computing
Volume: 19
Issue number: 5
State: Published - May 2015

Keywords

  • Biological evolution
  • Data mining
  • Data reduction
  • Genetic algorithms
  • Instance selection
  • Machine learning
