Early versus late dimensionality reduction of bag-of-words feature representation for image classification

Chih Fong Tsai, Ya Han Hu, Wei Chao Lin, Ming Chang Wang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Bag-of-words (BoW) features extracted from images are widely used for image classification. In general, local keypoints are first detected in each image and a keypoint descriptor, such as the scale-invariant feature transform (SIFT), is extracted for each of them. The keypoint descriptors of a given image dataset are then tokenized (or clustered) to generate a visual-word vocabulary (or codebook). Next, the visual-word vector of an image records the occurrence of each visual word in the image, for example the number of keypoints assigned to the corresponding cluster (i.e. visual word). Consequently, images are represented by a histogram over visual words. Since both the SIFT keypoint descriptor and the final BoW feature used for image classification are high-dimensional, this paper examines the effect on classification accuracy of performing dimensionality reduction (DR) on each of these two features. In particular, early DR is applied to the SIFT descriptor and late DR to the BoW feature. The experimental results on the Caltech 101 (2-D images) and ESB (3-D images) datasets show that reducing the dimensionality of the SIFT descriptor by 50% with PCA allows the SVM classifier to perform similarly to the one without DR. On the other hand, late DR works only for 2-D images, and even there the classification performance of SVM cannot be maintained if more than 25% of the dimensionality of the BoW feature is removed.
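
The difference between the two DR placements can be illustrated with a minimal sketch. It is not the authors' implementation: random 128-dimensional vectors stand in for real SIFT descriptors, the vocabulary size (8) and image count (12) are arbitrary, the helpers pca_fit, pca_transform, assign, kmeans and bow_matrix are hypothetical names written for this example, and the SVM classification step is omitted. Only the reduction ratios (50% for early DR, 25% for late DR) are taken from the abstract.

```python
import numpy as np

def pca_fit(X, n_components):
    """Return the mean and the top principal axes of the rows of X."""
    mean = X.mean(axis=0)
    # Rows of vt are principal axes, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def pca_transform(X, mean, components):
    """Project the rows of X onto the given principal axes."""
    return (X - mean) @ components.T

def assign(X, centers):
    """Index of the nearest center (visual word) for each row of X."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def kmeans(X, k, n_iter=10, seed=0):
    """Plain Lloyd's k-means, used here only to build a toy codebook."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        labels = assign(X, centers)
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:
                centers[j] = members.mean(axis=0)
    return centers

def bow_matrix(images, centers):
    """One histogram of visual-word counts per image."""
    k = len(centers)
    return np.array([np.bincount(assign(d, centers), minlength=k) for d in images])

# Assumption: synthetic stand-ins for SIFT descriptors (128-D), 20-39 per image.
rng = np.random.default_rng(42)
images = [rng.normal(size=(int(rng.integers(20, 40)), 128)) for _ in range(12)]
all_desc = np.vstack(images)
K = 8  # toy vocabulary size

# Early DR: reduce each descriptor 128 -> 64 (50%) before building the codebook.
mean_e, comp_e = pca_fit(all_desc, 64)
images_reduced = [pca_transform(d, mean_e, comp_e) for d in images]
codebook_early = kmeans(np.vstack(images_reduced), K)
early_bow = bow_matrix(images_reduced, codebook_early)

# Late DR: build the BoW histograms first, then reduce them 8 -> 6 (25%).
codebook_late = kmeans(all_desc, K)
bow_full = bow_matrix(images, codebook_late).astype(float)
mean_l, comp_l = pca_fit(bow_full, 6)
late_bow = pca_transform(bow_full, mean_l, comp_l)

print(early_bow.shape, late_bow.shape)  # (12, 8) (12, 6)
```

In this sketch, early DR leaves the final feature as a count histogram over the vocabulary, whereas late DR replaces the histogram with its PCA projection, so the two variants hand differently shaped and differently scaled inputs to whatever classifier follows.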

Original language: English
Title of host publication: Proceedings of 2017 International Conference on Bioinformatics Research and Applications, ICBRA 2017
Publisher: Association for Computing Machinery
Pages: 42-45
Number of pages: 4
ISBN (Electronic): 9781450353823
State: Published - 8 Dec 2017
Event: 2017 International Conference on Bioinformatics Research and Applications, ICBRA 2017 - Barcelona, Spain
Duration: 8 Dec 2017 → 10 Dec 2017

Publication series

Name: ACM International Conference Proceeding Series

Conference

Conference: 2017 International Conference on Bioinformatics Research and Applications, ICBRA 2017
Country/Territory: Spain
City: Barcelona
Period: 8/12/17 → 10/12/17

Keywords

  • Bag-of-words
  • Dimensionality reduction
  • Feature selection
  • Image classification
  • Principal component analysis
