Categorical data visualization and clustering using subjective factors

Chia Hui Chang, Zhi Kai Ding

Research output: Contribution to journal › Article › peer-review

24 Scopus citations

Abstract

Clustering is an important data mining problem. However, most earlier work on clustering focused on numeric attributes, whose values have a natural ordering. Recently, clustering data with categorical attributes, whose values have no natural ordering, has received more attention. A common issue in cluster analysis is that there is no single correct answer for the number of clusters, since cluster analysis involves subjective human judgement. Interactive visualization is one method by which users can decide on proper clustering parameters. In this paper, a new clustering approach called CDCS (Categorical Data Clustering with Subjective factors) is introduced, together with a visualization tool for clustered categorical data in which the result of adjusting parameters is instantly reflected. The experiments show that CDCS generates high-quality clusters compared to other typical algorithms.

Original language: English
Pages (from-to): 243-262
Number of pages: 20
Journal: Data and Knowledge Engineering
Volume: 53
Issue number: 3
State: Published - Jun 2005

Keywords

  • Categorical data
  • Cluster analysis
  • Cluster visualization
  • Data mining
