Co-clustering with augmented data matrix

Meng Lun Wu, Chia Hui Chang, Rui Zhe Liu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

3 Scopus citations

Abstract

Clustering plays an important role in data mining, as many applications use it as a preprocessing step for data analysis. Traditional clustering focuses on grouping similar objects, while two-way co-clustering groups dyadic data (objects as well as their attributes) simultaneously. Most co-clustering research focuses on a single correlation matrix, but other descriptions of the dyadic data may be available that could improve co-clustering performance. In this research, we extend ITCC (Information Theoretic Co-Clustering) to the problem of co-clustering with an augmented matrix. We propose CCAM (Co-Clustering with Augmented Data Matrix) to include this augmented data for better co-clustering. We apply CCAM to the analysis of on-line advertising, where both ads and users must be clustered. The key data connecting ads and users is the user-ad link matrix, which identifies the ads that each user has linked; both ads and users also have their own feature data, i.e. the augmented data matrix. To evaluate the proposed method, we use two measures: classification accuracy and K-L divergence. The experiment uses advertisement and user data from Morgenstern, a financial social website that focuses on the advertisement agency. The experimental results show that CCAM performs better than ITCC because it considers the augmented data during clustering.
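The K-L divergence measure mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard ITCC formulation, in which a co-clustering of a joint distribution p(x, y) is scored by KL(p || q) with q(x, y) = p(x̂, ŷ) p(x | x̂) p(y | ŷ). The function name `itcc_kl`, the toy user-ad link matrix, and the cluster labels are hypothetical and are not taken from the paper; how CCAM weighs the augmented feature matrices is not stated in the abstract and is not modeled here.

```python
import numpy as np

def itcc_kl(counts, row_labels, col_labels, n_row_clusters, n_col_clusters):
    """KL(p || q) in bits for a fixed co-clustering of a non-negative matrix.

    Assumes every row and every column of `counts` has a positive sum.
    """
    counts = np.asarray(counts, dtype=float)
    row_labels = np.asarray(row_labels)
    col_labels = np.asarray(col_labels)

    p = counts / counts.sum()                    # joint distribution p(x, y)
    R = np.eye(n_row_clusters)[row_labels]       # one-hot row memberships
    C = np.eye(n_col_clusters)[col_labels]       # one-hot column memberships

    p_cc = R.T @ p @ C                           # co-cluster mass p(x_hat, y_hat)
    px, py = p.sum(axis=1), p.sum(axis=0)        # marginals p(x), p(y)
    px_hat, py_hat = R.T @ px, C.T @ py          # cluster marginals

    # q(x, y) = p(x_hat, y_hat) * p(x | x_hat) * p(y | y_hat)
    q = (R @ p_cc @ C.T) * np.outer(px / px_hat[row_labels],
                                    py / py_hat[col_labels])

    mask = p > 0                                 # 0 * log(0 / q) is taken as 0
    return float(np.sum(p[mask] * np.log2(p[mask] / q[mask])))

# Hypothetical user-ad link matrix with two clear blocks (rows: users, columns: ads).
links = [[2, 2, 0, 0],
         [2, 2, 0, 0],
         [0, 0, 3, 3],
         [0, 0, 3, 3]]

# Labels that follow the block structure versus labels that cut across it.
good = itcc_kl(links, [0, 0, 1, 1], [0, 0, 1, 1], 2, 2)
bad = itcc_kl(links, [0, 1, 0, 1], [0, 1, 0, 1], 2, 2)
print(f"aligned labels: {good:.3f} bits, misaligned labels: {bad:.3f} bits")
```

A lower value means the co-clustering loses less of the mutual information between rows and columns. Under the same assumptions, the function could also be applied to a user-feature or ad-feature matrix to score the augmented data separately.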

Original language: English
Title of host publication: Data Warehousing and Knowledge Discovery - 13th International Conference, DaWaK 2011, Proceedings
Pages: 289-300
Number of pages: 12
DOIs
State: Published - 2011
Event: 13th International Conference on Data Warehousing and Knowledge Discovery, DaWaK 2011 - Toulouse, France
Duration: 29 Aug 2011 to 2 Sep 2011

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 6862 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 13th International Conference on Data Warehousing and Knowledge Discovery, DaWaK 2011
Country/Territory: France
City: Toulouse
Period: 29/08/11 to 2/09/11
