TY - JOUR
T1 - Co-clustering with augmented matrix
AU - Wu, Meng Lun
AU - Chang, Chia Hui
AU - Liu, Rui Zhe
N1 - Funding Information:
This paper is partially supported by the National Science Council, Taiwan, under grant NSC-100-2628-E-8-012-MY3.
PY - 2013/7
Y1 - 2013/7
N2 - Clustering plays an important role in data mining, as many applications use it as a preprocessing step for data analysis. Traditional clustering groups similar objects, while two-way co-clustering groups dyadic data (objects as well as their attributes) simultaneously. Most co-clustering research focuses on single correlation data, but other descriptions of dyadic data may be available that could improve co-clustering performance. In this research, we extend ITCC (Information Theoretic Co-Clustering) to the problem of co-clustering with an augmented matrix. We propose CCAM (Co-Clustering with Augmented Matrix), which includes this augmented data to improve co-clustering. We apply CCAM to the analysis of on-line advertising, where both ads and users must be clustered. The key data connecting ads and users is the user-ad link matrix, which identifies the ads that each user has linked; both ads and users also have their own feature data, i.e. the augmented matrix. To evaluate the proposed method, we use two measures: classification accuracy and K-L divergence. The experiments use advertisement and user data from Morgenstern, a financial social website that focuses on the advertisement agency. The experimental results show that CCAM outperforms ITCC because it takes the augmented matrix into account during clustering.
KW - Classification evaluation
KW - Co-clustering
KW - Collaborative filtering
UR - http://www.scopus.com/inward/record.url?scp=84878838290&partnerID=8YFLogxK
U2 - 10.1007/s10489-012-0401-9
DO - 10.1007/s10489-012-0401-9
M3 - Journal article
AN - SCOPUS:84878838290
SN - 0924-669X
VL - 39
SP - 153
EP - 164
JO - Applied Intelligence
JF - Applied Intelligence
IS - 1
ER -