Abstract
Clustering plays an important role in data mining, where many applications use it as a preprocessing step for data analysis. Traditional clustering groups similar objects, while two-way co-clustering groups dyadic data (objects as well as their attributes) simultaneously. In this research, we apply two-way co-clustering to the analysis of online advertising, where both ads and users need to be clustered. In addition to the ad-user link matrix, which records the ads that each user has linked, two further matrices are available that carry extra information about users and ads. In this paper, we propose a three-stage clustering method that makes use of all three data matrices to enhance clustering performance. An Iterative Cross Co-Clustering (ICCC) algorithm is also proposed for two-way co-clustering. The experiment is performed using advertisement and user data from Morgenstern, a financial social website that focuses on the agency of advertisements. The results show that iterative cross co-clustering provides better performance than traditional clustering and completes the task more efficiently.
Field | Value |
---|---|
Original language | English |
Pages (from-to) | 83-97 |
Number of pages | 15 |
Journal | Journal of Information Science and Engineering |
Volume | 28 |
Issue number | 1 |
State | Published - Jan 2012 |
Keywords
- Clustering evaluation
- Co-clustering
- Decision tree
- Dyadic data analysis
- KL divergence