Kernel dependency estimation (KDE) is a learning framework for finding dependencies between two general classes of objects. Although it has been applied successfully in many types of applications, its properties have not been fully studied. In this paper, we discuss two practical issues with KDE. The first is that its output for each label is real-valued, which differs from the binary value required by the 1-of-k coding scheme. Thus, a gap usually exists between the real value predicted by KDE and the binary ground-truth value. A common practice for reducing this gap is to apply a thresholding strategy. We provide an alternative approach that adds a second-level classifier, using a special degenerate form of stacked generalization. The second issue is that performance degrades when KDE is applied to classification with skewed data. Our experiments show that standard KDE is not an appropriate approach for skewed data; we then provide a solution for handling such data.
- Kernel dependency estimation
- Skewed data