Attribute-oriented induction (AOI) is a data analysis technique based on induction. The traditional AOI algorithm requires a user-specified threshold that determines the number of output tuples. However, an appropriate tuple threshold is hard to choose, and datasets usually contain noise. The traditional AOI algorithm can only generate a summary of a fixed size and cannot guarantee that every generalized tuple is sufficiently specific and representative. In this article, a new AOI method is proposed to address these shortcomings. We introduce the concept of cost to measure the loss of accuracy caused by attribute ascension. We also propose two algorithms based on hierarchical clustering. By setting a cost constraint on each generalized tuple, our method generates accurate output while eliminating noise, helping users obtain clearer and more informative results.
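To make the threshold-driven behaviour of traditional AOI concrete, the following is a minimal sketch, not the method proposed in the article. It assumes a hypothetical concept hierarchy (`HIERARCHY`), hypothetical helper names (`ascend`, `aoi`), made-up sample rows, and a simple heuristic that ascends the attribute with the most distinct values. The article's cost measure and clustering algorithms are not defined in the abstract, so they are not reproduced here.

```python
from collections import Counter

# Hypothetical concept hierarchies (child value -> parent value), for illustration only.
HIERARCHY = {
    "city": {
        "Taipei": "Taiwan", "Kaohsiung": "Taiwan",
        "Tokyo": "Japan", "Osaka": "Japan",
        "Taiwan": "Asia", "Japan": "Asia",
    },
    "major": {
        "physics": "science", "chemistry": "science",
        "painting": "art", "music": "art",
        "science": "ANY", "art": "ANY",
    },
}


def ascend(rows, attr):
    """Attribute ascension: replace each value of `attr` by its parent concept."""
    parents = HIERARCHY[attr]
    return [{**row, attr: parents.get(row[attr], row[attr])} for row in rows]


def aoi(rows, attrs, threshold):
    """Generalize until at most `threshold` distinct tuples remain.

    Returns a Counter mapping each generalized tuple to its row count.
    """
    rows = [dict(r) for r in rows]
    while True:
        counts = Counter(tuple(r[a] for a in attrs) for r in rows)
        if len(counts) <= threshold:
            return counts
        # Attributes that still have at least one value with a parent concept.
        candidates = [a for a in attrs if any(r[a] in HIERARCHY[a] for r in rows)]
        if not candidates:
            return counts  # nothing left to generalize
        # Assumed heuristic: ascend the attribute with the most distinct values.
        attr = max(candidates, key=lambda a: len({r[a] for r in rows}))
        rows = ascend(rows, attr)


# Made-up sample data.
rows = [
    {"city": "Taipei", "major": "physics"},
    {"city": "Kaohsiung", "major": "chemistry"},
    {"city": "Tokyo", "major": "painting"},
    {"city": "Osaka", "major": "music"},
    {"city": "Taipei", "major": "music"},
]

summary = aoi(rows, ["city", "major"], threshold=3)
for generalized_tuple, count in summary.items():
    print(generalized_tuple, count)
```

With a threshold of 3, the five rows collapse into `("Taiwan", "science")` and `("Japan", "art")` with two rows each, plus `("Taiwan", "art")` with a single row. The output size is fixed by the threshold alone, so a weakly supported tuple such as the last one is kept regardless of how representative it is, which is the limitation the abstract describes.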
|Publication status||Published - May 2021|