A sampling-based method for mining frequent patterns from databases

Yen Liang Chen, Chin Yuan Ho

Research output: Contribution to journal › Conference article › peer-review

6 Scopus citations

Abstract

Mining frequent item sets (frequent patterns) in transaction databases is a well-known problem in data mining research. This work proposes a sampling-based method to find frequent patterns. The proposed method consists of three phases. In the first phase, we draw a small sample of the data to estimate the set of frequent patterns, denoted FS. The second phase computes the actual supports of the patterns in FS and identifies the subset of patterns in FS that needs further examination in the next phase. Finally, the third phase explores this subset and finds all missing frequent patterns. The empirical results show that our algorithm is efficient, about two to three times faster than the well-known FP-growth algorithm.
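
The three phases can be illustrated with a minimal Python sketch. The abstract does not give the authors' data structures, sampling threshold, or phase-three procedure, so everything below is an assumption made for illustration: the names `support_counts`, `mine_levelwise`, `three_phase_mine`, and the parameters `sample_size`, `slack`, and `seed` are hypothetical, the sample is mined with a plain level-wise (Apriori-style) miner, and phase three is replaced by a generic level-wise search that extends the verified patterns until no uncounted candidate remains. It should not be read as the published algorithm, and it makes no attempt to reproduce the reported speed-up.

```python
import random
from itertools import combinations


def support_counts(transactions, candidates):
    """Count how many transactions (frozensets) contain each candidate itemset."""
    return {c: sum(1 for t in transactions if c <= t) for c in candidates}


def all_subsets_in(candidate, known):
    """True if every (k-1)-subset of a k-itemset is in `known` (singletons pass)."""
    if len(candidate) == 1:
        return True
    return all(frozenset(s) in known
               for s in combinations(candidate, len(candidate) - 1))


def mine_levelwise(transactions, min_count):
    """Plain Apriori-style miner; used here only on the small sample."""
    transactions = [frozenset(t) for t in transactions]
    level = {frozenset([i]) for t in transactions for i in t}
    frequent = set()
    while level:
        counts = support_counts(transactions, level)
        kept = {c for c, n in counts.items() if n >= min_count}
        frequent |= kept
        # Join step: merge pairs that differ in exactly one item.
        joined = {a | b for a in kept for b in kept if len(a | b) == len(a) + 1}
        # Prune step: keep candidates whose smaller subsets are all frequent.
        level = {c for c in joined if all_subsets_in(c, kept)}
    return frequent


def three_phase_mine(transactions, min_support, sample_size, slack=0.8, seed=0):
    """Sketch of a sample / verify / repair scheme (assumed, not the paper's)."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    min_count = min_support * n

    # Phase 1: estimate FS from a random sample, using a lowered threshold
    # (the `slack` factor is an assumption) to reduce missed patterns.
    rng = random.Random(seed)
    sample = rng.sample(transactions, min(sample_size, n))
    fs = mine_levelwise(sample, slack * min_support * len(sample))

    # Phase 2: compute the actual supports of FS on the full database.
    counts = support_counts(transactions, fs)
    frequent = {p for p, c in counts.items() if c >= min_count}
    counted = set(fs)

    # Phase 3 (stand-in): repeatedly count uncounted itemsets whose smaller
    # subsets are all verified frequent, until no such candidate is left.
    items = {i for t in transactions for i in t}
    while True:
        candidates = {frozenset([i]) for i in items}
        candidates |= {p | {i} for p in frequent for i in items if i not in p}
        candidates = {c for c in candidates - counted
                      if all_subsets_in(c, frequent)}
        if not candidates:
            return frequent
        counted |= candidates
        new_counts = support_counts(transactions, candidates)
        frequent |= {c for c, k in new_counts.items() if k >= min_count}
```

Because the repair loop only stops when every itemset bordering the verified set has been counted on the full database, the sketch returns the same patterns as mining the whole database directly, whatever the sample happens to contain; a poor sample only costs extra counting passes.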

Original language: English
Pages (from-to): 536-545
Number of pages: 10
Journal: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 3614
Issue number: PART II
State: Published - 2005
Event: Second International Conference on Fuzzy Systems and Knowledge Discovery, FSKD 2005 - Changsha, China
Duration: 27 Aug 2005 to 29 Aug 2005
