Perfect hashing schemes for mining traversal patterns

Chin Chen Chang, Chih Yang Lin, Henry Chou

Research output: Contribution to journal, Review article, peer-review

6 Scopus citations

Abstract

Hashing schemes are a common technique for improving performance when mining not only association rules but also sequential patterns and traversal patterns. However, collisions in hash schemes can cause severe performance degradation. In this paper, we propose perfect hashing schemes for mining traversal patterns that avoid collisions in the hash table. The main idea is to transform each large itemset into one large 2-itemset by employing a delicate encoding scheme. Perfect hash schemes designed only for itemsets of length two, rather than of varied lengths, are then applied. The experimental results show that our method is more than twice as fast as the FS algorithm and that it scales well as the database size grows. One variant of our perfect hash scheme, called partial hash, is proposed to cope with the enormous memory space required by typical perfect hash functions. We also compare the performance of the different perfect hash variants and investigate their properties.
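
The encode-then-hash idea can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not specify the encoding, so the sketch assumes that each large (k-1)-reference sequence receives a dense integer ID and that a candidate k-sequence is represented as the ordered pair (prefix ID, suffix ID). The function names `encode_sequences`, `perfect_hash_pair`, and `count_candidates` are hypothetical, and the row-major pair index stands in for whatever perfect hash function the authors actually use.

```python
def encode_sequences(large_seqs):
    """Assign each large (k-1)-sequence a dense integer ID (assumed encoding)."""
    return {seq: idx for idx, seq in enumerate(sorted(large_seqs))}


def perfect_hash_pair(prefix_id, suffix_id, n):
    """Map an ordered pair of IDs in [0, n) to a unique slot in [0, n*n)."""
    return prefix_id * n + suffix_id


def count_candidates(paths, large_seqs):
    """Count, per slot, how many paths contain each candidate k-sequence."""
    k_minus_1 = len(next(iter(large_seqs)))
    ids = encode_sequences(large_seqs)
    n = len(ids)
    table = [0] * (n * n)  # one slot per possible pair, so no collisions
    for path in paths:
        seen = set()  # count each candidate at most once per path
        for start in range(len(path) - k_minus_1):
            prefix = tuple(path[start:start + k_minus_1])
            suffix = tuple(path[start + 1:start + 1 + k_minus_1])
            if prefix in ids and suffix in ids:
                seen.add(perfect_hash_pair(ids[prefix], ids[suffix], n))
        for slot in seen:
            table[slot] += 1
    return ids, table


# Toy data (made up for illustration): three traversal paths and the
# large 2-reference sequences assumed to have been found in the previous pass.
paths = [("A", "B", "C"), ("A", "B", "C", "D"), ("B", "C", "D")]
large_2 = {("A", "B"), ("B", "C"), ("C", "D")}
ids, table = count_candidates(paths, large_2)
abc_slot = perfect_hash_pair(ids[("A", "B")], ids[("B", "C")], len(ids))
print(table[abc_slot])  # support count of the candidate A -> B -> C
```

Because every pair of IDs owns its own slot, the table needs n * n entries for n large sequences. That quadratic growth is one plausible reading of the memory cost that the partial hash variant is meant to reduce.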

Original language: English
Pages (from-to): 185-202
Number of pages: 18
Journal: Fundamenta Informaticae
Volume: 70
Issue number: 3
State: Published - 2006

Keywords

  • Data mining
  • Perfect hashing
  • Performance analysis
  • Traversal patterns
