TY - JOUR

T1 - Discovering fuzzy time-interval sequential patterns in sequence databases

AU - Chen, Yen Liang

AU - Huang, Tony Cheng Kui

N1 - Funding Information:
Manuscript received January 12, 2004; revised May 25, 2004 and December 24, 2004. This work was supported in part by the Ministry of Education (MOE) Program for Promoting Academic Excellence of Universities under Grant 91-H-FA07-1-4. This paper was recommended by Associate Editor D. Nauck.

PY - 2005/10

Y1 - 2005/10

N2 - Given a sequence database and a minimum support threshold, the task of sequential pattern mining is to discover the complete set of sequential patterns in the database. From the discovered sequential patterns, we can know what items are frequently brought together and in what order they appear. However, they cannot tell us the time gaps between successive items in patterns. Accordingly, Chen et al. have proposed a generalization of sequential patterns, called time-interval sequential patterns, which reveals not only the order of items, but also the time intervals between successive items [9]. An example of a time-interval sequential pattern has the form (A, I2, B, I1, C), meaning that we buy A first, then after an interval of I2 we buy B, and finally after an interval of I1 we buy C, where I2 and I1 are predetermined time ranges. Although this new type of pattern can alleviate the above concern, it causes the sharp boundary problem. That is, when a time interval is near the boundary of two predetermined time ranges, we either ignore or overemphasize it. Therefore, this paper uses the concept of fuzzy sets to extend the original research so that fuzzy time-interval sequential patterns are discovered from databases. Two efficient algorithms, the fuzzy time interval (FTI)-Apriori algorithm and the FTI-PrefixSpan algorithm, are developed for mining fuzzy time-interval sequential patterns. In our simulation results, we find that the second algorithm outperforms the first one, not only in computing time but also in scalability with respect to various parameters.

KW - Data mining

KW - Fuzzy sets

KW - Sequence data

KW - Sequential patterns

KW - Time interval

UR - http://www.scopus.com/inward/record.url?scp=26844495651&partnerID=8YFLogxK

U2 - 10.1109/TSMCB.2005.847741

DO - 10.1109/TSMCB.2005.847741

M3 - Journal article

C2 - 16240771

AN - SCOPUS:26844495651

SN - 1083-4419

VL - 35

SP - 959

EP - 972

JO - IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics

JF - IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics

IS - 5

ER -