TY - JOUR
T1 - On mining multi-time-interval sequential patterns
AU - Hu, Ya Han
AU - Huang, Tony Cheng Kui
AU - Yang, Hui Ru
AU - Chen, Yen Liang
N1 - Funding Information:
The authors would like to thank the Editor and the anonymous reviewers for their valuable comments, which helped improve this paper. This research was supported by the National Science Council of the Republic of China under Grant NSC 97-2410-H-309-020.
PY - 2009/10
Y1 - 2009/10
AB - Sequential pattern mining is essential in many applications, including computational biology, consumer behavior analysis, and web log analysis. Although sequential patterns can tell us which items are frequently purchased together and in what order, they cannot provide information about the time span between items for decision support. Previous studies dealing with this problem either set time constraints to restrict the patterns discovered or define time-intervals between two successive items to provide time information. However, the first approach falls short of providing clear time-interval information, while the second cannot discover time-interval information between two non-successive items in a sequential pattern. To provide more time-related knowledge, we define a new variant of time-interval sequential patterns, called multi-time-interval sequential patterns, which can reveal the time-intervals between all pairs of items in a pattern. We develop two efficient algorithms, MI-Apriori and MI-PrefixSpan, to solve this problem. The experimental results show that the MI-PrefixSpan algorithm is faster than the MI-Apriori algorithm, but the MI-Apriori algorithm scales better on long-sequence data.
KW - Data mining
KW - Knowledge discovery
KW - Multi-time-interval
KW - Sequential pattern
KW - Time-interval
UR - http://www.scopus.com/inward/record.url?scp=69349098637&partnerID=8YFLogxK
U2 - 10.1016/j.datak.2009.05.003
DO - 10.1016/j.datak.2009.05.003
M3 - Journal article
AN - SCOPUS:69349098637
VL - 68
SP - 1112
EP - 1127
JO - Data and Knowledge Engineering
JF - Data and Knowledge Engineering
SN - 0169-023X
IS - 10
ER -