Mining Closed Sequential Patterns in Large Sequence Databases

Conference: Recent Trends in Information Processing, Computing, Electrical and Electronics
Author(s): Lokendra Shah, Priyanka Chouhan, N. K. Tiwari Year: 2017
Grenze ID: 02.IPCEE.2017.1.9 Page: 50-55

Abstract

Sequential pattern mining is widely studied in the data mining community, and finding sequential patterns is a basic data mining task with broad applications. Among the different types of sequential pattern mining, closed sequential pattern mining is an important technique because it preserves the information of the full pattern set while being more compact. An important goal of knowledge discovery is the search for patterns in the data that can help explain its underlying structure. To be practically useful, the discovered patterns should be novel (unexpected) and easy for humans to understand. We study the problem of mining patterns (defining subpopulations of data instances) that are important for predicting and explaining a specific outcome variable. In this paper, we propose an efficient algorithm, Enhanced CSpan, for mining closed sequential patterns. Like CSpan, it uses a pruning method called occurrence checking that allows the early detection of closed sequential patterns during the mining process. Our extensive performance study on various real and synthetic datasets shows that Enhanced CSpan outperforms CSpan and previously proposed algorithms by an order of magnitude. Our pattern mining method works on complex large sequence databases, such as electronic health records, for the event detection task.
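To make the notion of a closed sequential pattern concrete, the following is a minimal brute-force sketch in Python. It is not CSpan or Enhanced CSpan, and it does not implement occurrence checking; it only illustrates the definition, namely that a frequent pattern is closed when no proper super-sequence has the same support. The function names, the toy database, and the simplification that every sequence element is a single item (rather than an itemset) are assumptions made for illustration.

```python
def is_subsequence(pattern, sequence):
    """Return True if `pattern` occurs in `sequence` in order (gaps allowed)."""
    it = iter(sequence)
    # `item in it` advances the iterator, so the order of items is respected.
    return all(item in it for item in pattern)


def support(pattern, database):
    """Count the sequences in `database` that contain `pattern`."""
    return sum(is_subsequence(pattern, seq) for seq in database)


def frequent_patterns(database, min_support):
    """Enumerate all frequent patterns by naive prefix growth (illustrative only)."""
    if min_support < 1:
        raise ValueError("min_support must be at least 1")
    items = sorted({item for seq in database for item in seq})
    result = {}
    frontier = [()]
    while frontier:
        next_frontier = []
        for prefix in frontier:
            for item in items:
                candidate = prefix + (item,)
                s = support(candidate, database)
                # Support is anti-monotone, so infrequent candidates are not extended.
                if s >= min_support:
                    result[candidate] = s
                    next_frontier.append(candidate)
        frontier = next_frontier
    return result


def closed_patterns(frequent):
    """Keep only patterns with no longer super-sequence of equal support."""
    closed = {}
    for p, s in frequent.items():
        absorbed = any(
            len(q) > len(p) and t == s and is_subsequence(p, q)
            for q, t in frequent.items()
        )
        if not absorbed:
            closed[p] = s
    return closed


# Hypothetical toy database: each string is one sequence of single-item events.
toy_db = ["abc", "abc", "ac"]
toy_frequent = frequent_patterns(toy_db, min_support=2)
toy_closed = closed_patterns(toy_frequent)
```

On this toy database the closed set is much smaller than the full frequent set, yet the support of every frequent pattern can still be recovered as the largest support among the closed patterns that contain it. This is the sense in which the closed set is more compact without losing information. A practical miner would avoid enumerating the full frequent set first, which is the role of pruning in algorithms of the CSpan family.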

