American Journal of Software Engineering. 2014, 2(2), 16-21. DOI:
Abstract: Sequential pattern mining is an important data mining technique that discovers closed frequent subsequences in a sequence database. It is used in a wide spectrum of areas, including bioinformatics, web access traces, and system utilization logs, where the data is naturally in the form of sequences. However, the task is very difficult because the candidate generate-and-test approach produces an explosive number of subsequences. Previous sequential pattern mining algorithms include CloSpan, Sequence Generator, and closed sequence-sequence generator mining (CSGM). In sequential pattern mining and web log mining, the traditional Apriori algorithm is always mentioned, but because of performance issues it has been replaced by other algorithms and techniques. Many different techniques for mining frequent sequential patterns from log data have been proposed in the recent past, but mining web log files still requires an effective and efficient algorithm that works with high performance. Moreover, the algorithm must be validated; for that purpose we have used a traditional algorithm for mining sequential patterns from web log data. Thus, the aim of the present work is to bridge these gaps by developing and proposing a new algorithm, “Sequential ID3”, for sequential pattern mining and validating it experimentally on web log data.