Web Log Preprocessing Based on Partial Ancestral Graph Technique for Session Construction

Authors

S. Chitra (Government Arts College, India) and B. Kalpana (Avinashilingam Institute of Home Science and Higher Education for Women, India)

Abstract

Web access log analysis examines patterns of web site usage and features of user behavior. Raw log data is noisy and ambiguous, so it must be preprocessed before web usage mining can be carried out efficiently. Preprocessing comprises three phases: data cleaning, user identification, and session construction. Session construction is vital: many real-world problems can be modeled as traversals on a graph, and mining these traversals supplies what the preprocessing phase requires. Existing work, however, considers only traversals on unweighted graphs. This paper generalizes this to the case where the vertices of the graph are given weights that reflect their significance. The proposed method constructs sessions as a Partial Ancestral Graph (PAG) containing pages with calculated weights: each page is weighted according to its browsing time, and a PAG structure is then constructed for each user session. This helps site administrators find the pages that interest users and redesign their web pages accordingly. Existing systems have difficulty learning in the presence of latent variables in the data, a problem the proposed method is intended to overcome.
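The idea of weighting pages by browsing time within a session can be illustrated with a minimal sketch. It is not the paper's algorithm: it builds a plain weighted directed graph of page transitions rather than a true PAG, and the 30-minute session timeout, the normalized-time weight, the zero weight for the last page of a session, and the function names `build_sessions` and `weighted_graph` are all assumptions made for illustration.

```python
from collections import defaultdict

SESSION_TIMEOUT = 30 * 60  # seconds; an assumed threshold, not taken from the paper


def build_sessions(requests, timeout=SESSION_TIMEOUT):
    """Split (user, timestamp, page) records into per-user sessions.

    A new session starts whenever the gap between two consecutive
    requests of the same user exceeds `timeout` seconds.
    """
    by_user = defaultdict(list)
    for user, ts, page in sorted(requests, key=lambda r: (r[0], r[1])):
        by_user[user].append((ts, page))

    sessions = []
    for user, visits in by_user.items():
        current = [visits[0]]
        for prev, cur in zip(visits, visits[1:]):
            if cur[0] - prev[0] > timeout:
                sessions.append((user, current))
                current = []
            current.append(cur)
        sessions.append((user, current))
    return sessions


def weighted_graph(visits):
    """Return (page weights, directed edges) for one session.

    Browsing time of a page is approximated by the gap to the next
    request; the last page's time is unknown and counted as 0 (assumption).
    Weights are browsing times normalized to sum to 1 within the session.
    """
    time_on_page = defaultdict(float)
    edges = set()
    for (ts, page), (next_ts, next_page) in zip(visits, visits[1:]):
        time_on_page[page] += next_ts - ts
        if page != next_page:
            edges.add((page, next_page))
    time_on_page[visits[-1][1]] += 0.0  # make sure the last page is present

    total = sum(time_on_page.values())
    weights = {p: (t / total if total else 0.0) for p, t in time_on_page.items()}
    return weights, edges
```

Given cleaned log records as `(user, timestamp_in_seconds, page)` tuples, `build_sessions` returns a list of `(user, visits)` pairs, and `weighted_graph` can then be applied to each `visits` list to obtain the vertex weights and transition edges for that session.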

Keywords

Web Usage Mining, Partial Ancestral Graph (PAG), Session Construction, Directed Acyclic Graph (DAG), Preprocessing, Robots Cleaning

Full Text  Volume 2, Number 4