Improvement of a Method Based on Hidden Markov Model for Clustering Web Users

Authors

Sadegh Khanpour and Omid Sojoodi, Qazvin Azad University, Iran

Abstract

Determining the dynamics of sequential data in fields such as marketing, finance, the social sciences, and web research has received much attention from researchers. Clustering such data is by nature a challenging task. This paper investigates the applications of different Markov models in web mining and improves an existing method for clustering web users with hidden Markov models. In the first step, the categorical sequences are transformed into a probabilistic space by a hidden Markov model. In the second step, hierarchical clustering is applied and the performance of the clustering process is evaluated with various distance criteria. The paper further shows that, on the well-known Microsoft dataset of website user search patterns, implementing the proposed improvements with symmetric distance measures such as Total-Variance and Mahalanobis yields more clearly separated clusters than the distance used previously with the method (such as Kullback–Leibler).
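The two-step idea (map each categorical sequence to a probability distribution, then cluster the distributions hierarchically under a chosen distance) can be illustrated with a minimal sketch. It is not the authors' implementation, and several parts are assumptions made for brevity: a smoothed bigram (first-order Markov) distribution stands in for the fitted hidden Markov model, "Total-Variance" is read as the total variation distance, the Kullback–Leibler divergence is used in symmetrized form, Mahalanobis is omitted, and the function names and toy sessions are hypothetical.

```python
from itertools import combinations
from math import log


def bigram_distribution(seq, n_symbols, alpha=1.0):
    """Step 1 stand-in: Laplace-smoothed distribution over page-to-page
    transitions (a plain Markov summary, NOT a fitted hidden Markov model)."""
    counts = [alpha] * (n_symbols * n_symbols)
    for a, b in zip(seq, seq[1:]):
        counts[a * n_symbols + b] += 1.0
    total = sum(counts)
    return [c / total for c in counts]


def total_variation(p, q):
    """Total variation distance between two discrete distributions."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def symmetric_kl(p, q):
    """KL(p||q) + KL(q||p); smoothing above keeps every entry positive."""
    return sum((a - b) * log(a / b) for a, b in zip(p, q))


def agglomerative(points, dist, n_clusters):
    """Step 2: average-linkage agglomerative clustering on index lists."""
    clusters = [[i] for i in range(len(points))]

    def linkage(c1, c2):
        pair_sum = sum(dist(points[i], points[j]) for i in c1 for j in c2)
        return pair_sum / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters (i < j) and merge j into i.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical toy sessions over three page categories (0, 1, 2).
sessions = [
    [0, 1, 0, 1, 0, 1],
    [0, 1, 0, 1, 1, 0],
    [2, 2, 2, 2, 2, 2],
    [2, 2, 2, 1, 2, 2],
]
profiles = [bigram_distribution(s, n_symbols=3) for s in sessions]

clusters_tv = agglomerative(profiles, total_variation, n_clusters=2)
clusters_kl = agglomerative(profiles, symmetric_kl, n_clusters=2)
print(clusters_tv)
print(clusters_kl)
```

Passing the distance as a function makes it straightforward to compare how different criteria separate the same set of user profiles, which is the kind of comparison the abstract describes.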

Keywords

Hidden Markov Model, distance metric, agglomerative clustering, categorical time series sequence, probability model

Full Text  Volume 6, Number 2