On Mining Web Access Logs
Date
1999-10-24
Rights
This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.
Abstract
The proliferation of information on the World Wide Web has made the personalization of this information
space a necessity. One possible approach to web personalization is to mine typical user profiles
from the vast amount of historical data stored in access logs. In the absence of any a priori knowledge,
unsupervised classification or clustering methods seem to be ideally suited to analyze the semi-structured
log data of user accesses. In this paper, we define the notion of a “user session”, as well as a dissimilarity
measure between two web sessions that captures the organization of a web site. To extract a user access
profile, we cluster the user sessions based on the pair-wise dissimilarities using a robust fuzzy clustering
algorithm that we have developed. We report the results of experiments with our algorithm and show that
it extracts interesting user profiles. We also show that it outperforms association-rule-based
approaches for this task.
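
To make the two notions named in the abstract more concrete, here is a minimal Python sketch of grouping log entries into sessions and computing a pairwise session dissimilarity. The abstract does not give the paper's actual definitions, so everything here is an assumption for illustration: the 30-minute timeout, the (host, timestamp, url) entry format, the hypothetical helper names, and the prefix-based URL similarity, which is only one plausible way to reflect how a site is organized. The robust fuzzy clustering step is not shown.

```python
from datetime import datetime, timedelta

# Assumed inactivity threshold; the abstract does not state one.
SESSION_TIMEOUT = timedelta(minutes=30)


def build_sessions(entries, timeout=SESSION_TIMEOUT):
    """Group (host, timestamp, url) entries into sessions (sets of URLs).

    A new session starts when a host is seen for the first time or when
    the gap since that host's previous request exceeds `timeout`.
    """
    sessions = []
    last_seen = {}  # host -> (time of last request, index of its open session)
    for host, ts, url in sorted(entries, key=lambda e: (e[0], e[1])):
        prev = last_seen.get(host)
        if prev is None or ts - prev[0] > timeout:
            sessions.append(set())
            idx = len(sessions) - 1
        else:
            idx = prev[1]
        sessions[idx].add(url)
        last_seen[host] = (ts, idx)
    return sessions


def url_similarity(a, b):
    """Fraction of leading path components two URLs share, in [0, 1]."""
    pa = [p for p in a.split("/") if p]
    pb = [p for p in b.split("/") if p]
    shared = 0
    for x, y in zip(pa, pb):
        if x != y:
            break
        shared += 1
    longest = max(len(pa), len(pb))
    return 1.0 if longest == 0 else shared / longest


def session_dissimilarity(s1, s2):
    """One minus the symmetrised mean best-match URL similarity."""
    if not s1 or not s2:
        return 1.0

    def directed(x, y):
        # For each URL in x, take its closest URL in y, then average.
        return sum(max(url_similarity(u, v) for v in y) for u in x) / len(x)

    return 1.0 - 0.5 * (directed(s1, s2) + directed(s2, s1))
```

Under these assumptions, sessions that visit pages in the same part of the site hierarchy receive a small dissimilarity even when they share no identical URL, and the resulting pairwise matrix is the kind of input a relational clustering algorithm could consume.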