
Application of the Information Bottleneck method to discover user profiles in a Web store

Jacek Iwański, Grażyna Suchacka & Grzegorz Chodak
Pages 98-121 | Published online: 26 Mar 2018
 

ABSTRACT

The paper deals with the problem of discovering groups of Web users with similar behavioral patterns on an e-commerce site. We introduce a novel approach to the unsupervised classification of user sessions, based on session attributes related to the user click-stream behavior, to gain insight into the characteristics of various user profiles. The approach uses the agglomerative Information Bottleneck (IB) algorithm. Using log data from a real online store, we validated the efficiency of the approach in terms of its ability to differentiate between buying and non-buying sessions, which indicates some possible practical applications of our method. Experiments performed for a number of session samples showed that the method is capable of separating the two types of sessions to a large extent. A detailed analysis was performed for the number of clusters ranging from two to seven, and the results were compared to those achieved by applying the most common clustering algorithm, k-means. Increasing the number of clusters generally leads to better results for both algorithms. However, IB demonstrated much higher average efficiency than k-means for the corresponding number of clusters, and this superiority was especially clear for lower numbers of clusters. The IB-based division of user sessions into seven clusters yields a mean entropy of 0.28, which corresponds to a 95% separation of the two session types. Furthermore, a major advantage of our approach is that it makes it possible to analyze the probability distribution of session attributes in individual clusters, which allows one to discover hidden knowledge about common characteristics of various user profiles and to use this knowledge to support managerial decisions.
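The link between the reported mean entropy of 0.28 and the 95% separation can be illustrated with a small calculation. The sketch below is not the authors' code: it assumes that entropy is measured in bits (base-2 logarithm) over the buying/non-buying split within each cluster and that the mean is weighted by cluster size, neither of which the abstract states. The function names and the example counts are hypothetical.

```python
import math


def binary_entropy(p):
    """Shannon entropy in bits of a two-class split where one class has fraction p.

    Assumption: base-2 logarithm; the abstract does not state the base.
    """
    if p <= 0.0 or p >= 1.0:
        return 0.0  # a pure cluster carries no uncertainty
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def mean_cluster_entropy(clusters):
    """Size-weighted mean entropy over clusters.

    `clusters` is a list of (buying_count, non_buying_count) pairs.
    Assumption: weighting by cluster size; the abstract does not say
    how the mean is taken.
    """
    total = sum(b + n for b, n in clusters)
    mean = 0.0
    for b, n in clusters:
        size = b + n
        if size == 0:
            continue  # empty clusters contribute nothing
        mean += (size / total) * binary_entropy(b / size)
    return mean


# A cluster in which 95% of sessions share one label has entropy of about 0.286 bits.
print(round(binary_entropy(0.95), 3))

# Two hypothetical clusters, each 95% pure, give the same mean entropy.
print(round(mean_cluster_entropy([(95, 5), (5, 95)]), 3))
```

Under these assumptions, clusters that are 95% pure on average give a mean entropy of roughly 0.286 bits, which is consistent with the rounded figure of 0.28 quoted in the abstract.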

Additional information

Notes on contributors

Jacek Iwański

Jacek Iwański is an assistant professor and the head of the IT section in the Institute of Mathematics and Informatics at the University of Opole, Poland. He received an M.Sc. degree in Physics from the University of Wroclaw, Poland. In 1994 he received a Ph.D. degree in Physics from the University of Hasselt, Belgium. His research interests include artificial intelligence and machine learning methods in real-life applications, embedded systems programming and construction, and sensor networks.

Grażyna Suchacka

Grażyna Suchacka is an assistant professor in the Institute of Mathematics and Informatics at the University of Opole, Poland. She received M.Sc. degrees in Computer Science and in Management from Wroclaw University of Science and Technology, Poland. In 2011 she received a Ph.D. degree in Computer Science with distinction from Wroclaw University of Science and Technology. Dr. Suchacka's research interests include analysis and modeling of Web traffic, Web mining, and Quality of Web Service, with special regard to electronic commerce support and Web bot recognition.

Grzegorz Chodak

Grzegorz Chodak is an associate professor at the Department of Operations Research, Finance and Applications of Computer Science at the Wroclaw University of Science and Technology, Poland. He is the author or co-author of over 70 scientific publications, mainly in the fields of electronic commerce, logistics, data mining, and social media. He specializes in issues related to online stores and the publishing market. He also has practical experience in electronic commerce.
