
Toward a Better Understanding of News User Journeys: A Markov Chain Approach


ABSTRACT

In recent years, the volume of clickstream and user data collected by news organizations has reached enormous proportions. As a result, news organizations—as well as journalism scholars—face novel methodological challenges to describe and analyze this wealth of information. To move forward, we demonstrate a computational approach to understand the news journeys Web users take to find the news they want to read. We propose the use of Markov chains. These models provide an effective and compact way to discover meaningful patterns in clickstream data. In particular, they capture the sequentiality in news use patterns. We illustrate this approach with an analysis of more than 1 million Web pages, from 175 websites (news websites, search engines, social media), collected over 8 months in 2017/18. The analysis of such data is of high interest to journalism scholars, but can also help news organizations to design sales strategies, provide more personalized content, and find the most effective structure for their website.

Disclosure Statement

No potential conflict of interest was reported by the author(s).

Notes

1 For more information on the Python module, see https://github.com/uvacw/df2markov.

2 Besides tracking their online media use, respondents also filled out an online survey: 48.5% were male, mean age was 47.2 (SD = 19.2), and 15.7% had a low level of education (e.g., primary school), 38.3% had a medium level of education (e.g., college), and 44.6% had a high level of education (e.g., university).

3 To guarantee respondents’ privacy as much as possible, we filtered the raw data to exclude sensitive information. We stored the data in an Elasticsearch database on a server that is not directly accessible to the researchers. Instead, Robout, a Python library, is made available on another secured server to complement Robin. We conducted the analyses using Robout and an Elasticsearch database on the second server so that no sensitive data would leave the environment.

4 Examining the probability of users changing from one website to another website (e.g., social media → tabloid → tabloid → broadsheet) or the probability of users changing from one Web page to another Web page within the same website (e.g., homepage → section page → news article → news article).
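
As an illustration of the transition probabilities described in this note, the following minimal Python sketch estimates first-order transition probabilities from a handful of journeys. The function name `transition_probabilities` and the example journeys are hypothetical and chosen only for illustration. The sketch does not reproduce the interface of the df2markov module, and the data are not taken from the study.

```python
from collections import Counter, defaultdict


def transition_probabilities(journeys):
    """Estimate first-order Markov transition probabilities.

    `journeys` is a list of state sequences (hypothetical input format).
    Returns a nested dict: probs[current][next] = P(next | current).
    """
    counts = defaultdict(Counter)
    for journey in journeys:
        # Count every consecutive (current, next) pair in the journey.
        for current, nxt in zip(journey, journey[1:]):
            counts[current][nxt] += 1

    probs = {}
    for current, followers in counts.items():
        total = sum(followers.values())
        # Normalize the counts so that each row sums to 1.
        probs[current] = {nxt: n / total for nxt, n in followers.items()}
    return probs


# Made-up journeys between website categories (illustrative only).
journeys = [
    ["social media", "tabloid", "tabloid", "broadsheet"],
    ["search engine", "broadsheet", "broadsheet"],
    ["social media", "tabloid", "broadsheet"],
]

probs = transition_probabilities(journeys)
for state, row in probs.items():
    print(state, row)
```

The same function applies unchanged to within-website journeys if the states are page types (homepage, section page, news article) instead of website categories. States that never occur as the starting point of a transition have no row in the result.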

6 We are grateful to an anonymous reviewer for their suggestion.

Additional information

Funding

The research was supported by the Research Priority Area “Personalised Communication” of the University of Amsterdam. The work was carried out on the Dutch national e-infrastructure with the support of SURF Foundation.