Toward a Better Understanding of News User Journeys: A Markov Chain Approach

Susan Vermeer, Amsterdam School of Communication Research, University of Amsterdam, Amsterdam, The Netherlands (corresponding author)

https://orcid.org/0000-0002-9829-8057

Damian Trilling, Amsterdam School of Communication Research, University of Amsterdam, Amsterdam, The Netherlands

https://orcid.org/0000-0002-2586-0352

ABSTRACT

In recent years, the volume of clickstream and user data collected by news organizations has reached enormous proportions. As a result, news organizations—as well as journalism scholars—face novel methodological challenges to describe and analyze this wealth of information. To move forward, we demonstrate a computational approach to understand the news journeys Web users take to find the news they want to read. We propose the use of Markov chains. These models provide an effective and compact way to discover meaningful patterns in clickstream data. In particular, they capture the sequentiality in news use patterns. We illustrate this approach with an analysis of more than 1 million Web pages, from 175 websites (news websites, search engines, social media), collected over 8 months in 2017/18. The analysis of such data is of high interest to journalism scholars, but can also help news organizations to design sales strategies, provide more personalized content, and find the most effective structure for their website.

Disclosure Statement

No potential conflict of interest was reported by the author(s).

ORCID

Susan Vermeer http://orcid.org/0000-0002-9829-8057

Damian Trilling http://orcid.org/0000-0002-2586-0352

Notes

1 For more information on the Python module, see https://github.com/uvacw/df2markov.

2 In addition to having their online media use tracked, respondents filled out an online survey: 48.5% were male, and the mean age was 47.2 (SD = 19.2); 15.7% had a low level of education (e.g., primary school), 38.3% a medium level (e.g., college), and 44.6% a high level (e.g., university).

3 To protect respondents’ privacy as much as possible, we filtered the raw data to exclude sensitive information. We stored the data in an Elasticsearch database on a server that is not directly accessible to the researchers. Instead, Robout, a Python library, is made available on another secured server to complement Robin. We conducted the analyses using Robout and an Elasticsearch database on the second server, so that no sensitive data would leave the environment.

4 That is, examining either the probability of users moving from one website to another (e.g., social media → tabloid → tabloid → broadsheet) or the probability of users moving from one Web page to another within the same website (e.g., homepage → section page → news article → news article).
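
To make the transition probabilities described in this note concrete, the following is a minimal Python sketch of how they could be estimated for a first-order Markov chain by counting consecutive steps. It is an illustration under stated assumptions only: it does not use or reproduce the df2markov module, the function name `transition_probabilities` is hypothetical, and the three sessions are invented toy data rather than data from the study.

```python
from collections import Counter, defaultdict


def transition_probabilities(sessions):
    """Estimate first-order transition probabilities from visit sequences.

    `sessions` is a list of sequences of state labels (here: website
    categories). Hypothetical helper for illustration only.
    """
    counts = defaultdict(Counter)
    for session in sessions:
        # Pair each state with the state that directly follows it.
        for current, following in zip(session, session[1:]):
            counts[current][following] += 1

    probabilities = {}
    for state, following_counts in counts.items():
        total = sum(following_counts.values())
        # Normalize counts so each state's outgoing probabilities sum to 1.
        probabilities[state] = {
            nxt: n / total for nxt, n in following_counts.items()
        }
    return probabilities


# Invented toy sessions (assumption: one list per browsing session).
sessions = [
    ["social media", "tabloid", "tabloid", "broadsheet"],
    ["search engine", "broadsheet", "broadsheet"],
    ["social media", "tabloid", "broadsheet"],
]

probs = transition_probabilities(sessions)
for state, row in probs.items():
    print(state, row)
```

The same counting scheme applies at the within-website level if the state labels are page types (homepage, section page, news article) instead of website categories. States that never have a successor in the data simply have no outgoing row in this sketch.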

5 For more information, see https://doi.org/10.6084/m9.figshare.7314896.v1.

6 We are grateful to an anonymous reviewer for their suggestion.

Additional information

Funding

The research was supported by the Research Priority Area “Personalised Communication” of the University of Amsterdam. The work was carried out on the Dutch national e-infrastructure with the support of SURF Foundation.
