Talkographics: Measuring TV and Brand Audience Demographics and Interests from User-Generated Content

ABSTRACT

A wide variety of consumer data are available, including data that consumers make available online to the public for free. Because online users talk a lot about TV shows and brands online, these messages can be used to measure TV and brand audiences and help marketers answer questions about who is talking, what they are saying, and where to find them. However, the demographics of online users typically need to be inferred. We propose the use of “Talkographics,” which consists of combining publicly available text data from Twitter and data from the Facebook ads platform and the IMDb database in a novel way that enables group-level prediction of the demographics and interests of a large number of audiences. In addition, we demonstrate that group-level predictions can be used reliably in the context of building affinity networks and propose a recommender system using these Talkographic profiles.

Notes

2. Data were collected in August and September 2012.

3. This is a web interface that allows developers to pull publicly available data for Twitter users: https://developer.twitter.com/en/docs.html. A developer can use these APIs to collect both network and tweet data.

4. Fewer than 0.1% of users have more than 2,000 followers, according to Kwak et al. [Citation47].

5. This corresponds to two Twitter REST API calls per user. Due to time and API constraints, we were unable to collect more tweets for the full set of users.

6. The Facebook terms of service state that these data can be reported at an aggregate level.

7. We evaluated statistical significance by paired t tests for precision at {1, 5, 10, 20, 50, and 100}. The largest p value we observed for all these tests was p = 3.4 × 10 (TV network vs. Talkographic profile for precision at 1). We used the scipy.stats.ttest_rel method from the scipy Python library to perform these tests (scipy version 1.1.0, Python 3.5.5).
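As a minimal sketch of the kind of test described in note 7, the snippet below computes precision at k and runs a paired t test with scipy.stats.ttest_rel. The helper precision_at_k and the per-user scores are hypothetical illustrations, not the authors' code or data; only the use of ttest_rel comes from the note.

```python
import numpy as np
from scipy import stats


def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommended items that appear in the relevant set."""
    top_k = recommended[:k]
    return sum(1 for item in top_k if item in relevant) / k


# Hypothetical example: two of the four recommended shows are relevant.
example_precision = precision_at_k(["a", "b", "c", "d"], {"a", "c"}, k=2)

# Hypothetical per-user precision-at-5 scores for two profile types.
# Each position refers to the same user, which is what makes the test paired.
precision_talkographic = np.array([0.6, 0.4, 0.8, 0.6, 0.4, 0.8, 0.6, 0.6])
precision_tv_network = np.array([0.4, 0.2, 0.6, 0.6, 0.2, 0.4, 0.4, 0.2])

# Paired t test on the per-user differences between the two profile types.
t_stat, p_value = stats.ttest_rel(precision_talkographic, precision_tv_network)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
```

A paired test fits this setting because both profile types are scored on the same users, so the comparison is made on within-user differences rather than on two independent samples.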

8. Network collection lasted from August 3, 2017, to August 25, 2017, due to Twitter API rate limiting.

9. The account username had more than zero followers and was not deactivated or made private.

10. Both mean and median recommended show rank for Talkographic and product network profiles are less than the rank of a randomly shuffled ranking at p < 0.001, according to a bootstrap sampling test of 1,000 samples with 10,000 (user, show) pairs per sample.
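One way such a bootstrap comparison could be set up is sketched below. This is an assumption about the procedure, not the authors' implementation: (user, show) pairs are resampled with replacement, the mean rank under the profile-based ranking is compared with the mean rank under a shuffled ranking in each sample, and the p value is estimated as the share of samples in which the profile-based ranking is not better. The function name, the synthetic ranks, and the reduced sample counts are all hypothetical.

```python
import random
import statistics


def bootstrap_p_value(model_ranks, shuffled_ranks, n_samples=1000,
                      sample_size=10000, seed=0):
    """Estimate how often the model's mean rank is not lower than the shuffled mean rank."""
    rng = random.Random(seed)
    n_pairs = len(model_ranks)
    not_better = 0
    for _ in range(n_samples):
        # Resample (user, show) pair indices with replacement.
        idx = [rng.randrange(n_pairs) for _ in range(sample_size)]
        model_mean = statistics.fmean([model_ranks[i] for i in idx])
        shuffled_mean = statistics.fmean([shuffled_ranks[i] for i in idx])
        if model_mean >= shuffled_mean:
            not_better += 1
    return not_better / n_samples


# Synthetic ranks (hypothetical): the model tends to rank the true show higher
# (a lower rank number) than a random shuffle over 100 candidate shows.
data_rng = random.Random(42)
model_ranks = [data_rng.randint(1, 50) for _ in range(2000)]
shuffled_ranks = [data_rng.randint(1, 100) for _ in range(2000)]

# Smaller sample counts than in note 10, to keep the sketch fast to run.
p = bootstrap_p_value(model_ranks, shuffled_ranks, n_samples=200, sample_size=500)
print(f"estimated p = {p}")
```

The same loop works for the median by swapping statistics.fmean for statistics.median. With 1,000 samples, the smallest nonzero estimate is 0.001, so a count of zero is reported as p < 0.001.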

11. Because of this, we believe that the quality of Talkographic profiles would be improved by detecting and excluding fake accounts or bots from the brand audience.

Additional information

Notes on contributors

Shawndra Hill

SHAWNDRA HILL is a senior researcher in the Computational Social Science Group at Microsoft Research in New York City. She was previously on the faculty of the Operations and Information Management Department at the Wharton School of the University of Pennsylvania. Dr. Hill researches the value to companies of mining data on how consumers interact with each other on online platforms for targeted marketing, advertising, health, and fraud detection purposes. Her current research focuses on the interactions between TV content and online behavior.

Adrian Benton

ADRIAN BENTON is a final-year Ph.D. student at the Center for Language and Speech Processing at Johns Hopkins University. His work spans social media analysis and natural language processing. He currently works on unsupervised user profile learning.

Umberto Panniello

UMBERTO PANNIELLO (corresponding author) is an assistant professor of Management at Politecnico di Bari, Italy, where he also received his Ph.D. in Business Engineering. He was previously a Visiting Scholar at the Wharton School of the University of Pennsylvania. His main research interests are customer modeling, consumer behavior, social TV, and context-aware and profit-based recommender systems.
