ABSTRACT
Social media data are increasingly used by researchers to gain insights into individuals’ behaviors and opinions. Platforms like Twitter provide access to individuals’ postings, networks of friends and followers, and the content to which they are exposed. This article presents the methods and results of an exploratory study to supplement survey data with respondents’ Twitter postings, networks of Twitter friends and followers, and information to which they were exposed about e-cigarettes. Twitter use is important to consider in e-cigarette research and in research on other topics influenced by online information sharing and exposure. Further, Twitter metadata provide direct measures of users’ friends and followers as opposed to survey self-reports. We find that Twitter metadata provide similar information to survey questions on Twitter network size without inducing recall error or other measurement issues. Using sentiment coding and machine learning methods, we find Twitter can elucidate topics that are difficult to measure via surveys, such as opinions expressed online and network composition. We present and discuss models predicting whether respondents tweet positively about e-cigarettes using survey and Twitter data, finding that the combined data provide broader measures than either source alone.
Disclosure statement
No potential conflict of interest was reported by the authors.
Notes on contributors
Joe Murphy is a senior survey methodologist in RTI International’s Survey Research Division. His research focuses on the application of new technologies to improve the quality, relevance, and efficiency of survey research.
Y. Patrick Hsieh is a survey methodologist and digital sociologist in the Program for Research in Survey Methodology (PRISM) in RTI International’s Survey Research Division. His expertise includes developing mixed-method research, designing social media campaigns for sample recruitment, developing traditional and Web-based questionnaires, and integrating digital technologies and technology-enabled research practices into survey methodology to improve research design and data quality.
Michael Wenger is a data scientist in RTI International’s Division for Statistical & Data Sciences. Mr. Wenger uses his expertise in data visualization, data management, and machine learning to help inform decision making in public health, social science, and environmental applications.
Annice E. Kim is a senior social scientist in RTI International’s Public Health Policy Research Program. Her primary research focuses on monitoring and evaluating the impact of tobacco industry marketing practices across media channels, evaluating audiences’ exposure to and engagement with Web-based health campaigns, and tracking public discourse on health issues in traditional news and online media.
Rob Chew is a research data scientist and program manager in RTI’s Center for Data Science. He uses his expertise in machine learning, data visualization, software development, and computational social science to help subject matter experts solve their complex data problems.
Notes
1 See https://developer.twitter.com/en/docs/basics/authentication/overview/oauth for more information.
2 Social media platforms do not provide a reliable method to determine the number of users who actually saw the ad.
3 For more information about Twitter APIs, see https://developer.twitter.com/en/docs