Disentangling User Samples: A Supervised Machine Learning Approach to Proxy-population Mismatch in Twitter Research

References

  • Abokhodair, N., Yoo, D., & McDonald, D. W. (2015). Dissecting a social botnet: Growth, content and influence in Twitter. In Proceedings of the 18th ACM Conference on Computer Supported Cooperative Work & Social Computing (pp. 839–851). New York, NY: ACM. doi:10.1145/2675133.2675208
  • Boyd, D., & Crawford, K. (2012). Critical questions for big data. Information, Communication & Society, 15(5), 662–679.
  • Breiman, L. (1984). Classification and Regression Trees. Boca Raton, FL: Chapman & Hall/CRC.
  • Breiman, L. (2001). Random Forests. Retrieved April 10, 2017, from https://www.stat.berkeley.edu/~breiman/randomforest2001.pdf.
  • Brooks, B., Hogan, B., Ellison, N., Lampe, C., & Vitak, J. (2014). Assessing structural correlates to social capital in Facebook ego networks. Social Networks, 38, 1–15.
  • Burrows, R., & Savage, M. (2014). After the crisis? Big data and the methodological challenges of empirical sociology. Big Data & Society, 1(1). doi:10.1177/2053951714540280
  • Chu, Z., Gianvecchio, S., Wang, H., & Jajodia, S. (2012). Detecting automation of Twitter accounts: Are you a human, bot, or cyborg? IEEE Transactions on Dependable and Secure Computing, 9(6), 811–824.
  • Cohen, R., & Ruth, D. (2013). Classifying political orientation on Twitter: It’s not easy! In Seventh International AAAI Conference on Weblogs and Social Media (pp. 91–99). Cambridge, MA: AAAI Press.
  • Colleoni, E., Rozza, A., & Arvidsson, A. (2014). Echo chamber or public sphere? Predicting political orientation and measuring political homophily in Twitter using big data. Journal of Communication, 64(2), 317–332.
  • Dredze, M., Paul, M. J., Bergsma, S., & Tran, H. (2013, June). Carmen: A Twitter geolocation system with applications to public health. In Papers from the AAAI Workshop on Expanding the Boundaries of Health Informatics Using AI (pp. 20–24). Bellevue, WA: Association for the Advancement of Artificial Intelligence (AAAI).
  • Driscoll, K., & Walker, S. (2014). Big data, big questions, working within a black box: Transparency in the collection and production of big Twitter data. International Journal of Communication, 8, 1745–1764.
  • Emery, S. L., Szczypka, G., Abril, E. P., Kim, Y., & Vera, L. (2014). Are you scared yet? Evaluating fear appeal messages in Tweets about the Tips campaign. Journal of Communication, 64(2), 278–295.
  • Engesser, S., & Humprecht, E. (2015). Frequency or skillfulness. Journalism Studies, 16(4), 513–529.
  • González-Bailón, S., Wang, N., Rivero, A., Borge-Holthoefer, J., & Moreno, Y. (2014). Assessing the bias in samples of large online networks. Social Networks, 38, 16–27.
  • Hargittai, E. (2015). Is bigger always better? Potential biases of big data derived from social network sites. The Annals of the American Academy of Political and Social Science, 659(1), 63–76.
  • Hargittai, E., & Litt, E. (2011). The tweet smell of celebrity success: Explaining variation in Twitter adoption among a diverse group of young adults. New Media & Society, 13(5), 824–842.
  • Hermida, A. (2010). Twittering the news. Journalism Practice, 4(3), 297–308.
  • Himelboim, I., McCreery, S., & Smith, M. (2013). Birds of a feather tweet together: Integrating network and content analyses to examine cross-ideology exposure on Twitter. Journal of Computer-Mediated Communication, 18(2), 40–60.
  • Jackson, S. J., & Foucault Welles, B. (2015). Hijacking #myNYPD: Social media dissent and networked counterpublics. Journal of Communication, 65(6), 932–952.
  • Kwon, K. H., Chadha, M., & Pellizzaro, K. (2017). Proximity and terrorism news in social media: A construal-level theoretical approach to networked framing of terrorism in Twitter. Mass Communication and Society, 20(6), 869–894.
  • Kwon, K. H., Stefanone, M. A., & Barnett, G. A. (2014). Social network influence on online behavioral choices: Exploring group formation on Social Network Sites. American Behavioral Scientist, 58(10), 1345–1360.
  • Lee, K., Eoff, B., & Caverlee, J. (2011). Seven months with the devils: A long-term study of content polluters on Twitter. In Fifth International AAAI Conference on Weblogs and Social Media. Barcelona, Spain.
  • Loh, W. (2011). Classification and Regression Trees. Retrieved from http://www.stat.wisc.edu/~loh/treeprogs/guide/wires11.pdf doi:10.1002/widm.8
  • Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S., & Dean, J. (2013). Distributed representations of words and phrases and their compositionality. Advances in Neural Information Processing Systems, 26, 3111–3119.
  • Morstatter, F., Pfeffer, J., Liu, H., & Carley, K. M. (2013). Is the sample good enough? Comparing data from Twitter’s Streaming API with Twitter’s Firehose. arXiv:1306.5204 [Physics]. Retrieved from http://arxiv.org/abs/1306.5204
  • Murthy, D., Bowman, S., Gross, A. J., & McGarry, M. (2015). Do we tweet differently from our mobile devices? A study of language differences on mobile and web-based Twitter platforms. Journal of Communication, 65(5), 816–837.
  • Papacharissi, Z. (2015). Affective Publics: Sentiment, Technology, and Politics. New York, NY: Oxford University Press.
  • Pennebaker, J. W., Booth, R. J., Boyd, R. L., & Francis, M. E. (2015). Linguistic Inquiry and Word Count: LIWC2015. Austin, TX: Pennebaker Conglomerates. www.LIWC.net
  • Ratkiewicz, J., Conover, M., Miess, M., Goncalves, B., Patil, S., Flammini, A., & Menczer, F. (2011). Truthy: Mapping the spread of astroturf in microblog streams. In WWW 2011 Proceedings of the 20th International Conference Companion on World Wide Web (pp. 249–252). Hyderabad, India: ACM.
  • Ruths, D., & Pfeffer, J. (2014). Social media for large studies of behavior. Science, 346(6213), 1063–1064.
  • Salganik, M. (2017). Bit by Bit: Social Research in the Digital Age. Princeton, NJ: Princeton University Press.
  • Schroeder, R. (2014). Big data and the brave new world of social media research. Big Data & Society, 1(2), 1–11.
  • Shin, J., & Thorson, K. (2017). Partisan selective sharing: The biased diffusion of fact-checking messages on social media. Journal of Communication, 67(2), 233–255. doi:10.1111/jcom.12284
  • Smith, M., Ceni, A., Milic-Frayling, N., Shneiderman, B., Mendes Rodrigues, E., Leskovec, J., & Dunne, C. (2010). NodeXL: A free and open network overview, discovery and exploration add-in for Excel 2007/2010/2013/2016. Social Media Research Foundation. Retrieved from http://nodexl.codeplex.com/
  • Ugander, J., Karrer, B., Backstrom, L., & Marlow, C. (2011). The anatomy of the Facebook social graph. arXiv:1111.4503 [Physics]. Retrieved from http://arxiv.org/abs/1111.4503
  • Vaccari, C., Chadwick, A., & O’Loughlin, B. (2015). Dual screening the political: Media events, social media, and citizen engagement. Journal of Communication, 65(6), 1041–1061.
  • Vargo, C. J., Guo, L., McCombs, M., & Shaw, D. L. (2014). Network issue agendas on Twitter during the 2012 U.S. presidential election. Journal of Communication, 64(2), 296–316.
  • Varol, O., Ferrara, E., Davis, C. A., Menczer, F., & Flammini, A. (2017). Online human-bot interactions: Detection, estimation, and characterization. arXiv:1703.03107 [Cs]. Retrieved from http://arxiv.org/abs/1703.03107
  • Woodford, D., Walker, S., & Paul, A. (2013). Slicing big data - Twitter, gambling and time sensitive information. In Selected Papers of Internet Research 14.0. Denver, CO: Association of Internet Researchers. Retrieved from http://spir.aoir.org/index.php/spir/article/view/914
  • Zhou, W.-X., Sornette, D., Hill, R. A., & Dunbar, R. I. M. (2005). Discrete hierarchical organization of social group sizes. Proceedings of the Royal Society of London B: Biological Sciences, 272(1561), 439–444.
