Research Article

Does accuracy matter? Methodological considerations when using automated speech-to-text for social science research

Pages 661-677 | Received 10 Aug 2021, Accepted 01 Jun 2022, Published online: 17 Jun 2022

