International Journal of Social Research Methodology Volume 26, 2023 - Issue 6

Research Article

Does accuracy matter? Methodological considerations when using automated speech-to-text for social science research

Steven J. Pentland, Information Technology & Supply Chain Management, College of Business & Economics, Boise State University, Boise, ID, USA. Correspondence: [email protected]

Christie M. Fuller, Information Technology & Supply Chain Management, College of Business & Economics, Boise State University, Boise, ID, USA

Lee A. Spitzley, Information Security and Digital Forensics, School of Business, University at Albany, Albany, NY, USA

Douglas P. Twitchell, Information Technology & Supply Chain Management, College of Business & Economics, Boise State University, Boise, ID, USA

Pages 661-677 | Received 10 Aug 2021, Accepted 01 Jun 2022, Published online: 17 Jun 2022

https://doi.org/10.1080/13645579.2022.2087849

References

Abe, J. A. A. (2020). Big five, linguistic styles, and successful online learning. The Internet and Higher Education, 45, 100724. https://doi.org/10.1016/j.iheduc.2019.100724
Agarwal, S., Godbole, S., Punjani, D., & Roy, S. (2007). How much noise is too much: A study in automatic text classification. Seventh IEEE International Conference on Data Mining (ICDM 2007), 3–12. https://doi.org/10.1109/ICDM.2007.21
Agarwal, R., & Dhar, V. (2014). Editorial—big data, data science, and analytics: The opportunity and challenge for IS research. Information Systems Research, 25(3), 443–448. https://doi.org/10.1287/isre.2014.0546
Araújo, C. S., Magno, G., Meira, W., Almeida, V., Hartung, P., & Doneda, D. (2017). Characterizing videos, audience and advertising in YouTube channels for kids. In G. L. Ciampaglia, A. Mashhadi, & T. Yasseri (Eds.), Social Informatics (pp. 341–359). Springer International Publishing.
Bayram, U., & Benhiba, L. (2021). Determining a person’s suicide risk by voting on the short-term history of tweets for the CLPsych 2021 shared task. Proceedings of the Seventh Workshop on Computational Linguistics and Clinical Psychology: Improving Access, 81–86. https://doi.org/10.18653/v1/2021.clpsych-1.8
Bazillon, T., Esteve, Y., & Luzzati, D. (2008). Manual vs assisted transcription of prepared and spontaneous speech. LREC.
Bird, S., Loper, E., & Klein, E. (2009). Natural Language Processing with Python. O'Reilly Media Inc.
Burgoon, J. K. (2018). Predicting veracity from linguistic indicators. Journal of Language and Social Psychology, 37(6), 603–631. https://doi.org/10.1177/0261927X18784119
Cable, D. M., & Judge, T. A. (1997). Interviewers’ perceptions of person–organization fit and organizational selection decisions. Journal of Applied Psychology, 82(4), 546–561. https://doi.org/10.1037/0021-9010.82.4.546
Chatterjee, S., Goyal, D., Prakash, A., & Sharma, J. (2020). Exploring healthcare/health-product ecommerce satisfaction: A text mining and machine learning application. Journal of Business Research, 131, 815–825. https://doi.org/10.1016/j.jbusres.2020.10.043
Cheng, J., Bernstein, M., Danescu-Niculescu-Mizil, C., & Leskovec, J. (2017). Anyone can become a troll: Causes of trolling behavior in online discussions. Proceedings of the 2017 ACM Conference on Computer Supported Cooperative Work and Social Computing, 1217–1230. https://doi.org/10.1145/2998181.2998213
Crowston, K., Allen, E. E., & Heckman, R. (2012). Using natural language processing technology for qualitative data analysis. International Journal of Social Research Methodology, 15(6), 523–543. https://doi.org/10.1080/13645579.2011.625764
Dempster, P. G., & Woods, D. K. (2011). The economic crisis though the eyes of Transana. Forum Qualitative Sozialforschung/Forum: Qualitative Social Research, 12(1). https://doi.org/10.17169/FQS-12.1.1515
DePaulo, B. M., Lindsay, J. J., Malone, B. E., Muhlenbruck, L., Charlton, K., & Cooper, H. (2003). Cues to deception. Psychological Bulletin, 129(1), 74–118. https://doi.org/10.1037/0033-2909.129.1.74
Dernoncourt, F., Bui, T., & Chang, W. (2018). A framework for speech recognition benchmarking. Proceedings of Interspeech 2018, 169–170.
Dorn, B., Dunbar, N. E., Burgoon, J. K., Nunamaker, J. F., Giles, M., Walls, B., Chen, X., Wang, X. (Rebecca), Ge, S. (Tina), & Subrahmanian, V. S. (2021). A system for multi-person, multi-modal data collection in behavioral information systems. In V. S. Subrahmanian, J. K. Burgoon, & N. E. Dunbar (Eds.), Detecting trust and deception in group interaction (pp. 57–73). Springer International Publishing. https://doi.org/10.1007/978-3-030-54383-9_4
Foley, K. A., MacGeorge, E. L., Brinker, D. L., Li, Y., & Zhou, Y. (2020). Health providers’ advising on symptom management for upper respiratory tract infections: Does elaboration of reasoning influence outcomes relevant to antibiotic stewardship? Journal of Language and Social Psychology, 39(3), 349–374. https://doi.org/10.1177/0261927X20912460
Fuller, C. M., Biros, D. P., & Wilson, R. L. (2009). Decision support for determining veracity via linguistic-based cues. Decision Support Systems, 46(3), 695–703. https://doi.org/10.1016/j.dss.2008.11.001
Glaser, A. (2017). Google’s ability to understand language is nearly equivalent to humans. https://www.recode.net/2017/5/31/15720118/google-understand-language-speech-equivalent-humans-code-conference-mary-meeker
Ho, S. M., Hancock, J. T., & Booth, C. (2017). Ethical dilemma: Deception dynamics in computer-mediated group communication. Journal of the Association for Information Science and Technology, 68(12), 2729–2742. https://doi.org/10.1002/asi.23849
Holtzman, N. S., Tackman, A. M., Carey, A. L., Brucks, M. S., Küfner, A. C. P., Deters, F. G., Back, M. D., Donnellan, M. B., Pennebaker, J. W., Sherman, R. A., & Mehl, M. R. (2019). Linguistic markers of grandiose narcissism: A LIWC analysis of 15 samples. Journal of Language and Social Psychology, 38(5–6), 773–786. https://doi.org/10.1177/0261927X19871084
Humă, B., Stokoe, E., & Sikveland, R. O. (2019). Persuasive conduct: Alignment and resistance in prospecting “cold” calls. Journal of Language and Social Psychology, 38(1), 33–60. https://doi.org/10.1177/0261927X18783474
Hung, Y.-C., & Guan, C. (2020). Winning box office with the right movie synopsis. European Journal of Marketing, 54(3), 594–614. https://doi.org/10.1108/EJM-01-2019-0096
Këpuska, V., & Bohouta, G. (2017). Comparing speech recognition systems (Microsoft API, Google API and CMU Sphinx). Int. J. Eng. Res. Appl, 7(3), 20–24. https://doi.org/10.9790/9622-0703022024
Lin, M., Lucas, H. C., & Shmueli, G. (2013). Research commentary—too big to fail: Large samples and the p-value problem. Information Systems Research, 24(4), 906–917. https://doi.org/10.1287/isre.2013.0480
Okdie, B. M., & Rempala, D. M. (2019). Brief textual indicators of political orientation. Journal of Language and Social Psychology, 38(1), 106–125. https://doi.org/10.1177/0261927X18762973
Ott, M., Choi, Y., Cardie, C., & Hancock, J. T. (2011). Finding Deceptive Opinion Spam by Any Stretch of the Imagination. Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, 309–319, Portland, Oregon, USA.
Pan, L., McNamara, G., Lee, J. J., Haleblian, J. (John), & Devers, C. E. (2018). Give it to us straight (most of the time): Top managers’ use of concrete language and its effect on investor reactions. Strategic Management Journal, 39(8), 2204–2225. https://doi.org/10.1002/smj.2733
Park, G., Schwartz, H. A., Eichstaedt, J. C., Kern, M. L., Kosinski, M., Stillwell, D. J., Ungar, L. H., & Seligman, M. E. P. (2015). Automatic personality assessment through social media language. Journal of Personality and Social Psychology, 108(6), 934–952. https://doi.org/10.1037/pspp0000020
Paulus, T., Lester, J., & Dempster, P. (2014). Digital tools for qualitative research. SAGE Publications Ltd. https://doi.org/10.4135/9781473957671
Pennebaker, J. W., Booth, R. J., Boyd, R. L., & Francis, M. E. (2015). Linguistic inquiry and word count: LIWC2015.
Prati, F., Menegatti, M., Moscatelli, S., Kana Kenfack, C. S., Pireddu, S., Crocetti, E., Mariani, M. G., & Rubini, M. (2019). Are mixed-gender committees less biased toward female and male candidates? An investigation of competence-, morality-, and sociability-related terms in performance appraisal. Journal of Language and Social Psychology, 38(5–6), 586–605. https://doi.org/10.1177/0261927X19844808
Proyer, R. T., & Brauer, K. (2018). Exploring adult playfulness: Examining the accuracy of personality judgments at zero-acquaintance and an LIWC analysis of textual information. Journal of Research in Personality, 73, 12–20. https://doi.org/10.1016/j.jrp.2017.10.002
Rev.com. (n.d.). How long does it take to transcribe one hour of audio or video? https://www.rev.com/blog/resources/how-long-does-it-take-to-transcribe-audio-video
Saon, G. (2017). Reaching new records in speech recognition. IBM. https://www.ibm.com/blogs/watson/2017/03/reaching-new-records-in-speech-recognition/
Souri, A., Hosseinpour, S., & Rahmani, A. M. (2018). Personality classification based on profiles of social networks’ users and the five-factor model of personality. Human-Centric Computing and Information Sciences, 8(1), 24. https://doi.org/10.1186/s13673-018-0147-4
Spencer-Oatey, H., & Wang, J. (2019). Culture, context, and concerns about face: Synergistic insights from pragmatics and social psychology. Journal of Language and Social Psychology, 38(4), 423–440. https://doi.org/10.1177/0261927X19865293
Twyman, N. W., Pentland, S. J., & Spitzley, L. (2020). Design principles for signal detection in modern job application systems: Identifying fabricated qualifications. Journal of Management Information Systems, 37(3), 849–874. https://doi.org/10.1080/07421222.2020.1790201
Walker, D., Lund, W. B., & Ringger, E. (2010). Evaluating models of latent document semantics in the presence of OCR errors. Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, 240–250.
Wang, P., Yan, M., Zhan, X., Tian, M., Si, Y., Sun, Y., Jiao, L., & Wu, X. (2021). Predicting self-reported proactive personality classification with weibo text and short answer text. IEEE Access, 9, 77203–77211. https://doi.org/10.1109/ACCESS.2021.3078052
Zhou, Y., & Fan, Y. (2013). A sociolinguistic study of American slang. Theory and Practice in Language Studies, 3(12), 2209. https://doi.org/10.4304/tpls.3.12.2209-2213
Ziemer, K. S., & Korkmaz, G. (2017). Using text to predict psychological and physical health: A comparison of human raters and computerized text analysis. Computers in Human Behavior, 76, 122–127. https://doi.org/10.1016/j.chb.2017.06.038
