Research Article

Improving Error Correction and Text Editing Using Voice and Mouse Multimodal Interface

Received 15 Aug 2023, Accepted 02 May 2024, Published online: 22 May 2024
