Journal of Management Information Systems Volume 22, 2006 - Issue 4

Original Article

Discovering Cues to Error Detection in Speech Recognition Output: A User-Centered Approach

LINA ZHOU Baltimore County

YONGMEI SHI Baltimore County

DONGSONG ZHANG Baltimore County

ANDREW SEARS Baltimore County

Pages 237-270 | Published online: 08 Dec 2014

https://doi.org/10.2753/MIS0742-1222220409

References

Anderberg, M.R. Cluster Analysis for Applications. New York: Academic Press, 1973.
Arnold, S.C.; Mark, L.; and Goldthwaite, J. Programming by voice, VocalProgramming. In M. Tremaine, E. Cole, and E. Mynatt (eds.), Proceedings of the Fourth International ACM Conference on Assistive Technologies. New York: ACM Press, 2000, pp. 149-155.
Bain, K.; Basson, S.H.; and Wald, M. Speech recognition in university classrooms: Liberated learning project. In V.L. Hanson and J.A. Jacko (eds.), Proceedings of the Fifth International ACM Conference on Assistive Technologies. New York: ACM Press, 2002, pp. 192-196.
Brill, E.; Florian, R.; Henderson, J.C.; and Mangu, L. Beyond n-grams: Can linguistic sophistication improve language modeling? In C. Boitet and P. Whitelock (eds.), Proceedings of the Thirty-Sixth Annual Meeting on Association for Computational Linguistics. Morristown, NJ: Association for Computational Linguistics, 1998, pp. 186-190.
Carpenter, P.; Jin, C.; Wilson, D.; Zhang, R.; Bohus, D.; and Rudnicky, A.I. Is this conversation on track? In P. Dalsgaard, B. Lindberg, H. Benner, and Z. Tan (eds.), Proceedings of the Seventh European Conference on Speech Communication and Technology. Bonn, Germany: International Speech Communication Association, 2001, pp. 2121-2124.
Chase, L. Error-Responsive Feedback Mechanisms for Speech Recognizers. Ph.D. dissertation, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, 1997.
Chase, L. Word and acoustic confidence annotation for large vocabulary speech recognition. In G. Kokkinakis, N. Fakotakis, and E. Dermatas (eds.), Proceedings of the Fifth European Conference on Speech Communication and Technology. Bonn, Germany: International Speech Communication Association, 1997, pp. 815-818.
Deng, L., and Huang, X. Challenges in adopting speech recognition. Communications of the ACM, 47, 1 (January 2004), 69-75.
Duchateau, J.; Demuynck, K.; and Wambacq, P. Confidence scoring based on backward language models. In F.J. Taylor, J. Principe, and H. Bourlard (eds.), 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 1. Los Alamitos, CA: IEEE Computer Society Press, 2002, pp. 221-224.
Ein-Dor, P., and Spiegler, I. Natural language access to multiple databases: A model and a prototype. Journal of Management Information Systems, 12, 1 (Summer 1995), 171-197.
Ericsson, K.A., and Simon, H.A. Protocol Analysis: Verbal Reports as Data. Cambridge, MA: MIT Press, 1993.
Feng, J., and Sears, A. Using confidence scores to improve hands-free speech based navigation in continuous dictation systems. ACM Transactions on Computer-Human Interaction, 11, 4 (December 2004), 329-356.
Furui, S. Automatic speech recognition and its application to information extraction. In R. Dale and K. Church (eds.), Proceedings of the Thirty-Seventh Annual Meeting of the Association for Computational Linguistics. Morristown, NJ: Association for Computational Linguistics, 1999, pp. 11-20.
Gauvain, J.-L., and Lamel, L. Large vocabulary speech recognition based on statistical methods. In W. Chou and B.H. Juang (eds.), Pattern Recognition in Speech and Language Processing. Boca Raton, FL: CRC Press, 2003, pp. 149-189.
Gillick, L.; Ito, Y.; and Young, J. A probabilistic approach to confidence estimation and evaluation. In M.K. Lang and H. Hoge (eds.), 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 2. Los Alamitos, CA: IEEE Computer Society Press, 1997, pp. 879-882.
Hagen, A.; Connors, D.A.; and Pellom, B.L. The analysis and design of architecture systems for speech recognition on modern handheld-computing devices. In R. Gupta and Y. Nakamura (eds.), Proceedings of the First IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis. New York: ACM Press, 2003, pp. 65-70.
Hernandez-Abrego, G., and Marino, J.B. Contextual confidence measures for continuous speech recognition. In H. Abut and L. Onural (eds.), 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 3. Los Alamitos, CA: IEEE Computer Society Press, 2000, pp. 1803-1806.
Higgins, E.L., and Raskind, M.H. Speaking to read: The effects of continuous vs. discrete speech recognition systems on the reading and spelling of children with learning disabilities. Journal of Special Education Technology, 15, 1 (Winter 2000) (available at jset.unlv.edu/15.1/higgins/first.html).
Hoffman, T. Speech recognition powers utility's customer service. ComputerWorld, September 12, 2005 (available at www.computerworld.com/managementtopics/management/helpdesk/story/0,10801,104535,00.html).
Kemp, T., and Schaaf, T. Estimating confidence using word lattices. In G. Kokkinakis, N. Fakotakis, and E. Dermatas (eds.), Proceedings of the Fifth European Conference on Speech Communication and Technology. Bonn, Germany: International Speech Communication Association, 1997, pp. 827-830.
Krahmer, E.; Swerts, M.; Theune, M.; and Weegels, M. Error detection in spoken human-machine interaction. International Journal of Speech Technology, 4, 1 (March 2001), 19-30.
Lai, J., and Vergo, J. MedSpeak: Report creation with continuous speech recognition. In S. Pemberton (ed.), Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. New York: ACM Press, 1997, pp. 431-438.
Levine, H.G., and Rossmoore, D. Diagnosing the human threats to information technology implementation: A missing factor in systems analysis illustrated in a case study. Journal of Management Information Systems, 10, 2 (Fall 1993), 55-74.
Liu, Y. Structural Event Detection for Rich Transcription of Speech. Ph.D. dissertation, School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN, 2004.
Lubert, J.; Kotler, A.; Shein, F.; and Tam, C. Speech recognition. SNOW, Toronto, ON, 1998 (available at snow.utoronto.ca/best/special/speechrecognition.html).
Maison, B., and Gopinath, R. Robust confidence annotation and rejection for continuous speech recognition. In V.J. Mathews and A. Swindlehurst (eds.), 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 1. Los Alamitos, CA: IEEE Computer Society Press, 2001, pp. 389-392.
Mangu, L., and Padmanabhan, M. Error corrective mechanisms for speech recognition. In V.J. Mathews and A. Swindlehurst (eds.), 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 1. Los Alamitos, CA: IEEE Computer Society Press, 2001, pp. 29-32.
Mann, W.C., and Thompson, S.A. Rhetorical structure theory: A theory of text organization. In L. Polanyi (ed.), The Structure of Discourse. Norwood, NJ: Ablex, 1987, pp. 85-96.
Mao, J.-Y., and Benbasat, I. The use of explanations in knowledge-based systems: Cognitive perspectives and a process-tracing analysis. Journal of Management Information Systems, 17, 2 (Fall 2000), 153-180.
McTear, M.F. Spoken dialogue technology: Enabling the conversational user interface. ACM Computing Surveys, 34, 1 (March 2002), 90-169.
Nunamaker, J.F., Jr.; Konsynski, B.R.; Chen, M.; Vinze, A.S.; King, D.R.; and Heltne, M.M. Knowledge-based systems support for information centers. Journal of Management Information Systems, 5, 1 (Summer 1988), 6-24.
Pao, C.; Schmid, P.; and Glass, J. Confidence scoring for speech understanding systems. In R.H. Mannell and J. Robert-Ribes (eds.), Proceedings of the Fifth International Conference on Spoken Language Processing. Canberra: Australian Speech Science and Technology Association, 1998, pp. 815-818.
Pradhan, S.S., and Ward, W.H. Estimating semantic confidence for spoken dialogue systems. In F.J. Taylor, J. Principe, and H. Bourlard (eds.), 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 1. Los Alamitos, CA: IEEE Computer Society Press, 2002, pp. 233-236.
Ringger, E.K., and Allen, J.F. Error correction via a post-processor for continuous speech recognition. In M.H. Hayes and M.A. Clements (eds.), 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 1. Los Alamitos, CA: IEEE Computer Society Press, 1996, pp. 427-430.
Robertson, J.; Wong, W.Y.; Chung, C.; and Kim, D.K. Automatic speech recognition for generalised time based media retrieval and indexing. In W. Effelsberg and B.C. Smith (eds.), Proceedings of the Sixth ACM International Conference on Multimedia. New York: ACM Press, 1998, pp. 241-246.
San-Segundo, R.; Pellom, B.; Hacioglu, K.; Ward, W.; and Pardo, J.M. Confidence measures for spoken dialogue systems. In V.J. Mathews and A. Swindlehurst (eds.), 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 1. Los Alamitos, CA: IEEE Computer Society Press, 2001, pp. 393-396.
Sarikaya, R.; Gao, Y.; and Picheny, M. Word level confidence measurement using semantic features. In W. Siu, A.G. Constantinides, and Y. Chan (eds.), 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 1. Los Alamitos, CA: IEEE Computer Society Press, 2003, pp. 604-607.
Sarma, A., and Palmer, D.D. Context-based speech recognition error detection and correction. In J.B. Hirschberg, S. Dumais, D. Marcu, and S. Roukos (eds.), Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics 2004: Short Papers. East Stroudsburg, PA: Association for Computational Linguistics, 2004, pp. 85-88.
Sears, A.; Feng, J.; Oseitutu, K.; and Karat, C.-M. Hands-free speech-based navigation during dictation: Difficulties, consequences, and solutions. Human-Computer Interaction, 18, 3 (2003), 229-257.
Sears, A.; Karat, C.-M.; Oseitutu, K.; Karimullah, A.; and Feng, J. Productivity, satisfaction, and interaction strategies of individuals with spinal cord injuries and traditional users interacting with speech recognition software. Universal Access in the Information Society, 1, 1 (June 2001), 4-15.
Skantze, G., and Edlund, J. Early error detection on word level. In B. Milner (ed.), Proceedings of COST278 and ISCA Tutorial and Research Workshop on Robustness Issues in Conversational Interaction. Bonn, Germany: International Speech Communication Association, 2004 (available at www.isca-speech.org/archive/robust2004/rob4_17.html).
Suhm, B.; Myers, B.; and Waibel, A. Multimodal error correction for speech user interfaces. ACM Transactions on Computer-Human Interaction, 8, 1 (March 2001), 60-98.
Weintraub, M.; Beaufays, F.; Rivlin, Z.; Konig, Y.; and Stolcke, A. Neural-network based measures of confidence for word recognition. In M.K. Lang and H. Hoge (eds.), 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 2. Los Alamitos, CA: IEEE Computer Society Press, 1997, pp. 887-890.
Wendemuth, A.; Rose, G.; and Dolfing, J.G.A. Advances in confidence measures for large vocabulary. In D. Cochran and A. Spanias (eds.), 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 2. Los Alamitos, CA: IEEE Computer Society Press, 1999, pp. 705-708.
Wessel, F.; Schluter, R.; and Ney, H. Using posterior probabilities for improved speech recognition. In H. Abut and L. Onural (eds.), 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 3. Los Alamitos, CA: IEEE Computer Society Press, 2000, pp. 1587-1590.
Wessel, F.; Schluter, R.; Macherey, K.; and Ney, H. Confidence measures for large vocabulary continuous speech recognition. IEEE Transactions on Speech and Audio Processing, 9, 3 (March 2001), 288-298.
Young, S.R. Detecting misrecognitions and out-of-vocabulary words. In 1994 IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 2. Los Alamitos, CA: IEEE Computer Society Press, 1994, pp. 21-24.
Zhang, D., and Adipat, B. Challenges, methodologies, and issues in the usability testing of mobile applications. International Journal of Human-Computer Interaction, 18, 3 (July 2005), 293-308.
Zhang, R., and Rudnicky, A.I. Word level confidence annotation using combinations of features. In P. Dalsgaard, B. Lindberg, H. Benner, and Z. Tan (eds.), Proceedings of the Seventh European Conference on Speech Communication and Technology. Bonn, Germany: International Speech Communication Association, 2001, pp. 2105-2108.
Zhou, L.; Shi, Y.; Feng, J.; and Sears, A. Data mining for detecting errors in dictation speech recognition. IEEE Transactions on Speech and Audio Processing, 13, 5 (September 2005), 681-688.
Zhou, Z., and Meng, H. A two-level schema for detecting recognition errors. In S.H. Kim, S. Lee, Y. Oh, and Y. Lee (eds.), Proceedings of the Eighth International Conference on Spoken Language Processing. Bonn, Germany: International Speech Communication Association, 2004, pp. 449-452.
