Original Articles

Opening the Knowledge Dam: Speech Recognition for Video Search

