IETE Journal of Research Volume 69, 2023 - Issue 7

Communications

Efficient Keyword Spotting System Using Deformable Convolutional Network

Huu Binh Nguyen, School of Electrical Engineering, Hanoi University of Science and Technology, Hanoi, Vietnam

Van Hai Duong, School of Electrical Engineering, Hanoi University of Science and Technology, Hanoi, Vietnam

Anh Xuan Tran Thi, School of Electrical Engineering, Hanoi University of Science and Technology, Hanoi, Vietnam

Quoc Cuong Nguyen, School of Electrical Engineering, Hanoi University of Science and Technology, Hanoi, Vietnam (ORCID: https://orcid.org/0000-0002-5362-2968)

Pages 4196-4204 | Published online: 04 Jul 2021

https://doi.org/10.1080/03772063.2021.1946438

REFERENCES

J. R. Rohlicek, W. Russell, S. Roukos, and H. Gish, “Continuous hidden Markov modeling for speaker-independent word spotting,” in International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Glasgow, UK, 1989.
D. Can, and M. Saraclar, “Lattice indexing for spoken term detection,” IEEE Trans. Audio Speech Lang. Process., Vol. 19, pp. 2338–47, 2011.
G. Chen, C. Parada, and G. Heigold, “Small-footprint keyword spotting using deep neural networks,” in 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Florence, Italy, 2014.
A. Coucke, M. Chlieh, T. Gisselbrecht, D. Leroy, M. Poumeyrol, and T. Lavril, “Efficient keyword spotting using dilated convolutions and gating,” in 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brighton, UK, 2019.
J. Dai, H. Qi, Y. Xiong, Y. Li, G. Zhang, H. Hu, and Y. Wei, “Deformable convolutional networks,” in Proceedings of the IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 2017.
C. Shan, J. Zhang, Y. Wang, and L. Xie, “Attention-based end-to-end models for small-footprint keyword spotting,” in Interspeech 2018, Hyderabad, India, 2018.
S. O. Arik, M. Kliegl, R. Child, J. Hestness, A. Gibiansky, C. Fougner, R. Prenger, and A. Coates, “Convolutional recurrent neural networks for small-footprint keyword spotting,” in Interspeech 2017, Stockholm, Sweden, 2017.
S. Hochreiter, and J. Schmidhuber, “Long short-term memory,” Neural Comput., Vol. 9, pp. 1735–80, 1997.
K. Cho, B. van Merrienboer, C. Gulcehre, F. Bougares, H. Schwenk, and Y. Bengio, “Learning phrase representations using RNN encoder-decoder for statistical machine translation,” arXiv preprint arXiv:1406.1078, 2014.
D. Bahdanau, J. Chorowski, D. Serdyuk, P. Brakel, and Y. Bengio, “End-to-end attention-based large vocabulary speech recognition,” in 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Shanghai, China, 2016.
W. Chan, N. Jaitly, Q. Le, and O. Vinyals, “Listen, attend and spell: A neural network for large vocabulary conversational speech recognition,” in 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Shanghai, China, 2016.
C. Shan, J. Zhang, Y. Wang, and L. Xie, “Attention-based end-to-end speech recognition on voice search,” in 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Calgary, AB, Canada, 2018.
F. A. R. R. Chowdhury, Q. Wang, I. L. Moreno, and L. Wan, “Attention-based models for text-dependent speaker verification,” in 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Calgary, AB, Canada, 2018.
D. Snyder, G. Chen, and D. Povey, “MUSAN: A music, speech, and noise corpus,” arXiv preprint arXiv:1510.08484, 2, 2015.
A. van den Oord, S. Dieleman, H. Zen, K. Simonyan, O. Vinyals, A. Graves, N. Kalchbrenner, A. W. Senior, and K. Kavukcuoglu, “WaveNet: A generative model for raw audio,” arXiv preprint arXiv:1609.03499, 2016.
D. Kingma, and J. Ba, “Adam: A method for stochastic optimization,” arXiv preprint arXiv:1412.6980, 2014.
