Cogent Engineering Volume 7, 2020 - Issue 1

Open access

COMPUTER SCIENCE

Neural architectures for gender detection and speaker identification

Orken Mamyrbayev (1 Institute of Information and Computational Technologies, Almaty, Kazakhstan; 2 al-Farabi Kazakh National University, Almaty, Kazakhstan). Correspondence: [email protected]. https://orcid.org/0000-0001-8318-3794

Alymzhan Toleu (1 Institute of Information and Computational Technologies, Almaty, Kazakhstan). https://orcid.org/0000-0001-9246-319X

Gulmira Tolegen (1 Institute of Information and Computational Technologies, Almaty, Kazakhstan)

Nurbapa Mekebayev (1 Institute of Information and Computational Technologies, Almaty, Kazakhstan; 2 al-Farabi Kazakh National University, Almaty, Kazakhstan)

Duc Pham (3 School of Mechanical Engineering, University of Birmingham, Birmingham, UK), Reviewing editor

Article: 1727168 | Received 22 Oct 2019, Accepted 30 Jan 2020, Published online: 11 Feb 2020

https://doi.org/10.1080/23311916.2020.1727168

References

Auer, P., Burgsteiner, H., & Maass, W. (2008, Jun). A learning rule for very simple universal approximators consisting of a single layer of perceptrons. Neural Networks: the Official Journal of the International Neural Network Society, 21(5), 786–795. doi:10.1016/j.neunet.2007.12.036
Collobert, R., & Weston, J.: A unified architecture for natural language processing: Deep neural networks with multitask learning. In: Proceedings of the 25th International Conference on Machine Learning, Helsinki, Finland. pp. 160–167. ICML’08, ACM, New York, NY (2008).
Cunningham, P., & Delany, S. (2007, April). k-nearest neighbour classifiers. Multiple Classifier Systems, 1–17.
Darwiche, A. (2010, December). Bayesian networks. Communications of the ACM, 53(12), 80–90. doi:10.1145/1859204
Dehak, N., Kenny, P. J., Dehak, R., Dumouchel, P., & Ouellet, P. (2010). Front-end factor analysis for speaker verification. IEEE Transactions on Audio, Speech and Language Processing, 19(4), 1–11.
Du, S. S., Lee, J. D., Li, H., Wang, L., & Zhai, X. (2018). Gradient descent finds global minima of deep neural networks. Proceedings of the 36th International Conference on Machine Learning, Long Beach, California, 1–45. CoRR abs/1811.03804.
Freund, Y., & Schapire, R. E. (1999, Dec). Large margin classification using the perceptron algorithm. Machine Learning, 37(3), 277–296. doi:10.1023/A:1007662407062
Girdhar, R., Carreira, J., Doersch, C., & Zisserman, A.: Video action transformer network. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach: California (June 2019).
Ho, T. K.: Random decision forests. In: Proceedings of the Third International Conference on Document Analysis and Recognition. 1, 278–283. ICDAR’95, Montreal, Quebec, Canada, IEEE Computer Society, Washington, DC (1995).
Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. In F. Pereira, C. J. C. Burges, L. Bottou, & K. Q. Weinberger (Eds.), Advances in neural information processing systems, Massachusetts Institute of Technology Press, 25, 1097–1105. Curran Associates, Inc.
Lee, H. S., Tsao, Y., Wang, H. M., & Jeng, S. K.: Clustering-based i-vector formulation for speaker recognition. Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, Singapore, 1101–1105 (January 2014).
Li, C., Ma, X., Jiang, B., Li, X., Zhang, X., Liu, X., … Zhu, Z. (2017). Deep speaker: An end-to-end neural speaker embedding system. Published in ArXiv 2017, CoRR abs/1705.02304.
Li, L., Chen, Y., Shi, Y., Tang, Z., & Wang, D. (2017). Deep speaker feature learning for text-independent speaker verification. Published in ArXiv 2017, CoRR abs/1705.03670.
Lian, H. C., & Lu, B. L. (2006). Multi-view gender classification using local binary patterns and support vector machines. In J. Wang, Z. Yi, J. M. Zurada, B. L. Lu, & H. Yin (Eds.), Advances in neural networks - ISNN 2006 (pp. 202–209). Heidelberg: Springer Berlin Heidelberg, Berlin.
Liew, S. S., Khalil-Hani, M., Radzi, F., & Bakhteri, R. (2016, March). Gender classification: A convolutional neural network approach. Turkish Journal of Electrical Engineering and Computer Sciences, 24, 1248–1264. doi:10.3906/elk-1311-58
Mamyrbayev, O., Turdalyuly, M., Mekebayev, N., Alimhan, K., Kydyrbekova, A., & Turdalykyzy, T. (2019). Automatic recognition of Kazakh speech using deep neural networks. In N. T. Nguyen, F. L. Gaol, T. P. Hong, & B. Trawinski (Eds.), Intelligent information and database systems (pp. 465–474). Cham: Springer International Publishing.
Mamyrbayev, O., Turdalyuly, M., Mekebayev, N., Mukhsina, K., Alimhan, K., BabaAli, B., … Akhmetov, B. (2019, January). Continuous speech recognition of Kazakh language. ITM Web of Conferences, 24, 01012. doi:10.1051/itmconf/20192401012
Naeem, M., Khan, A., Qureshi, S. A., Riaz, N., Zul Kar, S., & Bhutto, A. R. (2013, February). Gender classification with decision trees. International Journal of Signal Processing, Image Processing and Pattern Recognition, 6(1), 165–176.
Pang, B., Zha, K., Cao, H., Shi, C., & Lu, C.: Deep RNN framework for visual sequential applications. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, California, 423–432 (June 2019).
Qu, M., Bengio, Y., & Tang, J. (2019). GMNN: Graph Markov neural networks. Proceedings of the 36th International Conference on Machine Learning, Long Beach, California, PMLR 97, 2019. CoRR abs/1905.06214.
Reynolds, D. A., Quatieri, T. F., & Dunn, R. B. (2000). Speaker verification using adapted Gaussian mixture models. Digital Signal Processing, Academic Press, London, UK, 19–41. doi:10.1006/dspr.1999.0361
Sahidullah, M., & Saha, G. (2012, May). Design, analysis and experimental evaluation of block based transformation in MFCC computation for speaker recognition. Speech Communication, 54, 543–565. doi:10.1016/j.specom.2011.11.004
Schmidhuber, J. (2015). Deep learning in neural networks: An overview. Neural Networks, Manno-Lugano: Switzerland, Published by Elsevier Ltd, 61, 85–117. published online 2014; based on TR arXiv:1404.7828 [cs.NE].
Toleu, A., Tolegen, G., & Makazhanov, A.: Character-aware neural morphological disambiguation. In: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics 2, 666–671. Association for Computational Linguistics, Vancouver, Canada (July 2017a).
Toleu, A., Tolegen, G., & Makazhanov, A.: Character-based deep learning models for token and sentence segmentation. In: 5th International Conference on Turkic Languages Processing (TurkLang 2017). Kazan, Tatarstan, Russian Federation (October 2017b).
Tur, G., Hakkani-Tür, D., & Oflazer, K. (2003, June). A statistical information extraction system for Turkish. Natural Language Engineering, 9(2), 181–210. doi:10.1017/S135132490200284X
Van den Oord, A., Dieleman, S., & Schrauwen, B. (2013). Deep content-based music recommendation. In C. J. C. Burges, L. Bottou, M. Welling, Z. Ghahramani, & K. Q. Weinberger (Eds.), Advances in neural information processing systems, 26, 2643–2651. Curran Associates, Inc.
Villalba, J., Brummer, N., & Dehak, N. (2017, August). Tied variational autoencoder backends for i-vector speaker recognition, INTERSPEECH 2017, August 20–24, 2017, Stockholm, Sweden, 1004–1008.
Yousefi, M., Yousefi, M., Fathi, M., & Fogliatto, F. (2019, October). Patient visit forecasting in an emergency department using a deep neural network approach. Kybernetes Ahead-of-print, 46, 643–651.
