Modified layer deep convolution neural network for text-independent speaker recognition

V Karthikeyana Department of Electronics and Communication Engineering, Kalasalingam Institute of Technology, Krishnankoil, Tamilnadu, IndiaCorrespondence[email protected]

Suja Priyadharsini Sb Department of Electronics and Communication Engineering, Anna University Regional Campus-Tirunelveli, Tirunelveli, Tamilnadu, India

https://orcid.org/0000-0002-3926-5263

ABSTRACT

Speaker recognition is the task of identifying the spokesman automatically using speaker-specific features. It has been a popular and most involved topic in the field of speech technology. This field opens a wide opportunity for research and finds its application in the areas such as forensics, authentication, security, etc. In this work, a modified deep-convolutional neural network structure has been proposed for speaker identification that has improved convolution, activation, and pooling layers along with Adam’s optimiser. The proposed architecture yielded the increase of prediction accuracy and reduction of Loss function when compared to the generic Convolutional Neural Network scheme. The execution of the proposed architecture is validated by various datasets and the outcomes show that the modified CNN performs better than the other state-of-the-art models regarding both accuracy (avg 99%) and loss function (avg 1%). From the analysis, it is found that the Modified-CNN suits the best for real-time speaker identification applications as the efficacy of the model does not degrade due to the effects of noise and interferences that are caused in the recording environment. Relevance of the work: Speaker Recognition is an area of interest in which ML and DL schemes, when combined, have the potential to make history in the areas of Automation and Authentication. Using a modified CNN can enhance the process by ignoring many issues such as false positives, background noise, and so on. This process can be expanded to create a Raga Identification and Therapy mechanism that can be used to treat diseases.

KEYWORDS:

Disclosure statement

No potential conflict of interest was reported by the author(s).

Reprints and Corporate Permissions

Please note: Selecting permissions does not provide access to the full text of the article, please see our help page How do I view content?

To request a reprint or corporate permissions for this article, please click on the relevant link below:

Order Reprints Request Corporate Permissions

Academic Permissions

Please note: Selecting permissions does not provide access to the full text of the article, please see our help page How do I view content?

Obtain permissions instantly via Rightslink by clicking on the button below:

Request Academic Permissions

If you are unable to obtain permissions via Rightslink, please complete and submit this Permissions form. For more information, please visit our Permissions help page.

Modified layer deep convolution neural network for text-independent speaker recognition

Information for

Open access

Opportunities

Help and information

Modified layer deep convolution neural network for text-independent speaker recognition

ABSTRACT

Disclosure statement

Reprints and Corporate Permissions

Academic Permissions

Related research

To cite this article:

Download citation

Information for

Open access

Opportunities

Help and information

Keep up to date

Your download is now in progress and you may close this window

Login or register to access this feature