Voice activity detection for audio signal of voyage data recorder using residue network and attention mechanism

Pages 243-251 | Received 04 Sep 2022, Accepted 09 Dec 2022, Published online: 29 Dec 2022

References

  • Bai S, Yan X. 2021. Low SNR speech endpoint detection based on Mel scale frequency cepstrum coefficient and short time energy. J Nanjing Normal Univ (Nat Sci Ed). 44(2):117–120.
  • Cheliotis M, Lazakis I, Theotokatos G. 2020. Machine learning and data-driven fault detection for ship systems operations. Ocean Eng. 216:107968.
  • Chen H, Tian D, Yang Z, Wu Z, Zhao K, Li D, Ma X. 2016. An improved position errors test method of image recorded by voyage data recorder. In: International Conference on Modelling, Identification and Control. Algiers (Algeria): IEEE. p. 80–84.
  • Davis S, Mermelstein P. 1980. Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences. IEEE Trans Acoust Speech Signal Process. 28(4):357–366.
  • Geng QS, Wang FH, Zhou DX. 2019. Mechanical fault diagnosis of power transformer by GFCC time-frequency map of acoustic signal and convolutional neural network. In: IEEE Sustainable Power and Energy Conference. Beijing (China): IEEE. p. 2106–2110.
  • He K, Zhang X, Ren S, Sun J. 2016. Identity mappings in deep residual networks. In: European Conference on Computer Vision. Amsterdam (The Netherlands): Springer. p. 630–645.
  • Hidayat R, Bejo A, Sumaryono S, Winursito A. 2018. Denoising speech for MFCC feature extraction using wavelet transformation in speech recognition system. In: International Conference on Information Technology and Electrical Engineering. Bali (Indonesia): IEEE. p. 280–284.
  • JP JISC. 2021. Ships and marine technology – guidelines for the operation and installation of voyage data recorder (VDR).
  • Juvela L, Bollepalli B, Wang X, Kameoka H, Airaksinen M, Yamagishi J, Alku P. 2018. Speech waveform synthesis from MFCC sequences with generative adversarial networks. In: IEEE International Conference on Acoustics, Speech and Signal Processing. Calgary (AB): IEEE. p. 5679–5683.
  • Kingma DP, Ba J. 2015. Adam: a method for stochastic optimization. In: Bengio Y, LeCun Y, editors. International Conference on Learning Representations. San Diego (CA): ICLR.
  • Lee Y, Min J, Han DK, Ko H. 2020. Spectro-temporal attention-based voice activity detection. IEEE Signal Proc Lett. 27:131–135.
  • Li Q, Zhu H, Qiao F, Wei Q, Liu X, Yang H. 2018. Energy-efficient MFCC extraction architecture in mixed-signal domain for automatic speech recognition. In: IEEE/ACM International Symposium on Nanoscale Architectures. Athens (Greece): IEEE. p. 1–3.
  • Likitha MS, Gupta SRR, Hasitha K, Raju AU. 2017. Speech based human emotion recognition using MFCC. In: International Conference on Wireless Communications, Signal Processing and Networking. Xiamen (China): IEEE. p. 2257–2260.
  • Liu S, Jiang N, Zhang L, Xu D. 2006. Research on the radar image compression of VDR based on SPIHT. In: International Conference on Mechatronics and Automation. Luoyang (China): IEEE. p. 2357–2361.
  • Maurya A, Kumar D, Agarwal RK. 2018. Speaker recognition for Hindi speech signal using MFCC-GMM approach. Procedia Comput Sci. 125:880–887.
  • Maya BND, Kurt RE. 2022. Marine accident learning with fuzzy cognitive maps: a method to model and weight human-related contributing factors into maritime accidents. Ships Offshore Struct. 17(3):555–563.
  • Mondal A, Prathosh A. 2020. RespVAD: voice activity detection via video-extracted respiration patterns. IEEE Sens Lett. 4(9):1–4.
  • Ning Y, Zhao L, Zhang C, Yuan Z. 2022. STD-Yolov5: a ship-type detection model based on improved Yolov5. Ships Offshore Struct. 17:1–10.
  • Pang J. 2017. Spectrum energy based voice activity detection. In: IEEE Annual Computing and Communication Workshop and Conference. Las Vegas (NV): IEEE. p. 1–5.
  • Sasa K, Chen C, Fujimatsu T, Shoji R, Maki A. 2021. Speed loss analysis and rough wave avoidance algorithms for optimal ship routing simulation of 28,000-DWT bulk carrier. Ocean Eng. 228:108800.
  • Shaw HJ, Lin CK. 2021. Marine big data analysis of ships for the energy efficiency changes of the hull and maintenance evaluation based on the ISO 19030 standard. Ocean Eng. 232:108953.
  • Tan X, Zhang XL. 2021. Speech enhancement aided end-to-end multi-task learning for voice activity detection. In: International Conference on Acoustics, Speech and Signal Processing. Toronto (ON): IEEE. p. 6823–6827.
  • Tapkir PA, Patil AT, Shah N, Patil HA. 2018. Novel spectral root cepstral features for replay spoof detection. In: Asia-Pacific Signal and Information Processing Association Annual Summit and Conference. Honolulu (HI): IEEE. p. 1945–1950.
  • Teo JH, Cheng S, Alioto M. 2020. Low-energy voice activity detection via energy-quality scaling from data conversion to machine learning. IEEE Trans Circuits Syst I Fundam Theory Appl. 67(4):1378–1388.
  • Unnikrishnan MV, Rajan R. 2017. Mimicking voice recognition using MFCC-GMM framework. In: International Conference on Trends in Electronics and Informatics. Tirunelveli (India): IEEE. p. 301–304.
  • Valin JM. 2018. A hybrid DSP/deep learning approach to real-time full-band speech enhancement. In: IEEE 20th International Workshop on Multimedia Signal Processing. Vancouver (BC): IEEE. p. 1–5.
  • Wang H. 2017. Two-step judgment algorithm for robust voice activity detection based on deep neural networks. In: International Conference on Computer Technology, Electronics and Communication. Dalian (China): IEEE. p. 452–455.
  • Wang H, Xu Y, Li M. 2011. Study on the MFCC similarity-based voice activity detection algorithm. In: International Conference on Artificial Intelligence, Management Science and Electronic Commerce. Zhengzhou (China): IEEE. p. 4391–4394.
  • Wu B, Li G, Wang T, Hildre HP, Zhang H. 2021. Sailing status recognition to enhance safety awareness and path routing for a commuter ferry. Ships Offshore Struct. 16(1):1–12.
  • Yang Y, Chen P, Ding K, Chen Z, Hu K. 2022. Object detection of inland waterway ships based on improved SSD model. Ships Offshore Struct. 17(8):1–9.
  • Yang Y, Ding K, Chen Z. 2022. Ship classification based on convolutional neural networks. Ships Offshore Struct. 17(12):2715–2721.
  • Yu H, Zhu WP, Champagne B. 2020. Speech enhancement using a DNN-augmented colored-noise Kalman filter. Speech Commun. 125:142–151.
  • Yu Y, Kim YJ. 2018. A voice activity detection model composed of Bidirectional LSTM and attention mechanism. In: International Conference on Humanoid, Nanotechnology, Information Technology, Communication and Control, Environment and Management. Baguio City (Philippines): IEEE. p. 1–5.
  • Zaw TH, War N. 2017. The combination of spectral entropy, zero crossing rate, short time energy and linear prediction error for voice activity detection. In: International Conference of Computer and Information Technology. Dhaka (Bangladesh): IEEE. p. 1–5.
  • Zhang Y, Xing L. 2016. Research on speech endpoint detection algorithm based on wavelet analysis and PSO-ELM. J North Univ of China (Nat Sci Ed). 37(1):33–38.
  • Zhao X, Shao Y, Wang D. 2012. CASA-based robust speaker identification. IEEE Trans Audio Speech Lang Process. 20(5):1608–1616.
  • Zhu M, Wu X, Lu Z, Wang T, Zhu X. 2019. Long-term speech information based threshold for voice activity detection in massive microphone network. Digit Signal Process. 94:156–164.
