Harris Hawks Sparse Auto-Encoder Networks for Automatic Speech Recognition System

Mohammed Hasan Ali, Mustafa Musa Jaber, Sura Khalil Abd, Amjad Rehman, Mazhar Javed Awan, Daiva Vitkutė-Adžgauskienė, Robertas Damaševičius, Saeed Ali Bahaj

Research output: Contribution to journal › Article › peer-review

34 Scopus citations

Abstract

Automatic speech recognition (ASR) converts human speech into text or computer actions. ASR systems are widely used in smart appliances, smart homes, and biometric systems, and they combine signal processing with machine learning techniques to recognize speech. However, traditional systems perform poorly in noisy environments, and accents and regional differences further degrade their performance when analyzing speech signals. To overcome these issues, this paper develops a more precise speech recognition system. Speech data from the jim-schwoebel voice datasets are processed with Mel-frequency cepstral coefficients (MFCCs), which extract the features used to recognize speech. A sparse auto-encoder (SAE) neural network then classifies the extracted features, and a hidden Markov model (HMM) makes the final speech recognition decision. The Harris Hawks optimization (HHO) algorithm fine-tunes the network parameters to optimize performance. The fine-tuned network can effectively recognize speech in a noisy environment.
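
To make the feature-extraction step more concrete, the following is a minimal sketch of how MFCCs can be computed for a single audio frame. It is not the authors' implementation: the frame length (256 samples), sample rate (8000 Hz), number of mel filters (20), number of coefficients (13), the Hamming window, and the function names are all illustrative assumptions, and the mel formula is the common 2595·log10(1 + f/700) variant. It uses only the Python standard library and a naive DFT, so it favors readability over speed.

```python
import math

def hz_to_mel(f_hz):
    # Common mel-scale formula (an assumption; other variants exist).
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def power_spectrum(frame):
    """Power spectrum (bins 0..N/2) of a Hamming-windowed frame via a naive DFT."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2.0 * math.pi * k * i / n) for i, x in enumerate(windowed))
        im = -sum(x * math.sin(2.0 * math.pi * k * i / n) for i, x in enumerate(windowed))
        spectrum.append((re * re + im * im) / n)
    return spectrum

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose edges are evenly spaced on the mel scale."""
    mel_max = hz_to_mel(sample_rate / 2.0)
    # n_filters + 2 edge frequencies from 0 Hz up to the Nyquist frequency.
    edges_hz = [mel_to_hz(mel_max * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_fft // 2 + 1)]
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = edges_hz[j - 1], edges_hz[j], edges_hz[j + 1]
        row = []
        for f in bin_hz:
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))   # rising slope
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))   # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank

def mfcc_frame(frame, sample_rate, n_filters=20, n_coeffs=13):
    """MFCCs of one frame: power spectrum -> mel energies -> log -> DCT-II."""
    spectrum = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    # Small constant avoids log(0) for empty or silent bands.
    log_energies = [math.log(sum(w * p for w, p in zip(row, spectrum)) + 1e-10)
                    for row in bank]
    return [sum(e * math.cos(math.pi * c * (j + 0.5) / n_filters)
                for j, e in enumerate(log_energies))
            for c in range(n_coeffs)]

# Usage: one 256-sample frame of a 440 Hz tone sampled at 8000 Hz (synthetic data).
frame = [math.sin(2.0 * math.pi * 440.0 * i / 8000.0) for i in range(256)]
coeffs = mfcc_frame(frame, sample_rate=8000)
print(len(coeffs))
```

In a full pipeline, a recording would be split into overlapping frames, and the per-frame coefficient vectors would form the input to the classifier. A production system would more likely use an FFT-based library routine than the naive DFT shown here.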

Original language: English
Article number: 1091
Journal: Applied Sciences (Switzerland)
Volume: 12
Issue number: 3
State: Published - 1 Feb 2022

Keywords

  • Automatic speech recognition
  • Hidden Markov model
  • Mel-frequency cepstral coefficients
  • Natural language processing
  • Sparse auto-encoder neural network
  • Speech recognition
