TY - JOUR
T1 - A swin transformer-driven framework for gesture recognition to assist hearing impaired people by integrating deep learning with secretary bird optimization algorithm
AU - Assiri, Mohammed S.
AU - Selim, Mahmoud M.
N1 - Publisher Copyright: © 2025 The Author(s)
PY - 2025/5
Y1 - 2025/5
N2 - Hand gestures (HG) are the key communication technique for hearing-impaired people, which poses a problem for millions of individuals globally when communicating with those who do not have hearing impairments. The importance of technology in improving accessibility, and thus raising the standard of living for persons with hearing impairments, is widely recognized. Machine learning (ML) is a branch of artificial intelligence (AI) that concentrates on developing methods that learn from data. The major problem of HG recognition is that machines cannot interpret human language directly, so human–machine interaction requires a communication medium that both machines and humans can interpret in order to assist hearing-impaired individuals and ageing people. Thus, HG recognition as a communication medium is necessary to provide instructions to the computer. This paper proposes the Swin Transformer-Driven Framework for Gesture Recognition by Integrating Deep Learning with the Secretary Bird Optimization (STFGR-IDLSBO) methodology. The main intention of the STFGR-IDLSBO methodology is to develop an efficient and robust system for gesture recognition to assist hearing-impaired persons. Initially, the proposed STFGR-IDLSBO method utilizes adaptive bilateral filtering (ABF) in the image pre-processing stage to reduce noise while preserving the edges of the gestures in the captured images. Next, the Swin Transformer (ST) serves as the feature extractor, capturing multiscale representations and spatial hierarchies from gesture images. A hybrid model integrating a convolutional neural network and bi-directional long short-term memory (CNN-BiLSTM) is then employed for gesture classification. Finally, the secretary bird optimizer algorithm (SBOA) is utilized for optimal hyperparameter tuning of the CNN-BiLSTM classifier. To validate the performance of the STFGR-IDLSBO methodology, a wide-ranging simulation study is performed on the Traffic Police Gesture dataset. The validation showed that the STFGR-IDLSBO technique achieves a superior accuracy of 99.25% compared with existing methods.
AB - Hand gestures (HG) are the key communication technique for hearing-impaired people, which poses a problem for millions of individuals globally when communicating with those who do not have hearing impairments. The importance of technology in improving accessibility, and thus raising the standard of living for persons with hearing impairments, is widely recognized. Machine learning (ML) is a branch of artificial intelligence (AI) that concentrates on developing methods that learn from data. The major problem of HG recognition is that machines cannot interpret human language directly, so human–machine interaction requires a communication medium that both machines and humans can interpret in order to assist hearing-impaired individuals and ageing people. Thus, HG recognition as a communication medium is necessary to provide instructions to the computer. This paper proposes the Swin Transformer-Driven Framework for Gesture Recognition by Integrating Deep Learning with the Secretary Bird Optimization (STFGR-IDLSBO) methodology. The main intention of the STFGR-IDLSBO methodology is to develop an efficient and robust system for gesture recognition to assist hearing-impaired persons. Initially, the proposed STFGR-IDLSBO method utilizes adaptive bilateral filtering (ABF) in the image pre-processing stage to reduce noise while preserving the edges of the gestures in the captured images. Next, the Swin Transformer (ST) serves as the feature extractor, capturing multiscale representations and spatial hierarchies from gesture images. A hybrid model integrating a convolutional neural network and bi-directional long short-term memory (CNN-BiLSTM) is then employed for gesture classification. Finally, the secretary bird optimizer algorithm (SBOA) is utilized for optimal hyperparameter tuning of the CNN-BiLSTM classifier. To validate the performance of the STFGR-IDLSBO methodology, a wide-ranging simulation study is performed on the Traffic Police Gesture dataset. The validation showed that the STFGR-IDLSBO technique achieves a superior accuracy of 99.25% compared with existing methods.
KW - Deep Learning
KW - Gesture Recognition
KW - Hearing-Impaired Person
KW - Secretary Bird Optimization Algorithm
KW - Swin Transformer
UR - http://www.scopus.com/inward/record.url?scp=105001318522&partnerID=8YFLogxK
U2 - 10.1016/j.asej.2025.103383
DO - 10.1016/j.asej.2025.103383
M3 - Article
AN - SCOPUS:105001318522
SN - 2090-4479
VL - 16
JO - Ain Shams Engineering Journal
JF - Ain Shams Engineering Journal
IS - 6
M1 - 103383
ER -