Multi-Layered Deep Learning Features Fusion for Human Action Recognition

Sadia Kiran; Muhammad Attique Khan; Muhammad Younus Javed; Majed Alhaisoni; Usman Tariq; Yunyoung Nam; Robertas Damaševičius; Muhammad Sharif

doi:10.32604/cmc.2021.017800

Management Information Systems

Research output: Contribution to journal › Article › peer-review

49 Scopus citations

Abstract

Human Action Recognition (HAR) has been an active research topic in machine learning for the last few decades. The main applications of action recognition are visual surveillance, robotics, and pedestrian detection. Computer vision researchers have introduced many HAR techniques, but these still face challenges such as redundant features and high computational cost. In this article, we propose a new deep learning method for HAR. Video frames are first pre-processed using a global contrast approach and then used to train a deep learning model through domain transfer learning. A pre-trained ResNet-50 model serves as the deep learning model in this work. Features are extracted from two layers: Global Average Pool (GAP) and Fully Connected (FC). The features of both layers are fused using Canonical Correlation Analysis (CCA). Features are then selected using a Shannon entropy-based threshold function. The selected features are finally passed to multiple classifiers for final classification. Experiments are conducted on five publicly available datasets: IXMAS, UCF Sports, YouTube, UT-Interaction, and KTH. The accuracy on these datasets was 89.6%, 99.7%, 100%, 96.7%, and 96.6%, respectively. Comparison with existing techniques shows that the proposed method provides improved accuracy for HAR. The proposed method is also computationally fast in terms of execution time.
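The fusion and selection steps named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give the exact CCA formulation or the form of the entropy threshold, so the code below assumes a standard regularized CCA, a histogram-based Shannon entropy per feature, and a mean-entropy cutoff. The function names (`cca_fuse`, `shannon_entropy`, `select_by_entropy`), the feature dimensions, and the random arrays standing in for GAP and FC features are all hypothetical.

```python
import numpy as np


def cca_fuse(X, Y, k, reg=1e-6):
    """Project X and Y onto their top-k canonical directions and concatenate.

    Standard regularized CCA (an assumption, not the paper's exact recipe).
    Returns the fused features and the k canonical correlations.
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    n = X.shape[0]
    Sxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)
    # Whiten with Cholesky factors: M = Lx^{-1} Sxy Ly^{-T}
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)
    M = np.linalg.solve(Lx, Sxy)
    M = np.linalg.solve(Ly, M.T).T
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    # Map singular vectors back to canonical weight vectors
    Wx = np.linalg.solve(Lx.T, U[:, :k])
    Wy = np.linalg.solve(Ly.T, Vt.T[:, :k])
    return np.hstack([Xc @ Wx, Yc @ Wy]), s[:k]


def shannon_entropy(values, bins=16):
    """Shannon entropy (bits) of one feature, estimated from a histogram."""
    counts, _ = np.histogram(values, bins=bins)
    p = counts[counts > 0] / counts.sum()
    return float(-(p * np.log2(p)).sum())


def select_by_entropy(F, bins=16):
    """Keep features whose entropy is at least the mean entropy (assumed cutoff)."""
    H = np.array([shannon_entropy(F[:, j], bins) for j in range(F.shape[1])])
    keep = H >= H.mean()
    return F[:, keep], keep


# Synthetic stand-ins for per-frame GAP and FC features (hypothetical sizes).
rng = np.random.default_rng(0)
n, k = 200, 8
latent = rng.normal(size=(n, k))
gap_feats = latent @ rng.normal(size=(k, 32)) + 0.5 * rng.normal(size=(n, 32))
fc_feats = latent @ rng.normal(size=(k, 16)) + 0.5 * rng.normal(size=(n, 16))

fused, corrs = cca_fuse(gap_feats, fc_feats, k)
selected, keep = select_by_entropy(fused)
print(fused.shape, selected.shape)
```

In a real pipeline the two input matrices would come from the network's GAP and FC layers, and `selected` would be passed to the classifiers; the cutoff rule is the part most likely to differ from the published method.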

Original language	English
Pages (from-to)	4061-4075
Number of pages	15
Journal	Computers, Materials and Continua
Volume	69
Issue number	3
DOIs	https://doi.org/10.32604/cmc.2021.017800
State	Published - 2021

Keywords

Action recognition
Classification
Features fusion
Features selection
Transfer learning

Access to Document

10.32604/cmc.2021.017800

Cite this

@article{5b642326ec66447cb2354ed8ac3fa223,

title = "Multi-Layered Deep Learning Features Fusion for Human Action Recognition",

abstract = "Human Action Recognition (HAR) has been an active research topic in machine learning for the last few decades. The main applications of action recognition are visual surveillance, robotics, and pedestrian detection. Computer vision researchers have introduced many HAR techniques, but these still face challenges such as redundant features and high computational cost. In this article, we propose a new deep learning method for HAR. Video frames are first pre-processed using a global contrast approach and then used to train a deep learning model through domain transfer learning. A pre-trained ResNet-50 model serves as the deep learning model in this work. Features are extracted from two layers: Global Average Pool (GAP) and Fully Connected (FC). The features of both layers are fused using Canonical Correlation Analysis (CCA). Features are then selected using a Shannon entropy-based threshold function. The selected features are finally passed to multiple classifiers for final classification. Experiments are conducted on five publicly available datasets: IXMAS, UCF Sports, YouTube, UT-Interaction, and KTH. The accuracy on these datasets was 89.6\%, 99.7\%, 100\%, 96.7\%, and 96.6\%, respectively. Comparison with existing techniques shows that the proposed method provides improved accuracy for HAR. The proposed method is also computationally fast in terms of execution time.",

keywords = "Action recognition, Classification, Features fusion, Features selection, Transfer learning",

author = "Sadia Kiran and Khan, {Muhammad Attique} and Javed, {Muhammad Younus} and Majed Alhaisoni and Usman Tariq and Yunyoung Nam and Robertas Dama{\v s}evi{\v c}ius and Muhammad Sharif",

year = "2021",

doi = "10.32604/cmc.2021.017800",

language = "English",

volume = "69",

pages = "4061--4075",

journal = "Computers, Materials and Continua",

issn = "1546-2218",

publisher = "Tech Science Press",

number = "3",

}

TY - JOUR

T1 - Multi-Layered Deep Learning Features Fusion for Human Action Recognition

AU - Kiran, Sadia

AU - Khan, Muhammad Attique

AU - Javed, Muhammad Younus

AU - Alhaisoni, Majed

AU - Tariq, Usman

AU - Nam, Yunyoung

AU - Damaševičius, Robertas

AU - Sharif, Muhammad

PY - 2021

Y1 - 2021

N2 - Human Action Recognition (HAR) has been an active research topic in machine learning for the last few decades. The main applications of action recognition are visual surveillance, robotics, and pedestrian detection. Computer vision researchers have introduced many HAR techniques, but these still face challenges such as redundant features and high computational cost. In this article, we propose a new deep learning method for HAR. Video frames are first pre-processed using a global contrast approach and then used to train a deep learning model through domain transfer learning. A pre-trained ResNet-50 model serves as the deep learning model in this work. Features are extracted from two layers: Global Average Pool (GAP) and Fully Connected (FC). The features of both layers are fused using Canonical Correlation Analysis (CCA). Features are then selected using a Shannon entropy-based threshold function. The selected features are finally passed to multiple classifiers for final classification. Experiments are conducted on five publicly available datasets: IXMAS, UCF Sports, YouTube, UT-Interaction, and KTH. The accuracy on these datasets was 89.6%, 99.7%, 100%, 96.7%, and 96.6%, respectively. Comparison with existing techniques shows that the proposed method provides improved accuracy for HAR. The proposed method is also computationally fast in terms of execution time.

AB - Human Action Recognition (HAR) has been an active research topic in machine learning for the last few decades. The main applications of action recognition are visual surveillance, robotics, and pedestrian detection. Computer vision researchers have introduced many HAR techniques, but these still face challenges such as redundant features and high computational cost. In this article, we propose a new deep learning method for HAR. Video frames are first pre-processed using a global contrast approach and then used to train a deep learning model through domain transfer learning. A pre-trained ResNet-50 model serves as the deep learning model in this work. Features are extracted from two layers: Global Average Pool (GAP) and Fully Connected (FC). The features of both layers are fused using Canonical Correlation Analysis (CCA). Features are then selected using a Shannon entropy-based threshold function. The selected features are finally passed to multiple classifiers for final classification. Experiments are conducted on five publicly available datasets: IXMAS, UCF Sports, YouTube, UT-Interaction, and KTH. The accuracy on these datasets was 89.6%, 99.7%, 100%, 96.7%, and 96.6%, respectively. Comparison with existing techniques shows that the proposed method provides improved accuracy for HAR. The proposed method is also computationally fast in terms of execution time.

KW - Action recognition

KW - Classification

KW - Features fusion

KW - Features selection

KW - Transfer learning

UR - http://www.scopus.com/inward/record.url?scp=85115907695&partnerID=8YFLogxK

U2 - 10.32604/cmc.2021.017800

DO - 10.32604/cmc.2021.017800

M3 - Article

AN - SCOPUS:85115907695

SN - 1546-2218

VL - 69

SP - 4061

EP - 4075

JO - Computers, Materials and Continua

JF - Computers, Materials and Continua

IS - 3

ER -
