TY - JOUR
T1 - An Elliptical Modeling Supported System for Human Action Deep Recognition Over Aerial Surveillance
AU - Azmat, Usman
AU - Alotaibi, Saud S.
AU - Mudawi, Naif Al
AU - Alabduallah, Bayan Ibrahimm
AU - Alonazi, Mohammed
AU - Jalal, Ahmad
AU - Park, Jeongmin
N1 - Publisher Copyright:
© 2013 IEEE.
PY - 2023
Y1 - 2023
N2 - The advancement of computer vision technology has led to the development of sophisticated algorithms capable of accurately recognizing human actions from red-green-blue videos recorded by drone cameras. Despite its exceptional potential, human action recognition still faces many challenges, including the tendency of humans to perform the same action in different ways, limited camera angles, and a restricted field of view. In this research article, a system is proposed to tackle the aforementioned challenges using red-green-blue videos recorded by drone cameras as input. First, each video was split into its constituent frames, and gamma correction was applied to each frame to obtain an optimized version of the image. Felzenszwalb's algorithm then segmented the human out of the input image, and a human silhouette was generated. Utilizing the silhouette, a skeleton was extracted to locate thirteen body key points. The key points were then used to perform elliptical modeling, governed by the Gaussian mixture model-expectation maximization algorithm, to estimate the individual boundaries of the body parts. The elliptical models of the body parts were used to locate fiducial points that, when tracked, provide useful information about the performed action. Other features extracted for this study include a 3D point cloud feature vector, the relative distances and velocities of the key points, and their mutual angles. The features were then optimized using quadratic discriminant analysis, and finally a convolutional neural network was trained to perform the action classification. Three benchmark datasets, the Drone-Action dataset, the UAV-Human dataset, and the Okutama-Action dataset, were used for comprehensive experimentation. The system outperformed state-of-the-art approaches, securing accuracies of 80.03%, 48.60%, and 78.01% on the Drone-Action, UAV-Human, and Okutama-Action datasets, respectively.
AB - The advancement of computer vision technology has led to the development of sophisticated algorithms capable of accurately recognizing human actions from red-green-blue videos recorded by drone cameras. Despite its exceptional potential, human action recognition still faces many challenges, including the tendency of humans to perform the same action in different ways, limited camera angles, and a restricted field of view. In this research article, a system is proposed to tackle the aforementioned challenges using red-green-blue videos recorded by drone cameras as input. First, each video was split into its constituent frames, and gamma correction was applied to each frame to obtain an optimized version of the image. Felzenszwalb's algorithm then segmented the human out of the input image, and a human silhouette was generated. Utilizing the silhouette, a skeleton was extracted to locate thirteen body key points. The key points were then used to perform elliptical modeling, governed by the Gaussian mixture model-expectation maximization algorithm, to estimate the individual boundaries of the body parts. The elliptical models of the body parts were used to locate fiducial points that, when tracked, provide useful information about the performed action. Other features extracted for this study include a 3D point cloud feature vector, the relative distances and velocities of the key points, and their mutual angles. The features were then optimized using quadratic discriminant analysis, and finally a convolutional neural network was trained to perform the action classification. Three benchmark datasets, the Drone-Action dataset, the UAV-Human dataset, and the Okutama-Action dataset, were used for comprehensive experimentation. The system outperformed state-of-the-art approaches, securing accuracies of 80.03%, 48.60%, and 78.01% on the Drone-Action, UAV-Human, and Okutama-Action datasets, respectively.
KW - Classification
KW - deep learning
KW - drone
KW - Felzenszwalb's segmentation
KW - Gaussian mixture model
KW - human action recognition
UR - http://www.scopus.com/inward/record.url?scp=85153367255&partnerID=8YFLogxK
U2 - 10.1109/ACCESS.2023.3266774
DO - 10.1109/ACCESS.2023.3266774
M3 - Article
AN - SCOPUS:85153367255
SN - 2169-3536
VL - 11
SP - 75671
EP - 75685
JO - IEEE Access
JF - IEEE Access
ER -