TY - JOUR
T1 - An Elliptical Modeling Supported System for Human Action Deep Recognition Over Aerial Surveillance
AU - Azmat, Usman
AU - Alotaibi, Saud S.
AU - Mudawi, Naif Al
AU - Alabduallah, Bayan Ibrahimm
AU - Alonazi, Mohammed
AU - Jalal, Ahmad
AU - Park, Jeongmin
N1 - Publisher Copyright:
© 2013 IEEE.
PY - 2023
Y1 - 2023
N2 - The advancement of computer vision technology has led to the development of sophisticated algorithms capable of accurately recognizing human actions from red-green-blue videos recorded by drone cameras. Despite its exceptional potential, human action recognition still faces many challenges, including the tendency of humans to perform the same action in different ways, limited camera angles, and a restricted field of view. In this research article, a system is proposed to tackle the aforementioned challenges using red-green-blue videos recorded by drone cameras as input. First, each video was split into its constituent frames, and gamma correction was applied to each frame to obtain an optimized version of the image. Felzenszwalb's algorithm then segmented the human out of the input image, and a human silhouette was generated. Utilizing the silhouette, a skeleton was extracted to locate thirteen body key points. The key points were then used to perform elliptical modeling, governed by the Gaussian mixture model-expectation maximization algorithm, to estimate the individual boundaries of the body parts. The elliptical models of the body parts were used to locate fiducial points that, when tracked, provide useful information about the performed action. Other features extracted for this study include a 3D point cloud feature vector, the relative distances and velocities of the key points, and their mutual angles. The features were then optimized using quadratic discriminant analysis, and finally a convolutional neural network was trained to perform the action classification. Three benchmark datasets, the Drone-Action dataset, the UAV-Human dataset, and the Okutama-Action dataset, were used for comprehensive experimentation. The system outperformed state-of-the-art approaches, securing accuracies of 80.03%, 48.60%, and 78.01% on the Drone-Action, UAV-Human, and Okutama-Action datasets, respectively.
AB - The advancement of computer vision technology has led to the development of sophisticated algorithms capable of accurately recognizing human actions from red-green-blue videos recorded by drone cameras. Despite its exceptional potential, human action recognition still faces many challenges, including the tendency of humans to perform the same action in different ways, limited camera angles, and a restricted field of view. In this research article, a system is proposed to tackle the aforementioned challenges using red-green-blue videos recorded by drone cameras as input. First, each video was split into its constituent frames, and gamma correction was applied to each frame to obtain an optimized version of the image. Felzenszwalb's algorithm then segmented the human out of the input image, and a human silhouette was generated. Utilizing the silhouette, a skeleton was extracted to locate thirteen body key points. The key points were then used to perform elliptical modeling, governed by the Gaussian mixture model-expectation maximization algorithm, to estimate the individual boundaries of the body parts. The elliptical models of the body parts were used to locate fiducial points that, when tracked, provide useful information about the performed action. Other features extracted for this study include a 3D point cloud feature vector, the relative distances and velocities of the key points, and their mutual angles. The features were then optimized using quadratic discriminant analysis, and finally a convolutional neural network was trained to perform the action classification. Three benchmark datasets, the Drone-Action dataset, the UAV-Human dataset, and the Okutama-Action dataset, were used for comprehensive experimentation. The system outperformed state-of-the-art approaches, securing accuracies of 80.03%, 48.60%, and 78.01% on the Drone-Action, UAV-Human, and Okutama-Action datasets, respectively.
KW - Classification
KW - deep learning
KW - drone
KW - Felzenszwalb's segmentation
KW - Gaussian mixture model
KW - human action recognition
UR - http://www.scopus.com/inward/record.url?scp=85153367255&partnerID=8YFLogxK
U2 - 10.1109/ACCESS.2023.3266774
DO - 10.1109/ACCESS.2023.3266774
M3 - Article
AN - SCOPUS:85153367255
SN - 2169-3536
VL - 11
SP - 75671
EP - 75685
JO - IEEE Access
JF - IEEE Access
ER -