TY - JOUR
T1 - Remote Sensing Surveillance Using Multilevel Feature Fusion and Deep Neural Network
AU - Zahoor, Laiba
AU - Alhasson, Haifa F.
AU - Alnusayri, Mohammed
AU - Alatiyyah, Mohammed
AU - Alhammadi, Dina Abdulaziz
AU - Jalal, Ahmad
AU - Liu, Hui
N1 - Publisher Copyright:
© 2013 IEEE.
PY - 2025
Y1 - 2025
N2 - Human action recognition from aerial imagery poses significant challenges due to the dynamic nature of the scenes and the complexity of human movements. In this paper, we present an enhanced system that combines YOLO for human detection with a comprehensive multilevel feature fusion approach to improve recognition of human actions in drone-captured images. Our system provides reliable drone-based human action recognition through the integration of state-of-the-art methods for multilevel feature extraction and object detection. Initially, frames are extracted individually from drone footage sequences. Preprocessing techniques, including Gaussian blur, grayscale conversion, and background removal, are applied to every frame to improve image quality and feature reliability. For object detection, we effectively locate and recognize human subjects in these aerial frames using the YOLO approach. Afterward, the framework extracts 14 body landmarks that represent the shape of the human body via keypoint extraction. Four significant features are employed to capture the complexity of human movement effectively: the incorporation of 3D point cloud data adds depth to the image and makes it feasible to construct a more detailed three-dimensional representation; measuring the angles between keypoints provides significant details on joint orientations, which are essential for posture analysis; and geodesic distances measure the shortest paths along the surface of the body to provide useful insight into the spatial relationships between keypoints. The extracted features are optimized using quadratic discriminant analysis. Finally, a deep neural network is trained to perform the action classification. Three benchmark datasets, the UAV Gesture, UAV Human, and UCF-ARG datasets, were used for our experiments and system testing. Our model achieved action recognition accuracies of 90.15%, 72.37%, and 76.50% on these datasets, respectively.
AB - Human action recognition from aerial imagery poses significant challenges due to the dynamic nature of the scenes and the complexity of human movements. In this paper, we present an enhanced system that combines YOLO for human detection with a comprehensive multilevel feature fusion approach to improve recognition of human actions in drone-captured images. Our system provides reliable drone-based human action recognition through the integration of state-of-the-art methods for multilevel feature extraction and object detection. Initially, frames are extracted individually from drone footage sequences. Preprocessing techniques, including Gaussian blur, grayscale conversion, and background removal, are applied to every frame to improve image quality and feature reliability. For object detection, we effectively locate and recognize human subjects in these aerial frames using the YOLO approach. Afterward, the framework extracts 14 body landmarks that represent the shape of the human body via keypoint extraction. Four significant features are employed to capture the complexity of human movement effectively: the incorporation of 3D point cloud data adds depth to the image and makes it feasible to construct a more detailed three-dimensional representation; measuring the angles between keypoints provides significant details on joint orientations, which are essential for posture analysis; and geodesic distances measure the shortest paths along the surface of the body to provide useful insight into the spatial relationships between keypoints. The extracted features are optimized using quadratic discriminant analysis. Finally, a deep neural network is trained to perform the action classification. Three benchmark datasets, the UAV Gesture, UAV Human, and UCF-ARG datasets, were used for our experiments and system testing. Our model achieved action recognition accuracies of 90.15%, 72.37%, and 76.50% on these datasets, respectively.
KW - Human action recognition
KW - aerial imaging
KW - body pose
KW - deep learning
KW - image analysis
KW - multilevel feature fusion
KW - object detectors
UR - http://www.scopus.com/inward/record.url?scp=105001059056&partnerID=8YFLogxK
U2 - 10.1109/ACCESS.2025.3542435
DO - 10.1109/ACCESS.2025.3542435
M3 - Article
AN - SCOPUS:105001059056
SN - 2169-3536
VL - 13
SP - 38282
EP - 38300
JO - IEEE Access
JF - IEEE Access
ER -