Remote Sensing Surveillance Using Multilevel Feature Fusion and Deep Neural Network

Laiba Zahoor, Haifa F. Alhasson, Mohammed Alnusayri, Mohammed Alatiyyah, Dina Abdulaziz Alhammadi, Ahmad Jalal, Hui Liu

Research output: Contribution to journal › Article › peer-review

3 Scopus citations

Abstract

Human action recognition from aerial imagery poses significant challenges due to the dynamic nature of the scenes and the complexity of human movements. In this paper, we present an enhanced system that combines YOLO for human detection with a complete multilevel feature fusion approach to improve recognition of human actions in drone-captured images. Our system delivers a reliable drone-based human action recognition pipeline by integrating state-of-the-art methods for multilevel feature extraction and object detection. Initially, frames are extracted individually from drone footage sequences. Preprocessing techniques, including Gaussian blur, grayscale conversion, and background removal, are applied to every frame to improve image quality and feature reliability. For object detection, we locate and recognize human subjects in these aerial frames using the YOLO approach. The framework then extracts 14 body landmarks representing the shape of the human body via keypoint extraction. Four significant features are employed to capture the complexity of human movement effectively: the incorporation of 3D point cloud data adds depth to the image and makes it feasible to construct a more detailed three-dimensional representation; measuring the angles between keypoints provides significant detail on joint orientations, which is essential for posture analysis; and geodesic distances measure the shortest paths along the surface of the body, offering useful insight into the spatial relationships between keypoints. The extracted features are optimized using quadratic discriminant analysis. Finally, a deep neural network is trained to perform the action classification. Three benchmark datasets, the UAV Gesture, UAV Human, and UCF-ARG datasets, were used for our experiments and system testing. Our model achieved action recognition accuracies of 90.15%, 72.37%, and 76.50% on these datasets, respectively.
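To make the joint-angle feature mentioned in the abstract concrete, the following is a minimal sketch of computing the angle at a keypoint from two adjacent keypoints. The function name, the 2D coordinates, and the hip/knee/ankle example are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def joint_angle(a, b, c):
    """Angle (degrees) at keypoint b, formed by the segments b->a and b->c.

    Works for 2D or 3D keypoint coordinates.
    """
    a, b, c = (np.asarray(p, dtype=float) for p in (a, b, c))
    v1, v2 = a - b, c - b
    cos = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
    # Clip guards against floating-point values slightly outside [-1, 1]
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

# Hypothetical example: a right angle at the "knee" keypoint
print(joint_angle((0.0, 1.0), (0.0, 0.0), (1.0, 0.0)))  # 90.0
```

In a full pipeline, angles like this would be computed over triples of the 14 extracted landmarks and concatenated with the other features before classification.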

Original language: English
Pages (from-to): 38282-38300
Number of pages: 19
Journal: IEEE Access
Volume: 13
DOIs
State: Published - 2025

Keywords

  • Human action recognition
  • aerial imaging
  • body pose
  • deep learning
  • image analysis
  • multilevel feature fusion
  • object detectors
