An optimized multi-scale convolutional autoencoder for efficient abnormal event detection using rgb, depth and optical flow data

Research output: Contribution to journal › Article › peer-review

Abstract

In this study, we propose a framework for detecting abnormal events in surveillance videos, a critical yet challenging task for security in public spaces. The framework uses a Multiscale Convolutional Autoencoder (MSCAE) that processes RGB, depth, and optical flow video clips, which improves detection accuracy in complex scenes characterized by varying object scales, aspect ratios, and occlusions. To suppress noise while preserving edges in the video data, we apply a two-pass bilateral smoothing filter. For object detection within these scenes, we employ an enhanced Faster R-CNN model, refined through transfer learning on a dataset composed specifically of abnormal event videos. We also modify the region proposal network (RPN) of the Faster R-CNN, particularly its non-maximum suppression (NMS) and anchor generation, to better detect anomalies in diverse and complex environments. The MSCAE is integrated with Long Short-Term Memory (LSTM) networks to classify the detected anomalies, yielding an end-to-end solution for video anomaly detection. Hyperparameters of the deep learning models are tuned with the Chameleon Swarm Algorithm. On the CUHK Avenue dataset, the framework achieved 99.5% accuracy, outperforming existing methods.
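For readers unfamiliar with the non-maximum suppression step mentioned in the abstract, the following is a minimal sketch of the standard greedy NMS baseline, not the modified variant the authors describe (the abstract gives no details of their changes). The function names `iou` and `greedy_nms`, the `(x1, y1, x2, y2)` box layout, and the default threshold of 0.5 are illustrative assumptions.

```python
from typing import List, Sequence, Tuple

# Assumed box layout for this sketch: (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(
    boxes: Sequence[Box],
    scores: Sequence[float],
    iou_threshold: float = 0.5,  # hypothetical default, not taken from the paper
) -> List[int]:
    """Return indices of the boxes kept by standard greedy NMS.

    Boxes are visited in descending score order; a box is kept only if its
    IoU with every already-kept box does not exceed the threshold.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    proposals = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    confidences = [0.9, 0.8, 0.7]
    # The second box overlaps the first heavily and is suppressed.
    print(greedy_nms(proposals, confidences))
```

In a region proposal network, a step of this kind prunes overlapping proposals before they are passed to the detection head; lowering the threshold suppresses more aggressively, which can hurt in scenes with occluded or closely spaced objects.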

Original language: English
Pages (from-to): 34401-34435
Number of pages: 35
Journal: Multimedia Tools and Applications
Volume: 84
Issue number: 28
State: Published - Aug 2025

Keywords

  • Abnormal event detection
  • Deep learning
  • Feature fusion
  • Key frame extraction
  • Object detection
  • Optimization algorithm
  • Video anomaly detection
