TY - JOUR
T1 - An optimized multi-scale convolutional autoencoder for efficient abnormal event detection using RGB, depth and optical flow data
AU - Alqahtani, Abdullah
N1 - Publisher Copyright: © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2025.
PY - 2025/8
Y1 - 2025/8
AB - In this study, we propose a novel framework for detecting abnormal events in surveillance videos, a critical yet challenging task in security applications. This research introduces a robust and efficient solution for video anomaly detection, offering substantial improvements in surveillance systems' ability to detect abnormal events, thereby contributing to enhanced security measures in public spaces. The proposed framework utilizes a Multiscale Convolutional Autoencoder (MSCAE) that processes inputs from RGB, depth, and optical flow video clips, enhancing the detection accuracy in complex scenes characterized by varying object scales, aspect ratios, and occlusions. To address the challenge of noise and preserve edges in video data, we implement a two-pass bilateral smooth filtering method, which is effective for noise-invariant, edge-preserving image smoothing. For object detection within these complex scenes, an enhanced Faster R-CNN model is employed. This model's performance is further refined through transfer learning on a dataset specifically composed of abnormal event videos. We also introduce significant improvements to the region proposal network (RPN) of the Faster R-CNN, particularly in non-maximum suppression (NMS) and anchor generation techniques, to better detect anomalies in diverse and complex environments. Furthermore, the MSCAE is integrated with Long Short-Term Memory (LSTM) neural networks to classify the detected anomalies, creating an end-to-end solution for video anomaly detection. Hyperparameter optimization for our deep learning models is performed using the Chameleon Swarm Algorithm, ensuring optimal model performance. Our framework was rigorously tested on the CUHK Avenue dataset, where it achieved a remarkable 99.5% accuracy, significantly outperforming existing methods and demonstrating the effectiveness of our approach.
KW - Abnormal event detection
KW - Deep learning
KW - Feature fusion
KW - Key frame extraction
KW - Object detection
KW - Optimization algorithm
KW - Video anomaly detection
UR - http://www.scopus.com/inward/record.url?scp=85217164149&partnerID=8YFLogxK
U2 - 10.1007/s11042-025-20608-5
DO - 10.1007/s11042-025-20608-5
M3 - Article
AN - SCOPUS:85217164149
SN - 1380-7501
VL - 84
SP - 34401
EP - 34435
JO - Multimedia Tools and Applications
JF - Multimedia Tools and Applications
IS - 28
ER -