Real-Time Violence Detection in Surveillance Streams
Avi Verma
Abstract
The escalating threat of violence in public spaces necessitates scalable, automated, real-time detection systems. This study introduces a deep learning-based framework for real-time violence detection in surveillance streams, built on a fine-tuned DenseNet121 convolutional neural network optimized for processing Real Time Streaming Protocol (RTSP) feeds. Trained on a curated subset of the UCF-Crime dataset, the model achieves 92% accuracy and a weighted F1-score of 0.91. The system integrates OpenCV for frame capture, Flask for visualization, MongoDB for metadata management, and Dropbox for cloud storage, and processes multiple RTSP streams concurrently at 30 fps on a T4 GPU. This end-to-end pipeline offers a practical solution for smart city surveillance, transportation hubs, and institutional security, demonstrating scalability, robustness, and deployability. This manuscript extends our earlier work, which was shared as preprints on SSRN, TechRxiv, and Zenodo [1-3] to promote open science and reproducibility. The complete source code, model files, and deployment instructions for the proposed real-time violence detection system are available at: GITHUB and dataset at: DATASET.
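For readers unfamiliar with the weighted F1-score reported above, the following is a minimal sketch of how such a metric is commonly computed: the per-class F1 scores are averaged with weights proportional to each class's number of true samples. The function name `weighted_f1`, the class labels, and the toy predictions are assumptions made for illustration only; they are not the study's evaluation code or data.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores (hypothetical helper)."""
    support = Counter(y_true)          # number of true samples per class
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += (n / total) * f1      # weight each class by its support
    return score


# Illustrative labels only; these are NOT results from the study.
y_true = ["violent"] * 4 + ["normal"] * 6
y_pred = ["violent", "violent", "violent", "normal",
          "normal", "normal", "normal", "normal", "normal", "violent"]

print(f"{weighted_f1(y_true, y_pred):.2f}")  # 0.80
```

In this toy case the "violent" class has an F1 of 0.75 and the "normal" class an F1 of about 0.83, so the support-weighted average is 0.4 × 0.75 + 0.6 × 0.83 ≈ 0.80. Weighting by support keeps the score from being dominated by whichever class happens to be easier when the classes are imbalanced.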

