Digital Repository

“PeaceKeeper” : Transformer-Based Multimodal Public Violence Detection System

dc.contributor.author Wickrama Arachchi, Hashini
dc.date.accessioned 2025-06-09T03:26:25Z
dc.date.available 2025-06-09T03:26:25Z
dc.date.issued 2024
dc.identifier.citation Wickrama Arachchi, Hashini (2024) “PeaceKeeper” : Transformer-Based Multimodal Public Violence Detection System. BSc. Dissertation, Informatics Institute of Technology en_US
dc.identifier.issn 20200477
dc.identifier.uri http://dlib.iit.ac.lk/xmlui/handle/123456789/2468
dc.description.abstract Violence is a recurring problem in public places where groups of people gather. When a violent incident occurs, the public has no prompt alerting mechanism and cannot immediately obtain the evidence about the culprits and the affected persons or property that legal investigations require. Violent incidents also unfold quickly and last only a short time, which underlines the need for accurate, real-time violence detection (VD) systems capable of detecting both ongoing and future violence without human intervention. This project proposes a system that combines three architectures from the Transformer family, a recent attention-based technology in the computer vision domain. Video classification for fight detection is implemented using the Video Vision Transformer (ViViT), image classification for weapon classification is built using the Vision Transformer (ViT), and violent audio classification is to be achieved using the Audio Spectrogram Transformer (AST). Currently, the implemented ViViT model achieves an accuracy of 60.0% over 50 epochs on a V100 GPU. The ViT model achieves 99.53% overall accuracy across multiple classes. The AST model achieves an overall accuracy of 59.06% for audio classification. en_US
dc.language.iso en en_US
dc.subject Transformers en_US
dc.subject Violence Detection en_US
dc.subject Video Vision Transformer en_US
dc.title “PeaceKeeper” : Transformer-Based Multimodal Public Violence Detection System en_US
dc.type Thesis en_US

