HazardNet: A Small-Scale Vision Language Model for Real-Time Traffic Safety Detection at Edge Devices

dc.contributor.author: Tami, Mohammad
dc.contributor.author: Abu Elhenawy, Mohammed
dc.contributor.author: Ashqar, Huthaifa
dc.date.accessioned: 2025-10-16T15:27:13Z
dc.date.issued: 2025-02-27
dc.description.abstract: Traffic safety remains a vital concern in contemporary urban settings, intensified by the growing number of vehicles and the complexity of road networks. Traditional safety-critical event detection systems rely predominantly on sensor-based approaches and conventional machine learning algorithms, which require extensive data collection and complex training processes to adhere to traffic safety regulations. This paper introduces HazardNet, a small-scale Vision Language Model designed to enhance traffic safety by leveraging the reasoning capabilities of advanced language and vision models. We built HazardNet by fine-tuning the pre-trained Qwen2-VL-2B model, chosen for its superior performance among open-source alternatives and its compact size of two billion parameters, which facilitates deployment on edge devices with efficient inference throughput. In addition, we present HazardQA, a novel Vision Question Answering (VQA) dataset constructed specifically for training HazardNet on real-world scenarios involving safety-critical events. Our experimental results show that the fine-tuned HazardNet outperforms the base model by up to 89% in F1-score and achieves results comparable to larger models such as GPT-4o, exceeding them by up to 6% in some cases. These advancements underscore the potential of HazardNet to provide real-time, reliable traffic safety event detection, thereby contributing to fewer accidents and improved traffic management in urban environments. Both the HazardNet model and the HazardQA dataset are available at https://huggingface.co/Tami3/HazardNet and https://huggingface.co/datasets/Tami3/HazardQA, respectively.
dc.description.uri: http://arxiv.org/abs/2502.20572
dc.format.extent: 5 pages
dc.genre: journal articles
dc.genre: preprints
dc.identifier: doi:10.13016/m26iwg-txnw
dc.identifier.uri: https://doi.org/10.48550/arXiv.2502.20572
dc.identifier.uri: http://hdl.handle.net/11603/40455
dc.language.iso: en
dc.relation.isAvailableAt: The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof: UMBC Data Science
dc.relation.ispartof: UMBC Faculty Collection
dc.rights: Attribution 4.0 International
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.subject: Computer Science - Computer Vision and Pattern Recognition
dc.subject: Computer Science - Computation and Language
dc.title: HazardNet: A Small-Scale Vision Language Model for Real-Time Traffic Safety Detection at Edge Devices
dc.type: Text
dcterms.creator: https://orcid.org/0000-0002-6835-8338

Files

Original bundle

Name: HazardNet.pdf
Size: 519.16 KB
Format: Adobe Portable Document Format