HazardNet: A Small-Scale Vision Language Model for Real-Time Traffic Safety Detection at Edge Devices
| dc.contributor.author | Tami, Mohammad | |
| dc.contributor.author | Abu Elhenawy, Mohammed | |
| dc.contributor.author | Ashqar, Huthaifa | |
| dc.date.accessioned | 2025-10-16T15:27:13Z | |
| dc.date.issued | 2025-02-27 | |
| dc.description.abstract | Traffic safety remains a vital concern in contemporary urban settings, intensified by the growing number of vehicles and the complexity of road networks. Traditional safety-critical event detection systems predominantly rely on sensor-based approaches and conventional machine learning algorithms, necessitating extensive data collection and complex training processes to adhere to traffic safety regulations. This paper introduces HazardNet, a small-scale Vision Language Model designed to enhance traffic safety by leveraging the reasoning capabilities of advanced language and vision models. We built HazardNet by fine-tuning the pre-trained Qwen2-VL-2B model, chosen for its superior performance among open-source alternatives and its compact size of two billion parameters, which facilitates deployment on edge devices with efficient inference throughput. In addition, we present HazardQA, a novel Vision Question Answering (VQA) dataset constructed specifically for training HazardNet on real-world scenarios involving safety-critical events. Our experimental results show that the fine-tuned HazardNet outperforms the base model by up to 89% in F1-score and achieves results comparable to those of larger models such as GPT-4o, with improvements of up to 6% in some cases. These advancements underscore the potential of HazardNet to provide real-time, reliable traffic safety event detection, thereby contributing to reduced accidents and improved traffic management in urban environments. The HazardNet model and the HazardQA dataset are available at https://huggingface.co/Tami3/HazardNet and https://huggingface.co/datasets/Tami3/HazardQA, respectively. | |
| dc.description.uri | http://arxiv.org/abs/2502.20572 | |
| dc.format.extent | 5 pages | |
| dc.genre | journal articles | |
| dc.genre | preprints | |
| dc.identifier | doi:10.13016/m26iwg-txnw | |
| dc.identifier.uri | https://doi.org/10.48550/arXiv.2502.20572 | |
| dc.identifier.uri | http://hdl.handle.net/11603/40455 | |
| dc.language.iso | en | |
| dc.relation.isAvailableAt | The University of Maryland, Baltimore County (UMBC) | |
| dc.relation.ispartof | UMBC Data Science | |
| dc.relation.ispartof | UMBC Faculty Collection | |
| dc.rights | Attribution 4.0 International | |
| dc.rights.uri | https://creativecommons.org/licenses/by/4.0/ | |
| dc.subject | Computer Science - Computer Vision and Pattern Recognition | |
| dc.subject | Computer Science - Computation and Language | |
| dc.title | HazardNet: A Small-Scale Vision Language Model for Real-Time Traffic Safety Detection at Edge Devices | |
| dc.type | Text | |
| dcterms.creator | https://orcid.org/0000-0002-6835-8338 | |
