Enhancing Satellite Object Localization with Dilated Convolutions and Attention-aided Spatial Pooling

dc.contributor.author: Mostafa, Seraj Al Mahmud
dc.contributor.author: Wang, Chenxi
dc.contributor.author: Yue, Jia
dc.contributor.author: Hozumi, Yuta
dc.contributor.author: Wang, Jianwu
dc.date.accessioned: 2025-06-17T14:45:17Z
dc.date.available: 2025-06-17T14:45:17Z
dc.date.issued: 2025-05-08
dc.description: International Conference on Advanced Machine Learning and Data Science (AMLDS) 2025, July 19th to 21st, Tokyo
dc.description.abstract: Object localization in satellite imagery is particularly challenging due to the high variability of objects, low spatial resolution, and interference from noise and dominant features such as clouds and city lights. In this research, we focus on three satellite datasets: upper atmospheric Gravity Waves (GW), mesospheric Bores (Bore), and Ocean Eddies (OE), each presenting its own challenges. These include variability in the scale and appearance of the main object patterns, where the size, shape, and feature extent of objects of interest can differ significantly. To address these challenges, we introduce YOLO-DCAP, an enhanced version of YOLOv5 designed to improve object localization in these complex scenarios. YOLO-DCAP incorporates a Multi-scale Dilated Residual Convolution (MDRC) block, which captures multi-scale features using varying dilation rates, and an Attention-aided Spatial Pooling (AaSP) module, which focuses on globally relevant spatial regions to enhance feature selection. These structural improvements help to better localize objects in satellite imagery. Experimental results demonstrate that YOLO-DCAP significantly outperforms both the YOLO base model and state-of-the-art approaches, achieving an average improvement of 20.95% in mAP50 and 32.23% in IoU over the base model, and 7.35% and 9.84% respectively over state-of-the-art alternatives. These gains are consistent across all three satellite datasets, highlighting the robustness and generalizability of the proposed approach. Our code is open sourced at https://github.com/AI-4-atmosphere-remote-sensing/satellite-object-localization.
dc.description.sponsorship: This work is supported by NSF grant OAC1942714 and NASA grants 80NSSC22K0641 and 80NSSC21M0027. We would like to thank Dr. Jinbo Wang and Benjamin Holt from NASA JPL for providing us with the Ocean Eddy data.
dc.description.uri: http://arxiv.org/abs/2505.05599
dc.format.extent: 9 pages
dc.genre: conference papers and proceedings
dc.genre: postprints
dc.identifier: doi:10.13016/m2locg-upqv
dc.identifier.uri: https://doi.org/10.48550/arXiv.2505.05599
dc.identifier.uri: http://hdl.handle.net/11603/38872
dc.language.iso: en_US
dc.publisher: AMLDS
dc.relation.isAvailableAt: The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof: UMBC Information Systems Department
dc.relation.ispartof: UMBC Center for Real-time Distributed Sensing and Autonomy
dc.relation.ispartof: UMBC Joint Center for Earth Systems Technology (JCET)
dc.relation.ispartof: UMBC Faculty Collection
dc.relation.ispartof: UMBC Center for Accelerated Real Time Analysis
dc.relation.ispartof: UMBC Computer Science and Electrical Engineering Department
dc.relation.ispartof: UMBC GESTAR II
dc.relation.ispartof: UMBC Student Collection
dc.rights: This work was written as part of one of the author's official duties as an Employee of the United States Government and is therefore a work of the United States Government. In accordance with 17 U.S.C. 105, no copyright protection is available for such works under U.S. Law.
dc.rights: Public Domain
dc.rights.uri: https://creativecommons.org/publicdomain/mark/1.0/
dc.subject: UMBC Big Data Analytics Lab
dc.subject: Computer Science - Computer Vision and Pattern Recognition
dc.subject: Computer Science - Artificial Intelligence
dc.title: Enhancing Satellite Object Localization with Dilated Convolutions and Attention-aided Spatial Pooling
dc.type: Text
dcterms.creator: https://orcid.org/0000-0002-9933-1170
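
Illustrative note: the abstract describes an MDRC block that captures multi-scale features by varying the dilation rate. The sketch below is not the authors' implementation (see the repository linked in the abstract for that); it is a minimal 1D, pure-Python illustration of the general idea, assuming a 3-tap kernel, dilation rates 1, 2, and 3, and simple averaging of the branches before a residual add. The function names and these choices are hypothetical and made only for illustration.

```python
def receptive_field(kernel_size, dilation):
    """Number of input positions spanned by one dilated kernel."""
    return (kernel_size - 1) * dilation + 1


def dilated_conv1d(signal, kernel, dilation):
    """Zero-padded, same-length 1D cross-correlation with a dilated kernel."""
    k = len(kernel)
    span = (k - 1) * dilation          # distance between first and last tap
    pad_left = span // 2
    padded = [0.0] * pad_left + list(signal) + [0.0] * (span - pad_left)
    return [
        sum(kernel[j] * padded[i + j * dilation] for j in range(k))
        for i in range(len(signal))
    ]


def multi_scale_residual(signal, kernel, dilations=(1, 2, 3)):
    """Average several dilated branches, then add the input back (residual).

    Assumption: branch averaging and the dilation rates are illustrative
    choices, not taken from the paper.
    """
    branches = [dilated_conv1d(signal, kernel, d) for d in dilations]
    fused = [sum(values) / len(branches) for values in zip(*branches)]
    return [x + f for x, f in zip(signal, fused)]


if __name__ == "__main__":
    impulse = [0, 0, 0, 1, 0, 0, 0]
    kernel = [1, 1, 1]
    for d in (1, 2, 3):
        print(d, receptive_field(len(kernel), d), dilated_conv1d(impulse, kernel, d))
    print(multi_scale_residual(impulse, kernel))
```

With the same three kernel weights, a larger dilation rate spreads the taps further apart, so each branch responds to structure at a different scale without adding parameters. This is the property that makes dilated branches attractive when objects of interest vary widely in size.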

Files

Original bundle

Name: 2505.05599v1.pdf
Size: 8.03 MB
Format: Adobe Portable Document Format