Towards Explainable Machine Learning Models for Remote Sensing: Multi-modal and Uni-modal Applications for Natural Disaster

dc.contributor.advisor: Gangopadhyay, Dr. Aryya
dc.contributor.advisor: Rahnemoonfar, Dr. Maryam
dc.contributor.author: Sarkar, Argho
dc.contributor.department: Information Systems
dc.contributor.program: Information Systems
dc.date.accessioned: 2024-08-09T17:12:19Z
dc.date.available: 2024-08-09T17:12:19Z
dc.date.issued: 2024-01-01
dc.description.abstract: Natural disasters leave a path of devastation that must be managed efficiently to minimize the impact on human lives. Estimating the damage and acting on those assessments are the two most important steps in post-disaster management. With recent progress in Artificial Intelligence (AI), many machine learning algorithms are being utilized to assess damage. However, existing methods are inefficient and provide limited scene information. This thesis proposes a vision-language-based multi-modal task, namely Visual Question Answering (VQA), for efficient and comprehensive damage assessment. VQA enables the extraction of diverse information from images through natural language queries. This high-level scene information has the potential to optimize decision support systems, increasing efficiency and reducing the time required for search and rescue operations. On the other hand, when machine learning models are incorporated into smart decision support systems, the explainability of model outcomes becomes significant. In remote sensing, visual content is complex, and the available contextual information is often limited relative to the overall size of the images. In such scenarios, model performance may be susceptible to shortcut learning, leading to misleading results; ensuring proper explanations for model outputs therefore becomes crucial. Motivated by these issues, this thesis addresses two crucial aspects of remote sensing applications. First, it develops an image-based question-answering framework for efficient damage assessment on remote sensing imagery. Second, it enhances the trustworthiness of model outcomes through novel machine learning frameworks designed for remote sensing applications in both multi-modal and uni-modal contexts.
To achieve the first goal, two unique large-scale benchmark visual question-answering datasets for damage assessment, namely FloodNet-VQA and RescueNet-VQA, are proposed. For proper visual explanation of model outcomes, this thesis proposes novel supervised attention modules for VQA. These modules provide auxiliary supervision during the attention-obtaining process so that the model learns where to focus in the image content, given a question, in order to provide a rational answer. The proposed approaches show improved explanations and achieve higher accuracy than state-of-the-art VQA algorithms. Finally, for consistent and robust visual explanations in a uni-modal remote sensing task (e.g., image classification), a novel strategy is proposed in which two distinct losses ensure consistency and robustness in the visual explanations.
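The supervised-attention idea described above can be sketched as a composite training objective: the usual answer loss plus an auxiliary term that supervises where the model attends. The function name, the use of mean squared error for the supervision term, and the weighting factor below are illustrative assumptions for a minimal sketch, not the thesis's exact formulation:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def vqa_loss_with_attention_supervision(answer_logits, answer_label,
                                        attn_map, attn_target, lam=0.5):
    """Total loss = answer cross-entropy + lam * attention-supervision term.

    answer_logits: model scores over candidate answers
    answer_label:  index of the correct answer
    attn_map:      model's attention distribution over image regions
    attn_target:   ground-truth relevance mask over the same regions
    The auxiliary term (here, mean squared error between the two
    distributions) nudges the model to attend to the image regions
    that actually contain the evidence for the answer.
    """
    probs = softmax(answer_logits)
    ce = -np.log(probs[answer_label] + 1e-12)           # answer loss
    attn_loss = np.mean((attn_map - attn_target) ** 2)  # auxiliary supervision
    return ce + lam * attn_loss
```

When the predicted attention matches the target exactly, the auxiliary term vanishes and only the answer loss remains, which is the sense in which the supervision guides (rather than replaces) answer learning.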
dc.format: application/pdf
dc.genre: dissertation
dc.identifier: doi:10.13016/m2iwje-bmsu
dc.identifier.other: 12879
dc.identifier.uri: http://hdl.handle.net/11603/35319
dc.language: en
dc.relation.isAvailableAt: The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof: UMBC Information Systems Department Collection
dc.relation.ispartof: UMBC Theses and Dissertations Collection
dc.relation.ispartof: UMBC Graduate School Collection
dc.relation.ispartof: UMBC Student Collection
dc.rights: This item may be protected under Title 17 of the U.S. Copyright Law. It is made available by UMBC for non-commercial research and education. For permission to publish or reproduce, please see http://aok.lib.umbc.edu/specoll/repro.php or contact Special Collections at speccoll(at)umbc.edu
dc.source: Original File Name: Sarkar_umbc_0434D_12879.pdf
dc.subject: Computer Vision
dc.subject: Damage Assessment
dc.subject: Explainability
dc.subject: Multi-modal Machine Learning
dc.subject: Remote Sensing
dc.subject: Visual Question Answering
dc.title: Towards Explainable Machine Learning Models for Remote Sensing: Multi-modal and Uni-modal Applications for Natural Disaster
dc.type: Text
dcterms.accessRights: Distribution Rights granted to UMBC by the author.

Files

Original bundle

Name: Sarkar_umbc_0434D_12879.pdf
Size: 47.18 MB
Format: Adobe Portable Document Format

License bundle

Name: Sarkar-ArghOpen.pdf
Size: 340.18 KB
Format: Adobe Portable Document Format