Deep Learning Approaches for Cloud Property Retrieval: Leveraging Geospatial Foundation Models and Multitask Frameworks

Rights

This work was written as part of one of the author's official duties as an Employee of the United States Government and is therefore a work of the United States Government. In accordance with 17 U.S.C. 105, no copyright protection is available for such works under U.S. Law.
Public Domain

Abstract

With the rapid growth of Earth-observation datasets, geospatial foundation models (FMs) offer a scalable way to learn transferable features across diverse satellite sensors. Their ability to adapt across sensors, however, remains underexplored. To study this question, we present a benchmarking study of SatVision-TOA, an FM pre-trained on over 20 years of MODIS data, adapted to the NOAA GOES ABI sensor for four downstream cloud-property tasks: cloud mask and cloud phase (segmentation), and cloud optical depth (COD) and cloud particle size (CPS) (regression). We propose a multi-task fine-tuning pipeline with a U-Net-based decoder and a lightweight preprocessor that handles the band mismatch between sensors (14 MODIS bands at pre-training vs. 16 ABI bands at fine-tuning). To evaluate the pipeline, we benchmark fine-tuned models against from-scratch baselines, compare full fine-tuning (FFT) with parameter-efficient fine-tuning (PEFT) methods (LoRA, VPT), and contrast 14-band with 16-band inputs. Our experiments show that multi-task learning improves both efficiency and predictive quality in the fine-tuned and from-scratch settings. For the remaining comparisons (fine-tuning vs. from-scratch, FFT vs. PEFT, 14-band vs. 16-band inputs, and choice of loss function), results are mixed: no single setup performs best across all segmentation and regression tasks.
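The abstract does not specify the preprocessor's internals. One minimal way such a band-mismatch adapter could work is a learned per-pixel linear mixing of channels (equivalent to a 1x1 convolution) that maps the 16 ABI bands onto the 14-channel input the MODIS-pre-trained encoder expects. The sketch below uses NumPy; the weights, shapes, and function name are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def band_adapter(x, W, b):
    """Hypothetical band-mismatch preprocessor: a per-pixel linear mix
    of input bands (equivalent to a 1x1 convolution).

    x: (bands_in, H, W) image stack, e.g. 16 ABI bands
    W: (bands_out, bands_in) mixing matrix, e.g. (14, 16)
    b: (bands_out,) bias
    returns: (bands_out, H, W) stack in the encoder's expected channel count
    """
    return np.einsum('oc,chw->ohw', W, x) + b[:, None, None]

rng = np.random.default_rng(0)
abi_patch = rng.normal(size=(16, 64, 64))   # 16 ABI bands (illustrative patch)
W = rng.normal(size=(14, 16)) * 0.1         # stand-in for learned weights
b = np.zeros(14)

modis_like = band_adapter(abi_patch, W, b)  # 14 "MODIS-like" channels
print(modis_like.shape)                     # (14, 64, 64)
```

In a real pipeline the matrix `W` would be trained jointly with the decoder, so the adapter learns which ABI bands best approximate each MODIS channel rather than using a fixed spectral mapping.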