Grounding Stylistic Domain Generalization with Quantitative Domain Shift Measures and Synthetic Scene Images

dc.contributor.author: Luo, Yiran
dc.contributor.author: Feinglass, Joshua
dc.contributor.author: Gokhale, Tejas
dc.contributor.author: Lee, Kuan-Cheng
dc.contributor.author: Baral, Chitta
dc.contributor.author: Yang, Yezhou
dc.date.accessioned: 2024-07-12T14:57:25Z
dc.date.available: 2024-07-12T14:57:25Z
dc.date.issued: 2024-05-24
dc.description: 3rd CVPR Workshop on Vision Datasets Understanding, 2024
dc.description.abstract: Domain Generalization (DG) is a challenging task in machine learning that requires a coherent ability to comprehend shifts across various domains through extraction of domain-invariant features. DG performance is typically evaluated by performing image classification in domains of various image styles. However, current methodology lacks a quantitative understanding of shifts in stylistic domain, and relies on a vast amount of pre-training data, such as ImageNet1K, which are predominantly in photorealistic style with weakly supervised class labels. Such a data-driven practice could potentially result in spurious correlation and inflated performance on DG benchmarks. In this paper, we introduce a new 3-part DG paradigm to address these risks. We first introduce two new quantitative measures, ICV and IDD, to describe domain shifts in terms of consistency of classes within one domain and similarity between two stylistic domains. We then present SuperMarioDomains (SMD), a novel synthetic multi-domain dataset sampled from video game scenes with more consistent classes and sufficient dissimilarity compared to ImageNet1K. Finally, we demonstrate our DG method SMOS, which uses SMD to first train a precursor model that is then used to ground the training on a DG benchmark. We observe that SMOS+SMD altogether contributes to state-of-the-art performance across five DG benchmarks, gaining large improvements in performance on abstract domains along with on-par or slight improvements on photorealistic domains. Our qualitative analysis suggests that these improvements can be attributed to reduced distributional divergence between originally distant domains. Our data are available at https://github.com/fpsluozi/SMD-SMOS.
dc.description.sponsorship: The authors acknowledge Research Computing at Arizona State University for providing HPC resources and support for this work. This work was supported by NSF RI grants #1750082 and #2132724. The views and opinions of the authors expressed herein do not necessarily state or reflect those of the funding agencies and employers.
dc.description.uri: http://arxiv.org/abs/2405.15961
dc.format.extent: 11 pages
dc.genre: conference papers and proceedings
dc.genre: postprints
dc.identifier: doi:10.13016/m2exdj-rr5k
dc.identifier.uri: http://hdl.handle.net/11603/34886
dc.language.iso: en_US
dc.relation.isAvailableAt: The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof: UMBC Faculty Collection
dc.relation.ispartof: UMBC Computer Science and Electrical Engineering Department
dc.rights: ATTRIBUTION-NONCOMMERCIAL-NODERIVS 4.0 INTERNATIONAL
dc.rights.uri: https://creativecommons.org/licenses/by-nc-nd/4.0/
dc.subject: Computer Science - Computer Vision and Pattern Recognition
dc.title: Grounding Stylistic Domain Generalization with Quantitative Domain Shift Measures and Synthetic Scene Images
dc.type: Text
dcterms.creator: https://orcid.org/0000-0002-5593-2804

Files

Original bundle

Name: 2405.15961v1.pdf
Size: 2.62 MB
Format: Adobe Portable Document Format