Unsupervised Domain Adaptation for Action Recognition via Self-Ensembling and Conditional Embedding Alignment

dc.contributor.author: Ghosh, Indrajeet
dc.contributor.author: Chugh, Garvit
dc.contributor.author: Faridee, Abu Zaher Md
dc.contributor.author: Roy, Nirmalya
dc.date.accessioned: 2024-12-11T17:02:40Z
dc.date.available: 2024-12-11T17:02:40Z
dc.date.issued: 2024-10-23
dc.description.abstract: Recent advancements in deep learning-based wearable human action recognition (wHAR) have improved the capture and classification of complex motions, but adoption remains limited due to the lack of expert annotations and domain discrepancies arising from user variations. Limited annotations hinder the model's ability to generalize to out-of-distribution samples. While data augmentation can improve generalizability, unsupervised augmentation techniques must be applied carefully to avoid introducing noise. Unsupervised domain adaptation (UDA) addresses domain discrepancies by aligning conditional distributions without labeled target samples, but vanilla pseudo-labeling can lead to error propagation. To address these challenges, we propose μDAR, a novel joint optimization architecture comprising three functions: (i) a consistency regularizer between augmented samples to improve model classification generalizability, (ii) a temporal ensemble for robust pseudo-label generation, and (iii) conditional distribution alignment to improve domain generalizability. The temporal ensemble aggregates predictions from past epochs to smooth out noisy pseudo-label predictions, which are then used in the conditional distribution alignment module to minimize the kernel-based class-wise conditional maximum mean discrepancy (kCMMD) between the source and target feature spaces and learn a domain-invariant embedding. The consistency-regularized augmentations ensure that multiple augmentations of the same sample share the same label; this results in (a) strong generalization with limited source domain samples and (b) consistent pseudo-label generation on target samples. The novel integration of these three modules in μDAR yields a ≈4-12% average macro-F1 score improvement over six state-of-the-art UDA methods on four benchmark wHAR datasets.
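
The two mechanisms the abstract names, temporal ensembling of pseudo-labels and class-wise conditional MMD alignment, can be illustrated concretely. The following is a minimal NumPy sketch, not the authors' implementation: the update rule follows Laine & Aila-style temporal ensembling, and the function names, smoothing factor alpha, and RBF bandwidth gamma are illustrative assumptions.

import numpy as np

def temporal_ensemble_update(Z, probs, epoch, alpha=0.6):
    # Exponential moving average of per-sample class probabilities across
    # epochs; z_hat corrects the startup bias of the running average
    # (epoch is 0-indexed, so the first call returns probs unchanged).
    Z = alpha * Z + (1.0 - alpha) * probs
    z_hat = Z / (1.0 - alpha ** (epoch + 1))
    return Z, z_hat.argmax(axis=1)  # new ensemble state, smoothed pseudo-labels

def rbf_kernel(x, y, gamma=1.0):
    # Gaussian (RBF) kernel matrix between two sets of row vectors.
    d2 = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def class_conditional_mmd(src_x, src_y, tgt_x, tgt_pseudo, n_classes, gamma=1.0):
    # Squared MMD between source and target features, computed per class
    # (true labels on the source side, pseudo-labels on the target side)
    # and averaged over the classes present in both domains.
    total, used = 0.0, 0
    for c in range(n_classes):
        s, t = src_x[src_y == c], tgt_x[tgt_pseudo == c]
        if len(s) == 0 or len(t) == 0:
            continue  # class missing on one side of this batch
        total += (rbf_kernel(s, s, gamma).mean()
                  + rbf_kernel(t, t, gamma).mean()
                  - 2.0 * rbf_kernel(s, t, gamma).mean())
        used += 1
    return total / max(used, 1)

In a joint objective of the kind the abstract describes, the smoothed pseudo-labels returned by the first function would determine which target rows enter each per-class MMD estimate in the second.
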
dc.description.sponsorship: This work has been partially supported by NSF CAREER Award #1750936, ONR Grant #N00014-23-1-2119, U.S. Army Grant #W911NF2120076 and NSF CNS EAGER Grant #2233879.
dc.description.uri: http://arxiv.org/abs/2410.17489
dc.format.extent: 6 pages
dc.genre: journal articles
dc.genre: preprints
dc.identifier: doi:10.13016/m2yqmf-ceqo
dc.identifier.uri: https://doi.org/10.48550/arXiv.2410.17489
dc.identifier.uri: http://hdl.handle.net/11603/37092
dc.language.iso: en_US
dc.relation.isAvailableAt: The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof: UMBC Student Collection
dc.relation.ispartof: UMBC Information Systems Department
dc.relation.ispartof: UMBC Center for Real-time Distributed Sensing and Autonomy
dc.relation.ispartof: UMBC Faculty Collection
dc.rights: Attribution 4.0 International CC BY 4.0
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.subject: Computer Science - Artificial Intelligence
dc.subject: Computer Science - Computer Vision and Pattern Recognition
dc.subject: UMBC Mobile, Pervasive and Sensor Computing Lab (MPSC Lab)
dc.title: Unsupervised Domain Adaptation for Action Recognition via Self-Ensembling and Conditional Embedding Alignment
dc.type: Text
dcterms.creator: https://orcid.org/0000-0003-2868-3766

Files

Name: 2410.17489v1.pdf
Size: 3.26 MB
Format: Adobe Portable Document Format