Comparative Analysis of SoftMax Vs. GMM for Semi-supervised Deep Learning

Date

2022-01-01

Department

Computer Science and Electrical Engineering

Program

Computer Science

Rights

Distribution Rights granted to UMBC by the author.
This item may be protected under Title 17 of the U.S. Copyright Law. It is made available by UMBC for non-commercial research and education. For permission to publish or reproduce, please see http://aok.lib.umbc.edu/specoll/repro.php or contact Special Collections at speccoll(at)umbc.edu

Abstract

This paper presents a new pseudo-labeling approach that uses a multivariate Gaussian Mixture Model (GMM) to learn the latent-feature-space distributions of labeled samples from a deep neural network. These learned Gaussian distributions are then used to predict labels for unlabeled samples. Unlike most studies, which rely solely on SoftMax-like classification for pseudo-labeling, our method fits Gaussian clusters to the latent feature representations and generates a pseudo-label for each unlabeled data point from the probability that its latent feature vector belongs to a particular class's Gaussian cluster. The proposed approach is compared with the standard baseline, the traditional use of SoftMax to predict labels from logits. Empirical results show competitive performance against the baseline, especially when labeled samples are scarce. Additionally, this study reveals that the GMM interprets embedded feature-space distributions from only a handful of labeled data points better than SoftMax does.
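The core idea described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the latent features are simulated here (in the paper they would come from a trained deep neural network's embedding layer), a single Gaussian is fitted per class, and the simple maximum-density assignment rule is an assumption standing in for whatever thresholding or probability criterion the paper actually uses.

```python
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(0)

# Simulated latent features for two classes; in the paper these would be
# embeddings of the labeled samples produced by a deep neural network.
labeled_feats = np.vstack([
    rng.normal(loc=-2.0, scale=0.5, size=(20, 2)),  # class 0 cluster
    rng.normal(loc=+2.0, scale=0.5, size=(20, 2)),  # class 1 cluster
])
labels = np.array([0] * 20 + [1] * 20)

# Fit one multivariate Gaussian per class from the labeled samples.
class_gaussians = {}
for c in np.unique(labels):
    feats_c = labeled_feats[labels == c]
    mean = feats_c.mean(axis=0)
    # Small diagonal term keeps the covariance positive definite
    # even with very few labeled samples.
    cov = np.cov(feats_c, rowvar=False) + 1e-6 * np.eye(feats_c.shape[1])
    class_gaussians[c] = multivariate_normal(mean=mean, cov=cov)

def pseudo_label(feat):
    """Assign the class whose Gaussian gives the latent vector
    the highest probability density."""
    densities = {c: g.pdf(feat) for c, g in class_gaussians.items()}
    return max(densities, key=densities.get)

# Pseudo-label "unlabeled" points lying near each cluster.
print(pseudo_label(np.array([-2.1, -1.9])))  # near class 0 -> 0
print(pseudo_label(np.array([2.0, 2.2])))    # near class 1 -> 1
```

Because each class is modeled by its own density rather than by a decision boundary trained on logits, the assignment above degrades more gracefully when the labeled set is tiny, which is the regime where the abstract reports the GMM outperforming SoftMax.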