Towards Efficient Deep Learning Models for Facial Expression Recognition using Transformers

dc.contributor.author: Safavi, Farshad
dc.contributor.author: Patel, Kulin
dc.contributor.author: Vinjamuri, Ramana
dc.date.accessioned: 2024-01-12T13:13:10Z
dc.date.available: 2024-01-12T13:13:10Z
dc.date.issued: 2023-12-01
dc.description.abstract: Facial expression recognition (FER) is crucial in various healthcare applications, including pain assessment, mental disorder diagnosis, and assistive robots that require close interaction with humans. While heavyweight deep learning models can achieve high accuracy for FER, their computational cost and memory consumption often need optimization for portable and mobile devices. Therefore, efficient deep learning models with high accuracy are essential to enable FER on resource-constrained platforms. This paper presents a new efficient deep-learning model for facial expression recognition. The model utilizes Mix Transformer (MiT) blocks, adopted from the SegFormer architecture, along with a supplemented fusion block. The efficient self-attention mechanism in the transformer focuses on relevant information for classifying different facial expressions while significantly improving efficiency. Furthermore, our supplemented fusion block integrates multiscale feature maps to capture both fine-grained and coarse features. Experimental results demonstrate that the proposed model significantly reduces the computational cost, latency, and the number of learnable parameters while achieving high accuracy compared with the previous state-of-the-art (SOTA) on the FER2013 dataset.
dc.description.sponsorship: This research was supported by the National Science Foundation (CAREER Award HCC-2053498).
dc.description.uri: https://ieeexplore.ieee.org/document/10331041
dc.format.extent: 4 pages
dc.genre: journal articles
dc.genre: postprints
dc.identifier.citation: Safavi, Farshad, Kulin Patel, and Ramana Kumar Vinjamuri. “Towards Efficient Deep Learning Models for Facial Expression Recognition Using Transformers.” In 2023 IEEE 19th International Conference on Body Sensor Networks (BSN), 1–4, 2023. https://doi.org/10.1109/BSN58485.2023.10331041.
dc.identifier.uri: https://doi.org/10.1109/BSN58485.2023.10331041
dc.identifier.uri: http://hdl.handle.net/11603/31278
dc.language.iso: en_US
dc.publisher: IEEE
dc.relation.isAvailableAt: The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof: UMBC Computer Science and Electrical Engineering Department Collection
dc.relation.ispartof: UMBC Faculty Collection
dc.relation.ispartof: UMBC Student Collection
dc.rights: © 2023 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
dc.title: Towards Efficient Deep Learning Models for Facial Expression Recognition using Transformers
dc.type: Text
dcterms.creator: https://orcid.org/0000-0003-1650-5524

Files

Original bundle

Name: EfficientMethodsFER_8_14_2023.pdf
Size: 447.74 KB
Format: Adobe Portable Document Format
