Towards Efficient Deep Learning Models for Facial Expression Recognition using Transformers

Safavi, Farshad; Patel, Kulin; Vinjamuri, Ramana

dc.contributor.author	Safavi, Farshad
dc.contributor.author	Patel, Kulin
dc.contributor.author	Vinjamuri, Ramana
dc.date.accessioned	2024-01-12T13:13:10Z
dc.date.available	2024-01-12T13:13:10Z
dc.date.issued	2023-12-01
dc.description.abstract	Facial expression recognition (FER) is crucial in various healthcare applications, including pain assessment, mental disorder diagnosis, and assistive robots that require close interaction with humans. While heavyweight deep learning models can achieve high accuracy for FER, their computational cost and memory consumption often need optimization for portable and mobile devices. Therefore, efficient deep learning models with high accuracy are essential to enable FER on resource-constrained platforms. This paper presents a new efficient deep-learning model for facial expression recognition. The model utilizes Mix Transformer (MiT) blocks, adopted from the SegFormer architecture, along with a supplemented fusion block. The efficient self-attention mechanism in the transformer focuses on relevant information for classifying different facial expressions while significantly improving efficiency. Furthermore, our supplemented fusion block integrates multiscale feature maps to capture both fine-grained and coarse features. Experimental results demonstrate that the proposed model significantly reduces the computational cost, latency, and the number of learnable parameters while achieving high accuracy compared with the previous state-of-the-art (SOTA) on the FER2013 dataset.
dc.description.sponsorship	This research was supported by the National Science Foundation (CAREER Award HCC-2053498).
dc.description.uri	https://ieeexplore.ieee.org/document/10331041
dc.format.extent	4 pages
dc.genre	journal articles
dc.genre	postprints
dc.identifier.citation	Safavi, Farshad, Kulin Patel, and Ramana Kumar Vinjamuri. “Towards Efficient Deep Learning Models for Facial Expression Recognition Using Transformers.” In 2023 IEEE 19th International Conference on Body Sensor Networks (BSN), 1–4, 2023. https://doi.org/10.1109/BSN58485.2023.10331041.
dc.identifier.uri	https://doi.org/10.1109/BSN58485.2023.10331041
dc.identifier.uri	http://hdl.handle.net/11603/31278
dc.language.iso	en_US
dc.publisher	IEEE
dc.relation.isAvailableAt	The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof	UMBC Computer Science and Electrical Engineering Department Collection
dc.relation.ispartof	UMBC Faculty Collection
dc.relation.ispartof	UMBC Student Collection
dc.rights	© 2023 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
dc.title	Towards Efficient Deep Learning Models for Facial Expression Recognition using Transformers
dc.type	Text
dcterms.creator	https://orcid.org/0000-0003-1650-5524

Files

Original bundle

Name: EfficientMethodsFER_8_14_2023.pdf
Size: 447.74 KB
Format: Adobe Portable Document Format
