Locally Aware Transformer for Person Re-Identification

Date

2021-01-01

Department

Computer Science and Electrical Engineering

Program

Computer Science

Rights

This item may be protected under Title 17 of the U.S. Copyright Law. It is made available by UMBC for non-commercial research and education. For permission to publish or reproduce, please see http://aok.lib.umbc.edu/specoll/repro.php or contact Special Collections at speccoll(at)umbc.edu
Distribution Rights granted to UMBC by the author.
Access limited to the UMBC community. Item may possibly be obtained via Interlibrary Loan through a local library, pending author/copyright holder's permission.

Abstract

Person Re-Identification is an important problem in computer vision-based surveillance applications, in which the goal is to identify the same person across surveillance photographs taken in a variety of nearby zones. At present, the majority of Person re-ID techniques are based on Convolutional Neural Networks (CNNs), but Vision Transformers are beginning to displace pure CNNs for a variety of object recognition tasks. The primary output of a vision transformer is a global classification token, but vision transformers also yield local tokens which contain additional information about local regions of the image. Techniques to make use of these local tokens to improve classification accuracy are an active area of research. We propose a novel Locally Aware Transformer (LA-Transformer) that employs a Parts-based Convolution Baseline (PCB)-inspired strategy for aggregating globally enhanced local classification tokens into an ensemble of N classifiers, where N is the number of patches. LA-Transformer achieves rank-1 accuracy of 98.27% with standard deviation of 0.13 on the Market-1501 dataset and 98.7% with standard deviation of 0.2 on the CUHK03 dataset, outperforming all other state-of-the-art methods.
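
The abstract's core idea, blending the global classification token into each local patch token and attaching a per-patch classifier head, can be sketched as follows. This is a minimal illustrative sketch, not the thesis implementation: the shapes, the blending weight `lam`, the random linear heads, and the inference-time averaging are all assumptions made for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed shapes: N local patch tokens, embedding dim D, C identity classes
# (Market-1501 has 751 training identities; N and lam are illustrative).
N, D, C = 14, 768, 751
lam = 0.5  # assumed blending weight between global and local tokens

cls_token = rng.standard_normal(D)           # global classification token
local_tokens = rng.standard_normal((N, D))   # N local patch tokens

# "Globally enhanced" local tokens: mix each local token with the global one
# (broadcasts the global token across all N rows).
enhanced = lam * local_tokens + (1.0 - lam) * cls_token   # shape (N, D)

# PCB-style ensemble: one independent linear classifier head per patch token.
heads = rng.standard_normal((N, D, C)) * 0.02
logits = np.einsum("nd,ndc->nc", enhanced, heads)         # shape (N, C)

# One simple way to combine the ensemble at inference: average the N heads.
avg_logits = logits.mean(axis=0)                          # shape (C,)
pred = int(avg_logits.argmax())
print(enhanced.shape, logits.shape, avg_logits.shape)
```

During training, each of the N heads would typically receive its own classification loss; at test time the enhanced tokens themselves (rather than the logits) are usually concatenated into a descriptor for retrieval, but the averaging above keeps the sketch self-contained.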