Person Re-Identification using Vision Transformer with Auxiliary Tokens

dc.contributor.advisor: Chapman, David
dc.contributor.author: Sharma, Charu
dc.contributor.department: Computer Science and Electrical Engineering
dc.contributor.program: Computer Science
dc.date.accessioned: 2022-09-29T15:37:44Z
dc.date.available: 2022-09-29T15:37:44Z
dc.date.issued: 2021-01-01
dc.description.abstract: Person re-identification (re-ID) is an object re-ID problem that aims to re-identify a person by associating images of that person captured by multiple cameras. Because re-ID is foundational to computer-vision-based video surveillance applications, it is vital to generate a robust feature embedding to represent a person. CNN-based methods are known for their feature-learning abilities and were for many years the prime choice for person re-ID. In this thesis, we explore a method that uses both the auxiliary local tokens and the global tokens of the Vision Transformer to generate the final feature embedding. We also propose a novel blockwise fine-tuning technique that improves the performance of the Vision Transformer. Our model trained with blockwise fine-tuning achieves $96.6$ rank-1 accuracy and a $90.3$ mAP score on the Market-1501 dataset. On the CUHK-03 dataset, it achieves $97.5$ rank-1 accuracy and a $95.03$ mAP score. These results are comparable to those of many recently published methods for this problem.
dc.format: application:pdf
dc.genre: theses
dc.identifier: doi:10.13016/m2m12t-wohl
dc.identifier.other: 12389
dc.identifier.uri: http://hdl.handle.net/11603/25960
dc.language: en
dc.relation.isAvailableAt: The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof: UMBC Computer Science and Electrical Engineering Department Collection
dc.relation.ispartof: UMBC Theses and Dissertations Collection
dc.relation.ispartof: UMBC Graduate School Collection
dc.relation.ispartof: UMBC Student Collection
dc.rights: This item may be protected under Title 17 of the U.S. Copyright Law. It is made available by UMBC for non-commercial research and education. For permission to publish or reproduce, please see http://aok.lib.umbc.edu/specoll/repro.php or contact Special Collections at speccoll(at)umbc.edu
dc.source: Original File Name: Sharma_umbc_0434M_12389.pdf
dc.subject: Computer Vision
dc.subject: Multi-camera tracking
dc.subject: Pattern Recognition
dc.subject: Person Re-identification
dc.title: Person Re-Identification using Vision Transformer with Auxiliary Tokens
dc.type: Text
dcterms.accessRights: Access limited to the UMBC community. Item may possibly be obtained via Interlibrary Loan through a local library, pending author/copyright holder's permission.

Files

Original bundle

Name: Sharma_umbc_0434M_12389.pdf
Size: 2.49 MB
Format: Adobe Portable Document Format

Name: SharmaThesis_final.pdf
Size: 1.69 MB
Format: Adobe Portable Document Format