Cognitive Visual Commonsense Reasoning Using Dynamic Working Memory

dc.contributor.author: Tang, Xuejiao
dc.contributor.author: Huang, Xin
dc.contributor.author: Zhang, Wenbin
dc.contributor.author: Child, Travers B.
dc.contributor.author: Hu, Qiong
dc.contributor.author: Liu, Zhen
dc.contributor.author: Zhang, Ji
dc.date.accessioned: 2021-07-23T21:05:07Z
dc.date.available: 2021-07-23T21:05:07Z
dc.date.issued: 2021-09-27
dc.description: The 23rd International Conference on Big Data Analytics and Knowledge Discovery (DaWaK2021), September 27-30, 2021 - Linz, Austria [en_US]
dc.description.abstract: Visual Commonsense Reasoning (VCR) predicts an answer with corresponding rationale, given a question-image input. VCR is a recently introduced visual scene understanding task with a wide range of applications, including visual question answering, automated vehicle systems, and clinical decision support. Previous approaches to solving the VCR task generally rely on pre-training or exploiting memory with long dependency relationship encoded models. However, these approaches suffer from a lack of generalizability and prior knowledge. In this paper we propose a dynamic working memory based cognitive VCR network, which stores accumulated commonsense between sentences to provide prior knowledge for inference. Extensive experiments show that the proposed model yields significant improvements over existing methods on the benchmark VCR dataset. Moreover, the proposed model provides intuitive interpretation into visual commonsense reasoning. A Python implementation of our mechanism is publicly available at https://github.com/tanjatang/DMVCR [en_US]
dc.format.extent: 12 pages [en_US]
dc.genre: conference papers and proceedings preprints [en_US]
dc.identifier: doi:10.13016/m2vxku-ci8a
dc.identifier.citation: Tang, Xuejiao et al.; Cognitive Visual Commonsense Reasoning Using Dynamic Working Memory; The 23rd International Conference on Big Data Analytics and Knowledge Discovery (DaWaK2021), September 27, 2021. [en_US]
dc.identifier.uri: http://hdl.handle.net/11603/22074
dc.language.iso: en_US [en_US]
dc.relation.isAvailableAt: The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof: UMBC Information Systems Department Collection
dc.relation.ispartof: UMBC Student Collection
dc.rights: This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.
dc.title: Cognitive Visual Commonsense Reasoning Using Dynamic Working Memory [en_US]
dc.type: Text [en_US]

Files

Original bundle

Name: DaWaK21 (1).pdf
Size: 1.03 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 2.56 KB
Format: Item-specific license agreed upon to submission