Vision beyond Pixels: Visual Reasoning via Blocksworld Abstractions
| dc.contributor.author | Gokhale, Tejas | |
| dc.date.accessioned | 2025-06-05T14:03:18Z | |
| dc.date.available | 2025-06-05T14:03:18Z | |
| dc.date.issued | 2019 | |
| dc.description | Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence | |
| dc.description.abstract | Deep neural networks trained in an end-to-end fashion have brought about exceptional advances in computer vision, especially in computational perception. We go beyond perception and seek to enable vision modules to reason about perceived visual entities such as scenes, objects and actions. We introduce a challenging visual reasoning task, Image-Based Event Sequencing (IES), and compile the first IES dataset, the Blocksworld Image Reasoning Dataset (BIRD). Motivated by the blocksworld concept, we propose a modular approach supported by literature in cognitive psychology and children’s development. We decompose the problem into two stages, visual perception and event sequencing, and show that our approach can be extended to natural images without re-training. | |
| dc.description.uri | https://www.ijcai.org/proceedings/2019/907 | |
| dc.format.extent | 2 pages | |
| dc.genre | conference papers and proceedings | |
| dc.identifier | doi:10.13016/m2vosg-rxl8 | |
| dc.identifier.citation | Gokhale, Tejas. “Vision beyond Pixels: Visual Reasoning via Blocksworld Abstractions,” IJCAI, 2019, 6436–37. https://www.ijcai.org/proceedings/2019/907 | |
| dc.identifier.uri | https://doi.org/10.24963/ijcai.2019/907 | |
| dc.identifier.uri | http://hdl.handle.net/11603/38686 | |
| dc.language.iso | en_US | |
| dc.publisher | IJCAI | |
| dc.relation.isAvailableAt | The University of Maryland, Baltimore County (UMBC) | |
| dc.relation.ispartof | UMBC Computer Science and Electrical Engineering Department | |
| dc.rights | This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author. | |
| dc.title | Vision beyond Pixels: Visual Reasoning via Blocksworld Abstractions | |
| dc.type | Text | |
| dcterms.creator | https://orcid.org/0000-0002-5593-2804 | |
