Jointly Identifying and Fixing Inconsistent Readings from Information Extraction Systems
Citation of Original Publication
Ankur Padia, Francis Ferraro, and Tim Finin, Jointly Identifying and Fixing Inconsistent Readings from Information Extraction Systems, 3rd Workshop on Knowledge Extraction and Integration for Deep Learning Architectures, 60th Annual Meeting of the Association for Computational Linguistics, May 2022. http://dx.doi.org/10.18653/v1/2022.deelio-1.5
Rights
This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.
Abstract
Information extraction systems analyze text to produce entities and beliefs, but their output often contains errors. In this paper, we analyze the reading consistency of extracted facts with respect to the text from which they were derived and show how to detect and correct errors. We consider two scenarios: one in which the provenance text is found automatically by an information extraction system, and one in which it is curated by humans. We contrast consistency with credibility; define and explore consistency and repair tasks; and demonstrate a simple yet effective and generalizable model. We analyze these tasks and evaluate the approach on three datasets. Against a strong baseline model, a simple MLP model with attention and lexical features consistently improves both consistency and repair on all three.
