Gazetteer Generation for Neural Named Entity Recognition

Date

2020-05-17

Citation of Original Publication

Song, Chan Hee; Lawrie, Dawn; Finin, Tim; Mayfield, James; Gazetteer Generation for Neural Named Entity Recognition; Proceedings of the 33rd International FLAIRS Conference (2020); https://aaai.org/ocs/index.php/FLAIRS/FLAIRS20/paper/view/18451

Rights

This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.

Abstract

We present a way to generate gazetteers from the Wikidata knowledge graph and use the lists to improve a neural NER system by adding an input feature indicating that a word is part of a name in the gazetteer. We empirically show that the approach yields performance gains in two distinct languages: a high-resource, word-based language, English, and a high-resource, character-based language, Chinese. We also apply the approach to a low-resource language, Russian, using a new annotated Russian NER corpus from Reddit tagged with four core and eleven extended types, and report a baseline score.
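
The gazetteer input feature mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the matching or encoding scheme, so the sketch assumes whitespace tokens, case-insensitive exact matching, and one binary indicator per entity type. The function names (build_index, gazetteer_features), the type labels, and the sample gazetteer entries are all hypothetical. A character-based language such as Chinese would need character-level spans rather than whitespace tokens.

```python
from typing import Dict, List, Set, Tuple

Span = Tuple[str, ...]


def build_index(gazetteer: Dict[str, List[str]]) -> Tuple[Dict[str, Set[Span]], int]:
    """Turn {entity_type: [name, ...]} into lowercased token tuples per type.

    Also returns the length (in tokens) of the longest entry, which bounds
    the span search in gazetteer_features.
    """
    index: Dict[str, Set[Span]] = {}
    max_len = 1
    for etype, names in gazetteer.items():
        entries = {tuple(name.lower().split()) for name in names if name.strip()}
        index[etype] = entries
        for entry in entries:
            max_len = max(max_len, len(entry))
    return index, max_len


def gazetteer_features(
    tokens: List[str], index: Dict[str, Set[Span]], max_len: int
) -> List[List[int]]:
    """Return one binary vector per token, one column per entity type.

    Columns follow the sorted order of the type labels. A cell is 1 when the
    token lies inside any span that exactly matches a gazetteer entry of
    that type (an assumed encoding, chosen for simplicity).
    """
    types = sorted(index)
    lowered = [tok.lower() for tok in tokens]
    feats = [[0] * len(types) for _ in tokens]
    for start in range(len(tokens)):
        longest = min(max_len, len(tokens) - start)
        for length in range(1, longest + 1):
            span = tuple(lowered[start:start + length])
            for col, etype in enumerate(types):
                if span in index[etype]:
                    for i in range(start, start + length):
                        feats[i][col] = 1
    return feats


# Hypothetical gazetteer; in the paper the lists are generated from Wikidata.
sample_gazetteer = {
    "PER": ["Ada Lovelace"],
    "LOC": ["New York", "Baltimore"],
}
sample_index, sample_max_len = build_index(sample_gazetteer)
sample_tokens = "Ada Lovelace moved from New York to Baltimore".split()
sample_feats = gazetteer_features(sample_tokens, sample_index, sample_max_len)

for tok, vec in zip(sample_tokens, sample_feats):
    print(f"{tok}\t{vec}")  # columns: LOC, PER
```

In a neural tagger, each per-token vector would typically be concatenated with the word or character embedding before the sequence encoder; that wiring is likewise an assumption here, not a detail given in the abstract.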