TDLR: Top (Semantic)-Down (Syntactic) Language Representation
Date: 2022
Abstract
Language understanding involves processing text with both the grammatical and common-sense contexts of the text fragments. The text “I went to the grocery store and brought home a car” requires both the grammatical context (syntactic) and the common-sense context (semantic) to capture the oddity in the sentence. Contextualized text representations learned by Language Models (LMs) are expected to capture a variety of syntactic and semantic contexts from large training corpora. Recent work such as ERNIE has shown that infusing knowledge contexts, where they are available, into LMs results in significant performance gains on General Language Understanding Evaluation (GLUE) benchmark tasks. However, to our knowledge, no knowledge-aware model has attempted to infuse knowledge through top-down, semantics-driven syntactic processing (e.g., common-sense to grammatical) while operating directly on the attention mechanism that LMs leverage to learn the data context. We propose Top-Down Language Representation (TDLR), a learning framework that infuses common-sense semantics into LMs. In our implementation, we build on BERT for its rich syntactic knowledge and use the knowledge graphs ConceptNet and WordNet to infuse semantic knowledge.
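
The abstract describes operating directly on the attention mechanism to inject knowledge-graph semantics. The sketch below is a rough illustration of that general idea, not TDLR's actual method: it biases standard scaled dot-product attention with an additive token-pair relatedness term. The RELATEDNESS table, the semantic_bias helper, and the additive formulation are hypothetical stand-ins for whatever scores a ConceptNet/WordNet pipeline would produce.

# A minimal sketch (not the authors' implementation) of biasing an LM's
# attention with common-sense relatedness scores from a knowledge graph.
import math
import torch

# Hypothetical token-pair relatedness, e.g. distilled from ConceptNet/WordNet.
RELATEDNESS = {("grocery", "store"): 0.9, ("grocery", "car"): 0.1}

def semantic_bias(tokens: list[str]) -> torch.Tensor:
    """Build an (n, n) bias matrix from pairwise relatedness scores."""
    n = len(tokens)
    bias = torch.zeros(n, n)
    for i, a in enumerate(tokens):
        for j, b in enumerate(tokens):
            # Look up the pair in either order; default to no bias.
            bias[i, j] = RELATEDNESS.get((a, b)) or RELATEDNESS.get((b, a)) or 0.0
    return bias

def knowledge_biased_attention(q, k, v, bias, alpha=1.0):
    """Scaled dot-product attention with an additive semantic bias term."""
    d = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d) + alpha * bias
    return torch.softmax(scores, dim=-1) @ v

tokens = ["grocery", "store", "car"]
x = torch.randn(len(tokens), 16)  # toy token embeddings
out = knowledge_biased_attention(x, x, x, semantic_bias(tokens))
print(out.shape)  # torch.Size([3, 16])

An additive bias before the softmax is only one of several plausible injection points; a full system in the spirit of the abstract would derive the bias from graph embeddings and learn the mixing weight alpha rather than fix it.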