A Semantically Rich Framework for Knowledge Representation of Code of Federal Regulations
Files
Links to Files
Permanent Link
Author/Creator
Author/Creator ORCID
Date
Type of Work
Department
Program
Citation of Original Publication
Karuna pande Joshi and Srishty Saha. 2020. A Semantically Rich Framework for Knowledge Representation of Code of Federal Regulations. Digit. Gov.: Res. Pract. 1, 3, Article 21 (November 2020), 17 pages. https://doi.org/10.1145/3425192
Rights
This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.
Attribution 4.0 International
Attribution 4.0 International
Subjects
Abstract
Federal government agencies and organizations doing business with them have to adhere to the Code of Federal Regulations (CFR). The CFRs are currently available as large text documents that are not machine processable and so require extensive manual effort to parse and comprehend, especially when sections cross-reference topics spread across various titles. We have developed a novel framework to automatically extract knowledge from CFRs and represent it using a semantically rich knowledge graph. The framework captures knowledge in the form of key terms, rules, topic summaries, relationships between various terms, semantically similar terminologies, deontic expressions, and cross-referenced facts and rules. We built our framework using deep learning technologies like TensorFlow for word embeddings and text summarization, Gensim for topic modeling, and Semantic Web technologies for building the knowledge graph. In this article, we describe our framework in detail and present the results of our analysis of the Title 48 CFR knowledge base that we have built using this framework. Our framework and knowledge graph can be adopted by federal agencies and businesses to automate their internal processes that reference the CFR rules and policies.
