A SEMANTICALLY RICH FRAMEWORK TO ENABLE REAL-TIME KNOWLEDGE EXTRACTION AND CLASSIFICATION FROM SHORT LENGTH SEMI-STRUCTURED DOCUMENTS

dc.contributor.advisor: Joshi, Karuna Pande
dc.contributor.author: Elluri, Lavanya
dc.contributor.department: Information Systems
dc.contributor.program: Information Systems
dc.date.accessioned: 2022-09-29T15:38:07Z
dc.date.available: 2022-09-29T15:38:07Z
dc.date.issued: 2021-01-01
dc.description.abstract: Regulatory bodies hold power or control over a domain that they monitor and administer. To ensure the smooth and secure operation of that domain, authorities formulate policies and rules with which the organizations and individuals operating in it must comply. Knowledge about an authority's policies and rules is typically maintained as a large volume of unstructured text in books, laws and regulations, academic and scientific reports, etc. Most of these text documents are not machine-processable, so it is hard to find relevant information in them quickly. Extracting and categorizing knowledge from these numerous authority documents requires significant manual effort and time, and organizations often spend significant resources on complying with the authority's controls. Organizations that adhere to the authority's policies often refer to short sections of the authority's documents in the documents they create for internal consumption or for their clients. However, these short sections in the referring documents do not include the full context of that section in the authority document. Thus, a person relying on the referring document must manually consult the authority's document to determine the complete context. Because neither document is machine-processable, it is difficult to determine the context of the referring section in real time. We propose a semantically rich framework to extract and classify the context of a short text in real time, to help users who regularly update their referential documents based on the authority documents. An open challenge that we will address is automated text classification and context identification for short text documents. Additionally, we will populate knowledge graphs with the knowledge extracted from the authority and the referencing documents.
We use techniques from the Semantic Web, Natural Language Processing, Machine Learning, and Deep Learning to build this framework. Our objectives include representing knowledge in cloud compliance or legal texts to create and populate a knowledge graph based on data protection regulations.
dc.format: application:pdf
dc.genre: dissertations
dc.identifier: doi:10.13016/m21jhv-0ib0
dc.identifier.other: 12492
dc.identifier.uri: http://hdl.handle.net/11603/26008
dc.language: en
dc.relation.isAvailableAt: The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof: UMBC Information Systems Department Collection
dc.relation.ispartof: UMBC Theses and Dissertations Collection
dc.relation.ispartof: UMBC Graduate School Collection
dc.relation.ispartof: UMBC Student Collection
dc.rights: This item may be protected under Title 17 of the U.S. Copyright Law. It is made available by UMBC for non-commercial research and education. For permission to publish or reproduce, please see http://aok.lib.umbc.edu/specoll/repro.php or contact Special Collections at speccoll(at)umbc.edu
dc.source: Original File Name: Elluri_umbc_0434D_12492.pdf
dc.subject: Classification
dc.subject: Cloud Security
dc.subject: Natural Language Processing
dc.subject: Semantic Web
dc.title: A SEMANTICALLY RICH FRAMEWORK TO ENABLE REAL-TIME KNOWLEDGE EXTRACTION AND CLASSIFICATION FROM SHORT LENGTH SEMI-STRUCTURED DOCUMENTS
dc.type: Text
dcterms.accessRights: Distribution Rights granted to UMBC by the author.
dcterms.accessRights: Access limited to the UMBC community. Item may possibly be obtained via Interlibrary Loan through a local library, pending author/copyright holder's permission.

Files

Original bundle

Name: Elluri_umbc_0434D_12492.pdf
Size: 4 MB
Format: Adobe Portable Document Format

License bundle

Name: Elluri-Lavanya_Open.pdf
Size: 255.18 KB
Format: Adobe Portable Document Format