A Policy-Driven Approach to Secure Extraction of COVID-19 Data From Research Papers

dc.contributor.author: Elluri, Lavanya
dc.contributor.author: Piplai, Aritran
dc.contributor.author: Kotal, Anantaa
dc.contributor.author: Joshi, Anupam
dc.contributor.author: Joshi, Karuna Pande
dc.date.accessioned: 2022-08-18T22:33:35Z
dc.date.available: 2022-08-18T22:33:35Z
dc.date.issued: 2021-08-12
dc.description.abstract: The entire scientific and academic community has been mobilized to gain a better understanding of the COVID-19 disease and its impact on humanity. Most research related to COVID-19 needs to analyze large amounts of data in very little time. This urgency has made Big Data Analysis, and related questions around the privacy and security of the data, an extremely important part of research in the COVID-19 era. The White House OSTP has, for example, released a large dataset of papers related to COVID research from which the research community can extract knowledge and information. We show an example system with a machine learning-based knowledge extractor which draws out key medical information from COVID-19 related academic research papers. We represent this knowledge in a Knowledge Graph that uses the Unified Medical Language System (UMLS). However, publicly available studies rely on datasets that might contain sensitive data. Extracting information from academic papers can potentially leak sensitive data, and protecting the security and privacy of this data is equally important. In this paper, we address the key challenges around the privacy and security of such information extraction and analysis systems. Regulations such as HIPAA have updated their guidelines for securely accessing data, specifically data related to COVID-19. In the US, healthcare providers must also comply with the Office for Civil Rights (OCR) rules to protect data integrity in matters like plasma donation, media access to health care data, telehealth communications, etc. Privacy policies are typically short and unstructured HTML or PDF documents. We have created a framework to extract relevant knowledge from the health centers’ policy documents and also represent these as a knowledge graph. Our framework helps to understand the extent to which individual provider policies comply with regulations and to define access control policies that enforce the regulation rules on data in the knowledge graph extracted from COVID-related papers. Along with being compliant, privacy policies must also be transparent and easily understood by the clients. We analyze the relative readability of healthcare privacy policies and discuss its impact. In this paper, we develop a framework for access control decisions that uses policy compliance information to securely retrieve COVID data. We show how policy compliance information can be used to restrict access to COVID-19 data and information extracted from research papers. [en_US]
dc.description.uri: https://www.frontiersin.org/articles/10.3389/fdata.2021.701966/full [en_US]
dc.format.extent: 13 pages [en_US]
dc.genre: journal articles [en_US]
dc.identifier: doi:10.13016/m2ujay-selg
dc.identifier.citation: Elluri L, Piplai A, Kotal A, Joshi A and Joshi KP (2021) A Policy-Driven Approach to Secure Extraction of COVID-19 Data From Research Papers. Front. Big Data 4:701966. doi: 10.3389/fdata.2021.701966 [en_US]
dc.identifier.uri: https://doi.org/10.3389/fdata.2021.701966
dc.identifier.uri: http://hdl.handle.net/11603/25500
dc.language.iso: en_US [en_US]
dc.publisher: Frontiers [en_US]
dc.relation.isAvailableAt: The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof: UMBC Information Systems Department Collection
dc.rights: This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author. [en_US]
dc.rights: Attribution 4.0 International (CC BY 4.0) [*]
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/ [*]
dc.subject: UMBC Ebiquity Research Group [en_US]
dc.title: A Policy-Driven Approach to Secure Extraction of COVID-19 Data From Research Papers [en_US]
dc.type: Text [en_US]
dcterms.creator: https://orcid.org/0000-0002-8641-3193 [en_US]
dcterms.creator: https://orcid.org/0000-0002-6354-1686 [en_US]

Files

Original bundle

Name: fdata-04-701966.pdf
Size: 2.84 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 2.56 KB
Format: Item-specific license agreed upon to submission