Secure and Flexible Information Sharing in Federated Systems

Author/Creator

Author/Creator ORCID

Date

2020-01-01

Department

Information Systems

Program

Information Systems

Citation of Original Publication

Rights

Access limited to the UMBC community. Item may possibly be obtained via Interlibrary Loan through a local library, pending author/copyright holder's permission.
This item may be protected under Title 17 of the U.S. Copyright Law. It is made available by UMBC for non-commercial research and education. For permission to publish or reproduce, please see http://aok.lib.umbc.edu/specoll/repro.php or contact Special Collections at speccoll(at)umbc.edu

Abstract

As data grows exponentially, the need to share information in order to make meaningful decisions is increasing. Multiple organizations playing different roles in dynamic situations often need to share mission-dependent data securely. Examples include contact tracing for a contagious disease such as COVID-19, maritime search and rescue (SAR) operations, and creating a collaborative bid for a contract. In such cases, the ability to access data may need to change dynamically, depending on the state of a mission (e.g., whether a person has tested positive for a disease, a ship is in distress, or a bid offer with given properties needs to be created). There is also a need to know how reliable the data from each source is and to improve trust in information drawn from multiple sources by automatically detecting discrepancies among them. Existing work does not address such a dynamic environment. In addition, most existing work concerns data cleaning tools that use rules to detect data quality issues, and most of these tools can detect only non-semantic issues. This dissertation makes two major contributions. The first is a framework that enables situation-aware access control in a federated Data-as-a-Service architecture by using Semantic Web technologies. The framework supports distributed query rewriting and semantic reasoning that automatically add situation-based constraints, ensuring that users see only the results they are authorized to access. The framework is validated on two dynamic use cases: maritime SAR operations and contact tracing for surveillance of a contagious disease (COVID-19). The experimental results for both use cases demonstrate the effectiveness of the proposed framework. Organizations that need to share sensitive data sets securely during dynamic, limited-duration scenarios can quickly adopt the framework.
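To make the idea of situation-based query rewriting concrete, the following is a minimal sketch and not the dissertation's implementation. It assumes queries are SPARQL strings, and every name in it (`Situation`, `POLICY`, `rewrite_query`, the roles, and the filter expressions) is hypothetical and chosen only for illustration. The deny-by-default fallback is likewise an assumption of this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Situation:
    """Hypothetical mission state; field names are illustrative only."""
    name: str      # e.g. "ship_in_distress"
    active: bool   # whether the situation currently holds


# Hypothetical policy table: (role, situation name) -> SPARQL FILTER text
# limiting what that role may see while the situation is active.
POLICY = {
    ("coast_guard", "ship_in_distress"): 'FILTER(?status = "distress")',
    ("health_officer", "positive_test"): 'FILTER(?testResult = "positive")',
}

# Assumed fallback: a constraint that matches nothing (deny by default).
DENY_ALL = "FILTER(false)"


def rewrite_query(query: str, role: str, situation: Situation) -> str:
    """Insert a situation-based constraint before the query's last '}'."""
    constraint = POLICY.get((role, situation.name)) if situation.active else None
    if constraint is None:
        constraint = DENY_ALL
    head, sep, tail = query.rpartition("}")
    if not sep:
        raise ValueError("query has no closing brace for its WHERE block")
    return f"{head}  {constraint}\n}}{tail}"


if __name__ == "__main__":
    original = "SELECT ?ship WHERE {\n  ?ship :hasStatus ?status .\n}"
    print(rewrite_query(original, "coast_guard", Situation("ship_in_distress", True)))
```

In this sketch the user's query is never executed as written: a constraint is always appended, so a role with no matching policy, or a situation that is no longer active, yields an empty result instead of unrestricted access.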
The second contribution is a novel approach to semantic discrepancy detection on data from multiple sources, where some sources hold structured data and others contain unstructured data. The proposed approach has three novel aspects: 1) a transfer learning method to automatically extract information as entities from unstructured data; 2) algorithms to automatically match extracted entities with column data from other data sources; and 3) algorithms to automatically detect semantic discrepancies between matched entities and column data across data sources. The experimental results on two event data sets and a disaster relief data set demonstrate the effectiveness and efficiency of the proposed methods. Multiple organizations collaborating to share data can quickly use the proposed methods.
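The matching and discrepancy-detection steps described above can be illustrated with a small sketch. This is not the dissertation's algorithm: it assumes entity extraction has already been done, and it substitutes simple stand-ins (Jaccard overlap of normalized values for matching, and `difflib` string similarity for comparison) for the semantic methods the abstract refers to. All function names, the sample data, and the 0.8 threshold are hypothetical.

```python
from difflib import SequenceMatcher


def normalize(value: str) -> str:
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(value.lower().split())


def match_entity_to_column(entity_values, table):
    """Return (column, score) for the column whose values overlap most
    with the extracted entity values, using Jaccard similarity."""
    ent = {normalize(v) for v in entity_values}
    best_col, best_score = None, 0.0
    for col, values in table.items():
        col_set = {normalize(v) for v in values}
        union = ent | col_set
        score = len(ent & col_set) / len(union) if union else 0.0
        if score > best_score:
            best_col, best_score = col, score
    return best_col, best_score


def find_discrepancies(extracted, structured, threshold=0.8):
    """Compare values per shared record id and return the pairs whose
    string similarity falls below the (assumed) threshold."""
    flagged = []
    for record_id, ext_value in extracted.items():
        if record_id not in structured:
            continue  # nothing to compare against
        src_value = structured[record_id]
        similarity = SequenceMatcher(
            None, normalize(ext_value), normalize(src_value)
        ).ratio()
        if similarity < threshold:
            flagged.append((record_id, ext_value, src_value))
    return flagged


if __name__ == "__main__":
    table = {"city": ["Baltimore", "Houston", "Miami"],
             "agency": ["FEMA", "Red Cross", "FEMA"]}
    print(match_entity_to_column(["baltimore", "Houston", "Tampa"], table))
    print(find_discrepancies({"e1": "Baltimore", "e2": "Tampa"},
                             {"e1": "baltimore ", "e2": "Miami"}))
```

String similarity catches only surface-level disagreement. A semantic approach of the kind the abstract describes would also need to recognize values that are written differently but mean the same thing, and values that look alike but mean different things.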