Browsing by Subject "data quality"
Item Secure and Flexible Information Sharing in Federated Systems (2020-01-01) Oni, Samson; Chen, Zhiyuan; Information Systems; Information Systems
As data grows exponentially, the need to share information in order to make meaningful decisions is increasing. Multiple organizations playing different roles in dynamic situations often need to share mission-dependent data securely. Examples include contact tracing for a contagious disease such as COVID-19, maritime search and rescue operations, or creating a collaborative bid for a contract. In such examples, the ability to access data may need to change dynamically, depending on the situation of a mission (e.g., whether a person tested positive for a disease, a ship is in distress, or a bid-offer with given properties needs to be created). There is also a need to know how reliable the data from each source is, and to improve trust in information from multiple sources by automatically detecting discrepancies among them. Current work does not address such a dynamic environment. In addition, most current work concerns data cleaning tools that use rules to detect data quality issues, and most of these tools can only detect non-semantic issues. I have made two major contributions in this dissertation. The first is a framework to enable situation-aware access control in a federated Data-as-a-Service architecture by using Semantic Web technologies. This framework allows distributed query rewriting and semantic reasoning that automatically adds situation-based constraints to ensure that users can only see results they can access. We validate the framework by applying it to two dynamic use cases: maritime search and rescue (SAR) operations and contact tracing for surveillance of a contagious disease. The experimental results for the SAR and COVID-19 use cases demonstrate the effectiveness of the proposed framework.
Organizations that need to share sensitive data sets securely during dynamic, limited-duration scenarios can quickly adopt the framework. The second contribution is a novel approach for semantic discrepancy detection on data from multiple sources when some of the sources have structured data and other sources contain unstructured data. The proposed approach has three novel aspects: 1) a transfer learning method to extract information as entities from unstructured data automatically; 2) algorithms to automatically match extracted entities with column data of other data sources; 3) algorithms to automatically detect semantic disparity between matched entities and column data across data sources. The experimental results on two events data sets and a disaster relief data set demonstrate the effectiveness and efficiency of the proposed methods. Multiple organizations collaborating to share data can quickly use the proposed methods.
Item Understanding the Various Perspectives of Earth Science Observational Data Uncertainty (2019-08-11) Moroni, David F.; Ramapriyan, Hampapuram; Peng, Ge; Hobbs, Jonathan; Goldstein, Justin C.; Downs, Robert R.; Wolfe, Robert; Shie, Chung-Lin; Merchant, Christopher J.; Bourassa, Mark; Matthews, Jessica L.; Cornillon, Peter; Bastin, Lucy; Kehoe, Kenneth; Smith, Benjamin; Privette, Jeffrey L.; Subramanian, Aneesh C.; Brown, Otis; Ivánová, Ivana
Information about the uncertainty associated with Earth science observational data is fundamental to the use, re-use, and overall evaluation of the data being used to produce science and support decision making. The associated uncertainty information leads to a quantifiable level of confidence in both the data and the science informing decisions produced using the data.
The current breadth and cross-domain depth of understanding and application of uncertainty information, however, are still evolving, as the practices associated with quantifying and characterizing uncertainty across various types of Earth observation data are diverse. Since its re-establishment in 2015, the Information Quality Cluster (IQC) of the Earth Science Information Partners (ESIP) has convened numerous sessions under the auspices of ESIP and the American Geophysical Union (AGU) to help collect expert-level information focusing on key aspects of uncertainty of Earth science data and to address key concerns such as: 1) how uncertainty is quantified (UQ) and characterized (UC), 2) understanding the strengths and limitations of common techniques used in producing and evaluating uncertainty information, 3) implications of using uncertainty information as a quality indicator, 4) impacts of uncertainty on data fusion/assimilation, 5) various methods for documenting and conveying the uncertainty information to data users, and 6) understanding why certain user communities care about uncertainty and others do not. A key recommendation and action item from the ESIP Summer Meeting 2017 was for the IQC to develop a white paper to establish a clearer understanding of the concept of uncertainty and its communication to data users. The information gathered for this white paper has been provided by Earth science data and informatics experts spanning diverse disciplines and observation systems in the cross-domain Earth sciences. The intention of this white paper is to provide a diversely sampled exposition of both prolific and unique policies and practices, applicable in an international context of diverse policies and working groups, made toward quantifying, characterizing, communicating and making use of uncertainty information throughout the diverse, cross-disciplinary Earth science data landscape.