Cluster-Based Join for Geographically Distributed Big RDF Data

dc.contributor.author: Yang, Fan
dc.contributor.author: Crainiceanu, Adina
dc.contributor.author: Chen, Zhiyuan
dc.contributor.author: Needham, Don
dc.date.accessioned: 2019-10-11T14:42:42Z
dc.date.available: 2019-10-11T14:42:42Z
dc.date.issued: 2019-08-29
dc.description: 2019 IEEE International Congress on Big Data (BigDataCongress) [en_US]
dc.description.abstract: Federated RDF systems allow users to retrieve data from multiple independent sources without needing to have all the data in the same triple store. The performance of these systems can be poor for large and geographically distributed RDF data where network transfer costs are high. This paper introduces CBTP, a novel join algorithm that takes advantage of network topology to decrease the cost of processing SPARQL queries in a geographically distributed environment. Federation members are grouped in clusters, based on the network communication cost between the members, and the bulk of the join processing is pushed to the clusters. We use an overlap list to efficiently compute join results from triples in different clusters. We implement our algorithms in OpenRDF Sesame federated framework and use Apache Rya triple store instances as federation members. Experimental evaluation results show the advantages of our approach over existing techniques. [en_US]
dc.format.extent: 9 pages [en_US]
dc.genre: conference papers and proceedings [en_US]
dc.identifier: doi:10.13016/m2m7s5-kl5n
dc.identifier.citation: F. Yang, A. Crainiceanu, Z. Chen and D. Needham, "Cluster-Based Join for Geographically Distributed Big RDF Data," 2019 IEEE International Congress on Big Data (BigDataCongress), Milan, Italy, 2019, pp. 170-178. doi: 10.1109/BigDataCongress.2019.00037; URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8818183&isnumber=8818170 [en_US]
dc.identifier.uri: https://doi.org/10.1109/BigDataCongress.2019.00037
dc.identifier.uri: http://hdl.handle.net/11603/15856
dc.language.iso: en_US [en_US]
dc.publisher: IEEE [en_US]
dc.relation.isAvailableAt: The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof: UMBC Media and Communication Studies Department
dc.relation.ispartof: UMBC Faculty Collection
dc.rights: This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.
dc.rights: Public Domain Mark 1.0 [*]
dc.rights: This work was written as part of one of the author's official duties as an Employee of the United States Government and is therefore a work of the United States Government. In accordance with 17 U.S.C. 105, no copyright protection is available for such works under U.S. Law.
dc.rights.uri: http://creativecommons.org/publicdomain/mark/1.0/ [*]
dc.subject: SPARQL [en_US]
dc.subject: Cluster [en_US]
dc.subject: Join [en_US]
dc.subject: Federated Queries [en_US]
dc.title: Cluster-Based Join for Geographically Distributed Big RDF Data [en_US]
dc.type: Text [en_US]

Files

Original bundle
Name: Cluster-Based Join for Geographically Distributed Big RDF Data.pdf
Size: 179.06 KB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 2.56 KB
Format: Item-specific license agreed upon to submission