Creating and Populating an IoT Knowledge Graph using Web Data

Author/Creator

Author/Creator ORCID

Date

2023-01-01

Department

Computer Science and Electrical Engineering

Program

Computer Science

Citation of Original Publication

Rights

This item may be protected under Title 17 of the U.S. Copyright Law. It is made available by UMBC for non-commercial research and education. For permission to publish or reproduce, please see http://aok.lib.umbc.edu/specoll/repro.php or contact Special Collections at speccoll(at)umbc.edu
Access limited to the UMBC community. Item may possibly be obtained via Interlibrary Loan through a local library, pending author/copyright holder's permission.

Subjects

Abstract

For anyone building an IoT space, choosing the most suitable components is crucial. Because several manufacturers offer similar products with different features, staying aware of all the alternatives is challenging. Currently, there is almost no all-under-one-roof catalog for IoT devices. The need for a common platform that gathers information on various product categories from several sources is the motivation for this thesis. Existing catalogs are manufacturer- or seller-specific and lack up-to-date or significant information. In this thesis, we aim to create an IoT knowledge graph populated from different sources on the Web. We build a scraping mechanism capable of extracting information from thousands of product pages on websites, and the extracted information is used to populate the Knowledge Graph (KG) automatically. We develop an ontology with various entities and properties, whose inferencing ability creates additional relationships among the existing entities, and use it to create the KG. The prototype enables the integration of information from other available KGs (Wikidata/DBpedia). Complex SPARQL queries run on the RDF triples, drawing on the model's inference and data-linking capacity, achieve encouraging results. With continued improvement of information extraction techniques, the addition of data, and refinement, the prototype can be developed into an essential IoT environment planning tool.
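The pipeline outlined in the abstract (scraped product records, triples, ontology-based inference, querying) can be illustrated with a minimal sketch. This is not the thesis implementation: it uses plain Python tuples in place of an RDF store and a simple pattern match in place of SPARQL, and every record, class name, and property name below (`SmartBulb`, `madeBy`, `supportsProtocol`, and so on) is a hypothetical placeholder.

```python
# Hypothetical scraped records; field names and values are illustrative only.
SCRAPED = [
    {"id": "bulb-1", "category": "SmartBulb",
     "manufacturer": "AcmeLighting", "protocol": "Zigbee"},
    {"id": "cam-1", "category": "SecurityCamera",
     "manufacturer": "ExampleCam", "protocol": "WiFi"},
]

# Hypothetical ontology fragment: each class maps to its direct superclass.
SUBCLASS_OF = {
    "SmartBulb": "LightingDevice",
    "LightingDevice": "IoTDevice",
    "SecurityCamera": "IoTDevice",
}


def populate(records):
    """Turn scraped records into (subject, predicate, object) triples."""
    triples = set()
    for rec in records:
        subject = rec["id"]
        triples.add((subject, "rdf:type", rec["category"]))
        triples.add((subject, "madeBy", rec["manufacturer"]))
        triples.add((subject, "supportsProtocol", rec["protocol"]))
    return triples


def infer_types(triples, subclass_of):
    """Add the rdf:type triples implied by walking up the class hierarchy."""
    inferred = set(triples)
    for s, p, o in triples:
        if p != "rdf:type":
            continue
        parent = subclass_of.get(o)
        while parent is not None:
            inferred.add((s, "rdf:type", parent))
            parent = subclass_of.get(parent)
    return inferred


def match(triples, s=None, p=None, o=None):
    """Return sorted triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


graph = infer_types(populate(SCRAPED), SUBCLASS_OF)

# Neither record was scraped as an "IoTDevice"; both match only via inference.
iot_devices = [s for s, _, _ in match(graph, p="rdf:type", o="IoTDevice")]
print(iot_devices)
```

The query for `IoTDevice` returns both products even though neither was scraped with that category, which is the kind of additional relationship the abstract attributes to inference over the ontology.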