
dcterms.accessRights: Distribution Rights granted to UMBC by the author.
dc.contributor.advisor: Oates, Tim
dc.contributor.department: Computer Science and Electrical Engineering
dc.contributor.program: Computer Science
dc.creator: Athley, Sushant
dc.date.accessioned: 2019-10-11T13:42:57Z
dc.date.available: 2019-10-11T13:42:57Z
dc.date.issued: 2017-01-01
dc.description.abstract: We evaluate the applicability of distributed language embedding techniques from the domain of natural language processing to relational data, which is typically stored in SQL databases. We apply modern distributed representation techniques for words (Tomas Mikolov 2013c) and paragraphs (Quoc V. Le 2014) to this structured data and attempt to unlock the potential of enhanced cognitive querying. The aim of the research is to perform queries that are non-trivial to express in SQL alone. We tokenize the IMDB 5000 movie dataset to generate embeddings using word2vec and a modified version of doc2vec that we term row2vec. We discuss the effects of various hyperparameter choices and tokenization techniques. We visualise these embeddings using PCA and present the results for certain queries. Keywords: Word embedding, databases, word2vec, cognitive querying.
dc.genre: thesis
dc.identifier: doi:10.13016/m2wzew-tsoz
dc.identifier.other: 11692
dc.identifier.uri: http://hdl.handle.net/11603/15494
dc.language: en
dc.relation.isAvailableAt: The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof: UMBC Computer Science and Electrical Engineering Department Collection
dc.relation.ispartof: UMBC Theses and Dissertations Collection
dc.relation.ispartof: UMBC Graduate School Collection
dc.relation.ispartof: UMBC Student Collection
dc.rights: This item may be protected under Title 17 of the U.S. Copyright Law. It is made available by UMBC for non-commercial research and education. For permission to publish or reproduce, please see http://aok.lib.umbc.edu/specoll/repro.php or contact Special Collections at speccoll(at)umbc.edu
dc.source: Original File Name: Athley_umbc_0434M_11692.pdf
dc.subject: cognitive querying
dc.subject: databases
dc.subject: word2vec
dc.subject: Word embedding
dc.title: Cognitive Intelligence in Relational Databases
dc.type: Text
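
The abstract mentions tokenizing relational rows so that word-embedding models can be trained on them. As a rough illustration only, the sketch below turns table rows into token lists of the kind such a model could consume. The `column__value` token scheme, the `tokenize_row` helper, and the sample column names and rows are all assumptions made for this example; the record does not describe the thesis's actual tokenization.

```python
import re

def tokenize_row(row):
    """Turn one relational row (dict of column -> value) into a list of tokens.

    Hypothetical scheme: each token is "<column>__<normalized value>", so the
    same value appearing in different columns yields distinct tokens.
    """
    tokens = []
    for column, value in row.items():
        if value is None or value == "":
            continue  # skip NULL / empty cells
        # Assumption: multi-valued cells are '|'-separated, e.g. "Action|Sci-Fi".
        for part in str(value).split("|"):
            # Collapse runs of non-word characters to "_" and lowercase.
            cleaned = re.sub(r"\W+", "_", part.strip()).strip("_").lower()
            if cleaned:
                tokens.append(f"{column}__{cleaned}")
    return tokens

# Illustrative rows only; not taken from the thesis or its dataset.
rows = [
    {"director_name": "James Cameron", "genres": "Action|Sci-Fi", "title_year": 2009},
    {"director_name": "Christopher Nolan", "genres": "Action|Thriller", "title_year": None},
]

# One "sentence" per row, ready to hand to an embedding trainer.
sentences = [tokenize_row(r) for r in rows]
```

Token lists like `sentences` could then be passed to a word2vec implementation that accepts an iterable of tokenized sentences (gensim's `Word2Vec` is one such option); the choice of library and hyperparameters is left open here.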

