Cognitive Intelligence in Relational Databases

dc.contributor.advisor: Oates, Tim
dc.contributor.author: Athley, Sushant
dc.contributor.department: Computer Science and Electrical Engineering
dc.contributor.program: Computer Science
dc.date.accessioned: 2019-10-11T13:42:57Z
dc.date.available: 2019-10-11T13:42:57Z
dc.date.issued: 2017-01-01
dc.description.abstract: We evaluate whether distributed embedding techniques from natural language processing can be applied to relational data, which is typically stored in SQL databases. We apply distributed representations of words (Tomas Mikolov 2013c) and of paragraphs (Quoc V. Le 2014) to this structured data in an attempt to enable cognitive querying. The goal is to support queries that are non-trivial to express in SQL alone. We tokenize the IMDB 5000 movie dataset and generate embeddings using word2vec and a modified version of doc2vec that we term row2vec. We discuss the effects of various hyperparameter choices and tokenization techniques. We visualise these embeddings using PCA and present the results for selected queries. Keywords: Word embedding, databases, word2vec, cognitive querying.
dc.genre: theses
dc.identifier: doi:10.13016/m2wzew-tsoz
dc.identifier.other: 11692
dc.identifier.uri: http://hdl.handle.net/11603/15494
dc.language: en
dc.relation.isAvailableAt: The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof: UMBC Computer Science and Electrical Engineering Department Collection
dc.relation.ispartof: UMBC Theses and Dissertations Collection
dc.relation.ispartof: UMBC Graduate School Collection
dc.relation.ispartof: UMBC Student Collection
dc.rights: This item may be protected under Title 17 of the U.S. Copyright Law. It is made available by UMBC for non-commercial research and education. For permission to publish or reproduce, please see http://aok.lib.umbc.edu/specoll/repro.php or contact Special Collections at speccoll(at)umbc.edu
dc.source: Original File Name: Athley_umbc_0434M_11692.pdf
dc.subject: cognitive querying
dc.subject: databases
dc.subject: word2vec
dc.subject: Word embedding
dc.title: Cognitive Intelligence in Relational Databases
dc.type: Text
dcterms.accessRights: Distribution Rights granted to UMBC by the author.

Files

Original bundle
Name: Athley_umbc_0434M_11692.pdf
Size: 3.26 MB
Format: Adobe Portable Document Format
License bundle
Name: AthleyS_Cognitive_Open.pdf
Size: 44.2 KB
Format: Adobe Portable Document Format