Cognitive Intelligence in Relational Databases


Computer Science and Electrical Engineering


Computer Science

This item may be protected under Title 17 of the U.S. Copyright Law. It is made available by UMBC for non-commercial research and education. For permission to publish or reproduce, please see or contact Special Collections at speccoll(at)
Distribution Rights granted to UMBC by the author.


We evaluate the applicability of distributed embedding techniques from natural language processing to relational data, which is typically stored in SQL databases. We apply modern techniques for distributed representations of words (Tomas Mikolov 2013c) and paragraphs (Quoc V. Le 2014) to this structured data in an attempt to enable cognitive querying, that is, queries that are non-trivial to express in SQL alone. We tokenize the IMDB 5000 movie dataset and generate embeddings using word2vec and a modified version of doc2vec that we term row2vec. We discuss the effects of various hyperparameter choices and tokenization techniques, visualise the embeddings using PCA, and present the results for selected queries.

Keywords: Word embedding, databases, word2vec, cognitive querying.
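To make the tokenization step concrete, the following is a minimal sketch of one way a relational row could be turned into a "sentence" of tokens suitable for a word2vec-style model. It is not the tokenization used in this work: the column names, the sample rows, the `column:value` token format, and the helper names `normalize`, `row_to_tokens`, and `table_to_sentences` are all assumptions made for illustration.

```python
import csv
import io
import re

# Hypothetical sample rows; the column names are illustrative assumptions,
# not the actual schema of the IMDB 5000 movie dataset.
SAMPLE_CSV = """director_name,genres,title_year,movie_title
James Cameron,Action|Adventure,2009,Avatar
Christopher Nolan,Action|Thriller,2012,The Dark Knight Rises
"""


def normalize(value):
    """Lowercase a cell value and collapse non-alphanumeric runs into underscores."""
    return re.sub(r"[^a-z0-9]+", "_", value.strip().lower()).strip("_")


def row_to_tokens(row, multi_valued=("genres",), sep="|"):
    """Turn one row (a dict of column -> value) into a list of tokens.

    Each token is prefixed with its column name so that identical values in
    different columns stay distinct. Multi-valued cells are split on `sep`.
    """
    tokens = []
    for column, value in row.items():
        if value is None or value.strip() == "":
            continue  # skip missing cells rather than emit an empty token
        parts = value.split(sep) if column in multi_valued else [value]
        for part in parts:
            norm = normalize(part)
            if norm:
                tokens.append(f"{column}:{norm}")
    return tokens


def table_to_sentences(csv_text):
    """Read CSV text and return one token list per row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row_to_tokens(row) for row in reader]


sentences = table_to_sentences(SAMPLE_CSV)
for sentence in sentences:
    print(sentence)
```

Each resulting token list plays the role of a sentence and could be passed to an off-the-shelf word2vec implementation. Prefixing tokens with the column name is one design choice among several; it keeps a year such as 2009 distinct from any other column that happens to contain the same string.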