Stratified Neural Models for Document Classification

Mehta, Sarthak Mayur

dc.contributor.advisor	Ferraro, Francis
dc.contributor.author	Mehta, Sarthak Mayur
dc.contributor.department	Computer Science and Electrical Engineering
dc.contributor.program	Computer Science
dc.date.accessioned	2021-09-01T13:55:13Z
dc.date.available	2021-09-01T13:55:13Z
dc.date.issued	2020-01-20
dc.description.abstract	Document classification is a general task in natural language processing and information retrieval. Compared with traditional methods for this task, our method improves classification performance, convergence, and the richness of the learned information. We propose a hybrid neural language modelling architecture that constructs hierarchical feature representations, and we examine it through document classification. Our first model begins with a character-level convolutional neural network (CNN) layer that produces word-level representations. A recurrent neural network (RNN) with attention-based feature merging then produces sentence-level representations, a second RNN with an attention layer produces the document-level representation, and a stack of interconnected dense layers with softmax activation classifies the document. We extend this model to the word level and summarize the overall results and comparisons with baseline models. We show evidence for our hypotheses on multiple datasets, using the IMDB and Yelp review datasets. We report extended results on all datasets in terms of F1 score, accuracy, precision, and recall. We also compare the convergence time and the rate of convergence of our approach. Moreover, we show visual evidence that our approach leads to better feature construction and is able to construct features for 99% of the effective word vocabulary from the characters in the documents.
dc.format	application:pdf
dc.genre	theses
dc.identifier	doi:10.13016/m2crpv-5qxz
dc.identifier.other	12140
dc.identifier.uri	http://hdl.handle.net/11603/22806
dc.language	en
dc.relation.isAvailableAt	The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof	UMBC Computer Science and Electrical Engineering Department Collection
dc.relation.ispartof	UMBC Theses and Dissertations Collection
dc.relation.ispartof	UMBC Graduate School Collection
dc.relation.ispartof	UMBC Student Collection
dc.source	Original File Name: Mehta_umbc_0434M_12140.pdf
dc.subject	Machine Learning
dc.subject	Natural Language Processing
dc.title	Stratified Neural Models for Document Classification
dc.type	Text
dcterms.accessRights	Distribution Rights granted to UMBC by the author.
dcterms.accessRights	This item may be protected under Title 17 of the U.S. Copyright Law. It is made available by UMBC for non-commercial research and education. For permission to publish or reproduce, please see http://aok.lib.umbc.edu/specoll/repro.php or contact Special Collections at speccoll(at)umbc.edu

Files

Original bundle

Name: Mehta_umbc_0434M_12140.pdf
Size: 1.23 MB
Format: Adobe Portable Document Format

License bundle

Name: Mehta-Sarthak_Open.pdf
Size: 237.8 KB
Format: Adobe Portable Document Format

Collections

UMBC Theses and Dissertations
UMBC Computer Science and Electrical Engineering Department
UMBC Graduate School
UMBC Student Collection