Towards Learning A Better Text Encoder

dc.contributor.advisor: Oates, Tim
dc.contributor.author: Gao, Hang
dc.contributor.department: Computer Science and Electrical Engineering
dc.contributor.program: Computer Science
dc.date.accessioned: 2022-02-09T15:52:32Z
dc.date.available: 2022-02-09T15:52:32Z
dc.date.issued: 2020-01-01
dc.description.abstract: Encoding raw text into a machine-comprehensible representation while preserving useful information has long been a popular research area. Despite their success, traditional supervised text encoders demand human annotation, which is often expensive, inefficient, and sometimes infeasible to obtain. As a result, they are usually limited in both performance and generalization, especially deep learning approaches, which are widely known to perform better with large quantities of data. Recent work has shown that language models pre-trained on large-scale corpora can serve as the basis for many downstream tasks and significantly improve the performance of the corresponding fine-tuned models. However, deep neural networks are still considered black boxes and thus often lack interpretability. Moreover, deep learning approaches are often vulnerable to adversarial perturbations of the input: perturbations that are usually imperceptible to human eyes or that do not change the semantic meaning of the input. On the other hand, despite the utility of transfer learning from pre-trained language models, many downstream tasks still require a relatively large amount of labeled data to achieve the expected performance, which is often expensive or impractical in the real world. Self-supervised methods have therefore been proposed to address this problem, many of which rely on various data augmentation techniques. In this thesis, we propose to learn better text encoders along three directions: (1) we design a neural network with a better architecture, capable of approximating a larger set of functions; (2) we propose several algorithms to generate adversarial examples for text encoders and fine-tune the models on these samples to improve their robustness; (3) we introduce a new data augmentation algorithm to enlarge the corpus when labeled data is limited.
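The abstract describes adversarial perturbations that change a text's surface form without changing its semantic meaning. As a minimal illustrative sketch only (the synonym table and the substitution rule below are hypothetical stand-ins, not the algorithms proposed in the thesis), a meaning-preserving word substitution can be written as:

```python
# Illustrative sketch: a meaning-preserving word-substitution perturbation.
# The SYNONYMS table is a hypothetical stand-in; a real attack would query
# an embedding space or a lexical resource, and would pick substitutions
# that maximally change the target model's prediction.
SYNONYMS = {"great": "excellent", "bad": "poor", "movie": "film"}

def perturb(sentence: str) -> str:
    """Replace each word that has a listed synonym, leaving others intact.

    The surface form changes while the meaning is preserved, which is the
    property adversarial examples for text exploit.
    """
    words = sentence.split()
    return " ".join(SYNONYMS.get(w.lower(), w) for w in words)
```

Fine-tuning a model on such perturbed inputs (paired with their original labels) is the general recipe for the adversarial training the abstract refers to.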
The proposed methods were evaluated on various downstream tasks against many baseline models and algorithms, including language modeling, sentiment classification, and semantic relatedness prediction. Some were also evaluated for time efficiency. The experimental results show that: (1) our proposed neural network architecture improves performance on various downstream NLP tasks without significantly increasing the number of parameters required; (2) models adversarially trained with the proposed algorithm preserve roughly the same performance on adversarial examples as on natural examples; and (3) the artificial data generated with the proposed data augmentation technique significantly improves the models' performance when labeled training data is very limited.
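The abstract also refers to data augmentation for enlarging a small labeled corpus. As a generic, hedged sketch (this is a common label-preserving augmentation, not the new algorithm the thesis introduces), random position swaps generate extra training variants without altering a sentence's bag of words:

```python
import random

def augment(sentence: str, n: int = 2, seed: int = 0) -> list[str]:
    """Generate n variants of a sentence by swapping two word positions.

    Each variant keeps the same multiset of words, so a downstream label
    (e.g. sentiment) is unlikely to change -- the property an augmentation
    needs to safely enlarge a labeled corpus.
    """
    rng = random.Random(seed)  # seeded for reproducibility
    words = sentence.split()
    variants = []
    for _ in range(n):
        w = words[:]
        i, j = rng.sample(range(len(w)), 2)  # two distinct positions
        w[i], w[j] = w[j], w[i]
        variants.append(" ".join(w))
    return variants
```

Pairing each variant with its original label multiplies the effective training-set size, which is the setting the abstract's third contribution targets.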
dc.format: application/pdf
dc.genre: dissertations
dc.identifier: doi:10.13016/m2j9k4-in2b
dc.identifier.other: 12356
dc.identifier.uri: http://hdl.handle.net/11603/24175
dc.language: en
dc.relation.isAvailableAt: The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof: UMBC Computer Science and Electrical Engineering Department Collection
dc.relation.ispartof: UMBC Theses and Dissertations Collection
dc.relation.ispartof: UMBC Graduate School Collection
dc.relation.ispartof: UMBC Student Collection
dc.source: Original File Name: Gao_umbc_0434D_12356.pdf
dc.subject: adversarial training
dc.subject: data augmentation
dc.subject: long short-term memory
dc.subject: natural language processing
dc.title: Towards Learning A Better Text Encoder
dc.type: Text
dcterms.accessRights: Access limited to the UMBC community. The item may possibly be obtained via Interlibrary Loan through a local library, pending the author/copyright holder's permission.
dcterms.accessRights: This item may be protected under Title 17 of the U.S. Copyright Law. It is made available by UMBC for non-commercial research and education. For permission to publish or reproduce, please see http://aok.lib.umbc.edu/specoll/repro.php or contact Special Collections at speccoll(at)umbc.edu.

Files

Original bundle
Name: Gao_umbc_0434D_12356.pdf
Size: 2.28 MB
Format: Adobe Portable Document Format