AFFINITY PROPAGATION INITIALISATION BASED PROXIMITY CLUSTERING FOR LABELING

dc.contributor.advisor: Joshi, Karuna
dc.contributor.advisor: Mulwad, Varish
dc.contributor.author: Bandi, Adithya
dc.contributor.department: Computer Science and Electrical Engineering
dc.contributor.program: Computer Science
dc.date.accessioned: 2021-09-01T13:54:38Z
dc.date.available: 2021-09-01T13:54:38Z
dc.date.issued: 2020-01-20
dc.description.abstract: Modern state-of-the-art relation extraction systems require large amounts of labeled data. However, obtaining that much labeled data is costly and often nearly impossible, especially in industrial settings, because subject matter experts have limited time. Techniques such as distant supervision have been used to provide noisy annotations, but they require a related knowledge base, which is rarely available. We propose a novel method that obtains labeled data using techniques inspired by Active Learning and Clustering. Our approach differs from Active Learning in that we operate under weak supervision, where not all of the instances provided for training are manually labeled. We adopt a new clustering paradigm in which Affinity Propagation identifies potential cluster centers and a randomized local optimization reduces the number of clusters while increasing the similarity among instances within each cluster. This unique combination of randomization and localization in Clustering paves the way for a distinct class of clustering algorithms.
dc.format: application:pdf
dc.genre: theses
dc.identifier: doi:10.13016/m2wugu-jdjh
dc.identifier.other: 12213
dc.identifier.uri: http://hdl.handle.net/11603/22736
dc.language: en
dc.relation.isAvailableAt: The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof: UMBC Computer Science and Electrical Engineering Department Collection
dc.relation.ispartof: UMBC Theses and Dissertations Collection
dc.relation.ispartof: UMBC Graduate School Collection
dc.relation.ispartof: UMBC Student Collection
dc.source: Original File Name: Bandi_umbc_0434M_12213.pdf
dc.subject: Labeling
dc.subject: Natural Language Processing
dc.subject: Relation Extraction
dc.title: AFFINITY PROPAGATION INITIALISATION BASED PROXIMITY CLUSTERING FOR LABELING
dc.type: Text
dcterms.accessRights: Distribution Rights granted to UMBC by the author.
dcterms.accessRights: This item may be protected under Title 17 of the U.S. Copyright Law. It is made available by UMBC for non-commercial research and education. For permission to publish or reproduce, please see http://aok.lib.umbc.edu/specoll/repro.php or contact Special Collections at speccoll(at)umbc.edu

Files

Original bundle

Name:
Bandi_umbc_0434M_12213.pdf
Size:
4.4 MB
Format:
Adobe Portable Document Format

License bundle

Name:
Bandi-Adithya_Open.pdf
Size:
183 KB
Format:
Adobe Portable Document Format
Description: