AFFINITY PROPAGATION INITIALISATION BASED PROXIMITY CLUSTERING FOR LABELING
Department
Computer Science and Electrical Engineering
Program
Computer Science
Rights
Distribution Rights granted to UMBC by the author.
This item may be protected under Title 17 of the U.S. Copyright Law. It is made available by UMBC for non-commercial research and education. For permission to publish or reproduce, please see http://aok.lib.umbc.edu/specoll/repro.php or contact Special Collections at speccoll(at)umbc.edu
Abstract
Modern state-of-the-art relation extraction systems require large amounts of labeled data. However, obtaining such data is costly and almost impossible, especially in industrial settings, because of the time constraints of subject matter experts. Techniques like distant supervision have been used to provide noisy annotations, but they require a related knowledge base, which is rarely available. We propose a novel method that obtains labeled data using techniques inspired by Active Learning and Clustering. Our approach differs from Active Learning in that we operate under weak supervision, where not all of the instances provided for training are manually labeled. We adopt a new clustering paradigm in which we use Affinity Propagation to identify potential cluster centers and then apply a randomized local optimization to reduce the number of clusters while increasing the similarity among instances within a cluster. This unique combination of randomization and localization in Clustering paves the way for a distinct class of clustering algorithms.
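The two-stage idea described in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation. The first stage is a plain textbook Affinity Propagation with negative squared Euclidean similarity and a median preference. The abstract does not specify the local optimization, so the second stage is an assumption made for illustration: a randomly chosen cluster center is dropped, and the drop is kept only if every instance still lies within a hypothetical `max_radius_sq` of its nearest remaining center. All function and parameter names here are hypothetical.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def affinity_propagation(points, damping=0.5, iters=100):
    """Textbook Affinity Propagation; returns the indices of the exemplars."""
    n = len(points)
    # Similarity is the negative squared distance; the diagonal holds the preference.
    S = [[-sq_dist(p, q) for q in points] for p in points]
    off_diag = sorted(S[i][k] for i in range(n) for k in range(n) if i != k)
    pref = off_diag[len(off_diag) // 2]  # median similarity as the preference
    for k in range(n):
        S[k][k] = pref
    R = [[0.0] * n for _ in range(n)]  # responsibilities
    A = [[0.0] * n for _ in range(n)]  # availabilities
    for _ in range(iters):
        # Responsibility update: how well suited k is as exemplar for i.
        for i in range(n):
            vals = [A[i][k] + S[i][k] for k in range(n)]
            for k in range(n):
                best_other = max(v for j, v in enumerate(vals) if j != k)
                R[i][k] = damping * R[i][k] + (1 - damping) * (S[i][k] - best_other)
        # Availability update: accumulated evidence that k is a good exemplar.
        for k in range(n):
            pos = [max(0.0, R[i][k]) for i in range(n)]
            total = sum(pos) - pos[k]  # positive responsibilities from i' != k
            for i in range(n):
                if i == k:
                    new = total
                else:
                    new = min(0.0, R[k][k] + total - pos[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new
    exemplars = [k for k in range(n) if R[k][k] + A[k][k] > 0]
    if not exemplars:
        # Fallback (an assumption of this sketch): use the most central point.
        exemplars = [max(range(n), key=lambda k: sum(S[i][k] for i in range(n) if i != k))]
    return exemplars


def assign(points, exemplars):
    """Label every point with the index of its nearest exemplar."""
    return [min(exemplars, key=lambda e: sq_dist(p, points[e])) for p in points]


def randomized_local_merge(points, exemplars, max_radius_sq, rng, rounds=50):
    """Hypothetical merge step: randomly try to drop one exemplar at a time.

    A drop is accepted only if every point remains within max_radius_sq of
    its nearest remaining exemplar, so clusters shrink in number without
    becoming arbitrarily loose.
    """
    exemplars = list(exemplars)
    for _ in range(rounds):
        if len(exemplars) < 2:
            break
        candidate = rng.choice(exemplars)
        trial = [e for e in exemplars if e != candidate]
        labels = assign(points, trial)
        if all(sq_dist(p, points[l]) <= max_radius_sq for p, l in zip(points, labels)):
            exemplars = trial
    return exemplars


if __name__ == "__main__":
    # Toy data: two loose groups of 2-D points.
    data = [(0.0, 0.1), (0.3, 0.0), (0.1, 0.4), (0.5, 0.3),
            (10.0, 10.2), (10.3, 9.9), (9.8, 10.1), (10.4, 10.5)]
    centers = affinity_propagation(data)
    reduced = randomized_local_merge(data, centers, max_radius_sq=4.0, rng=random.Random(0))
    print(len(centers), "centers before,", len(reduced), "after")
    print(assign(data, reduced))
```

In a labeling workflow of the kind the abstract describes, the surviving cluster centers would be natural candidates to show to a subject matter expert, with each label then shared across the cluster. That usage is also an assumption, since the abstract does not describe how labels are propagated.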
