Affinity Propagation Initialisation Based Proximity Clustering For Labeling in Natural Language Based Big Data Systems

dc.contributor.author: Bandi, Adithya
dc.contributor.author: Joshi, Karuna
dc.contributor.author: Mulwad, Varish
dc.date.accessioned: 2020-07-21T18:39:16Z
dc.date.available: 2020-07-21T18:39:16Z
dc.date.issued: 2020-06-23
dc.description: 2020 IEEE 6th Intl Conference on Big Data Security on Cloud (BigDataSecurity), IEEE Intl Conference on High Performance and Smart Computing, (HPSC) and IEEE Intl Conference on Intelligent Data and Security (IDS), 25-27 May 2020 [en_US]
dc.description.abstract: A key challenge for large, natural language based text data is automatically extracting the knowledge embedded in it, in terms of entities and relations. State-of-the-art relation extraction systems require large amounts of labeled data, which is costly and difficult to obtain, especially in industrial settings, because of the time constraints of subject matter experts. Techniques like distant supervision require the availability of a related knowledge base, which is rarely possible. We have developed a novel model for automatically clustering textual Big Data, based on techniques inspired by Active Learning and Clustering, that can derive powerful insights and make the data ready for machine learning with minimal manual effort. Our approach differs from Active Learning in that we operate under weak supervision, where not all the instances provided for training are manually labeled. It also differs from prevailing clustering algorithms in that we adopt a new approach of proximity clustering based on affinity propagation. Because the labeling effort is extrapolated, our model makes it easier to adopt deep learning approaches with minimal manual effort. In this paper, we describe our algorithm in detail, along with the experimental results obtained for it. [en_US]
dc.description.uri: https://ieeexplore.ieee.org/document/9123041 [en_US]
dc.format.extent: 7 pages [en_US]
dc.genre: conference papers and proceedings preprints [en_US]
dc.identifier: doi:10.13016/m2ypdx-5une
dc.identifier.citation: A. Bandi, K. Joshi and V. Mulwad, "Affinity Propagation Initialisation Based Proximity Clustering For Labeling in Natural Language Based Big Data Systems," 2020 IEEE 6th Intl Conference on Big Data Security on Cloud (BigDataSecurity), IEEE Intl Conference on High Performance and Smart Computing, (HPSC) and IEEE Intl Conference on Intelligent Data and Security (IDS), Baltimore, MD, USA, 2020, pp. 1-7, doi: 10.1109/BigDataSecurity-HPSC-IDS49724.2020.00012. [en_US]
dc.identifier.uri: 10.1109/BigDataSecurity-HPSC-IDS49724.2020.00012
dc.identifier.uri: http://hdl.handle.net/11603/19209
dc.language.iso: en_US [en_US]
dc.publisher: IEEE [en_US]
dc.relation.isAvailableAt: The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof: UMBC Information Systems Department Collection
dc.relation.ispartof: UMBC Faculty Collection
dc.relation.ispartof: UMBC Student Collection
dc.rights: This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.
dc.rights: © 2020 IEEE
dc.subject: UMBC Ebiquity Research Group
dc.title: Affinity Propagation Initialisation Based Proximity Clustering For Labeling in Natural Language Based Big Data Systems [en_US]
dc.type: Text [en_US]

Files

Original bundle

Name: 982.pdf
Size: 437.6 KB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 2.56 KB
Format: Item-specific license agreed upon to submission