Understanding and representing the semantics of large structured documents
dc.contributor.author | Rahman, Muhammad Mahbubur | |
dc.date.accessioned | 2018-10-19T13:49:42Z | |
dc.date.available | 2018-10-19T13:49:42Z | |
dc.date.issued | 2018-10-08 | |
dc.description | Proceedings of the 4th Workshop on Semantic Deep Learning (SemDeep-4, ISWC) | en_US |
dc.description.abstract | Understanding large, structured documents like scholarly articles, requests for proposals or business reports is a complex and difficult task. It involves discovering a document's overall purpose and subject(s), understanding the function and meaning of its sections and subsections, and extracting low-level entities and facts about them. In this research, we present a deep learning-based document ontology to capture the general-purpose semantic structure and domain-specific semantic concepts from a large number of academic articles and business documents. The ontology is able to describe different functional parts of a document, which can be used to enhance semantic indexing for a better understanding by human beings and machines. We evaluate our models through extensive experiments on datasets of scholarly articles from arXiv and Request for Proposal documents. | en_US |
dc.description.sponsorship | The work was partially supported by National Science Foundation grant 1549697 and gifts from IBM and Northrop Grumman. | en_US |
dc.description.uri | https://ebiquity.umbc.edu/paper/html/id/830/Understanding-and-representing-the-semantics-of-large-structured-documents | en_US |
dc.format.extent | 12 pages | en_US |
dc.genre | conference paper pre-print | en_US |
dc.identifier | doi:10.13016/M2X05XH0T | |
dc.identifier.uri | http://hdl.handle.net/11603/11614 | |
dc.language.iso | en_US | en_US |
dc.relation.isAvailableAt | The University of Maryland, Baltimore County (UMBC) | |
dc.relation.ispartof | UMBC Computer Science and Electrical Engineering Department Collection | |
dc.relation.ispartof | UMBC Faculty Collection | |
dc.relation.ispartof | UMBC Student Collection | |
dc.rights | This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author. | |
dc.subject | Document Ontology | en_US |
dc.subject | Deep Learning | en_US |
dc.subject | Semantic Annotation | en_US |
dc.subject | natural language processing | en_US |
dc.subject | UMBC Ebiquity Research Group | en_US |
dc.title | Understanding and representing the semantics of large structured documents | en_US |
dc.type | Text | en_US |