Dynamic Topic Modeling to Infer the Influence of Research Citations on IPCC Assessment Reports

Date

2016-12-05

Citation of Original Publication

Jennifer Sleeman, Milton Halem, Tim Finin, Mark Cane, Dynamic Topic Modeling to Infer the Influence of Research Citations on IPCC Assessment Reports, December 5, 2016, https://ebiquity.umbc.edu/paper/html/id/768/Dynamic-Topic-Modeling-to-Infer-the-Influence-of-Research-Citations-on-IPCC-Assessment-Reports

Rights

This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.
© 2016 IEEE

Abstract

A common Big Data problem is the need to integrate large temporal data sets from various data sources into one comprehensive structure. The ability to correlate evolving facts across data sources is especially useful for application functions such as inference and influence identification. As a real-world application, we use climate change publications associated with the Intergovernmental Panel on Climate Change (IPCC), which publishes climate change assessment reports every five years and now has over 25 years of published content. These reports often reference thousands of research papers. We use dynamic topic modeling as a basis for combining the report and citation domains into one structure. This lets us correlate documents between the two domains to understand how the research has influenced the reports and how this influence has changed over time. In this use case, the report topic model used 410 documents and a vocabulary of 5,911 terms, while the citation topic model used a vocabulary of 25,154 terms and close to 200,000 research papers.
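
The idea of correlating documents across the report and citation domains can be illustrated with a small sketch. It assumes that a topic model has already assigned each document a topic distribution for each time period, and it compares distributions with cosine similarity. The abstract does not say which similarity measure the authors use, so cosine similarity is an assumption here, and the period labels, paper identifiers, and topic weights are hypothetical placeholders rather than data from the paper.

```python
import math


def cosine_similarity(p, q):
    """Cosine similarity between two equal-length topic-weight vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    if norm_p == 0 or norm_q == 0:
        return 0.0
    return dot / (norm_p * norm_q)


def rank_citations(report_topics, citation_topics):
    """Rank cited papers by similarity to one report document, best first."""
    scored = [
        (paper_id, cosine_similarity(report_topics, topics))
        for paper_id, topics in citation_topics.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical topic distributions (three topics) for one report document
# in two time periods. These values are illustrative only.
report_by_period = {
    "period_1": [0.7, 0.2, 0.1],
    "period_2": [0.2, 0.7, 0.1],
}

# Hypothetical topic distributions for the papers cited in each period.
citations_by_period = {
    "period_1": {"paper_a": [0.8, 0.1, 0.1], "paper_b": [0.1, 0.1, 0.8]},
    "period_2": {"paper_a": [0.8, 0.1, 0.1], "paper_c": [0.1, 0.8, 0.1]},
}

# For each period, find the cited paper closest to the report in topic space.
top_by_period = {
    period: rank_citations(report_by_period[period], citations_by_period[period])[0][0]
    for period in report_by_period
}

for period, paper_id in top_by_period.items():
    print(period, paper_id)
```

Comparing the top-ranked papers from one period to the next gives a rough picture of how the research most closely aligned with a report shifts over time. Fitting the dynamic topic model itself is not shown here.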