Towards Semantic Exploration of Tables in Scientific Documents

Date

2023-05-28

Citation of Original Publication

Mulwad, Varish, et al. "Towards Semantic Exploration of Tables in Scientific Documents." Proceedings of the ESWC 2023 Workshops and Tutorials; 1st International Workshop on SemTech4STLD, May 28, 2023, Hersonissos, Greece (28 May, 2023). https://ceur-ws.org/Vol-3443/ESWC_2023_SemTech4STLD_paper_2.pdf

Rights

This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.
Attribution 4.0 International (CC BY 4.0)

Abstract

Structured data artifacts such as tables are widely used in scientific literature to organize and concisely communicate important statistical information. Discovering relevant information in these tables remains a significant challenge owing to their structural heterogeneity, dense and often implicit semantics, and diffuse context. This paper describes how we leverage semantic technologies to enable technical experts to search and explore tabular data embedded within scientific documents. We present a system for the on-demand construction of knowledge graphs representing scientific tables (drawn from online scholarly articles hosted by PubMed Central), and for synthesizing tabular responses to semantic search requests against such graphs. We discuss key differentiators in our overall approach, including a two-stage semantic table interpretation that relies on an extensive structural and syntactic characterization of scientific tables, and a prototype knowledge discovery engine that uses automatically inferred semantics of scientific tables to serve search requests by potentially fusing information from multiple tables on the fly. We evaluate our system on a real-world dataset of approximately 120,000 tables extracted from over 62,000 COVID-19-related scientific articles.