Large Language Model Driven Analysis of General Coordinates Network (GCN) Circulars

Sharma, Vidushi; Agarwala, Ronit; Racusin, Judith L.; Singer, Leo P.; Barna, Tyler; Burns, Eric; Coughlin, Michael W.; Dutko, Dakota; Elliott, Courey; Gupta, Rahul; Mahabal, Ashish; Mukund, Nikhil

dc.contributor.author	Sharma, Vidushi
dc.contributor.author	Agarwala, Ronit
dc.contributor.author	Racusin, Judith L.
dc.contributor.author	Singer, Leo P.
dc.contributor.author	Barna, Tyler
dc.contributor.author	Burns, Eric
dc.contributor.author	Coughlin, Michael W.
dc.contributor.author	Dutko, Dakota
dc.contributor.author	Elliott, Courey
dc.contributor.author	Gupta, Rahul
dc.contributor.author	Mahabal, Ashish
dc.contributor.author	Mukund, Nikhil
dc.date.accessioned	2026-01-22T16:19:13Z
dc.date.issued	2025-11-18
dc.description.abstract	The General Coordinates Network (GCN) is NASA's time-domain and multi-messenger alert system. GCN distributes two data products, automated "Notices" and human-generated "Circulars", that report observations of high-energy and multi-messenger astronomical transients. The flexible and unstructured format of the GCN Circulars archive, which comprises more than 40,500 Circulars accumulated over three decades, makes it challenging to manually extract observational information, such as redshift or observed wavebands. In this work, we employ large language models (LLMs) to facilitate the automated parsing of transient reports. We develop a neural topic modeling pipeline with open-source tools for the automatic clustering and summarization of astrophysical topics in the Circulars database. Using neural topic modeling and contrastive fine-tuning, we classify Circulars based on their observation wavebands and messengers. Additionally, we separate gravitational wave (GW) event clusters and their electromagnetic (EM) counterparts from the Circulars database. Finally, using the open-source Mistral model, we implement a system to automatically extract gamma-ray burst (GRB) redshift information from the Circulars archive, without the need for any training. Evaluation against the manually curated Neil Gehrels Swift Observatory GRB table shows that our simple system, with the help of prompt-tuning, output parsing, and retrieval-augmented generation (RAG), can achieve an accuracy of 97.2% for redshift-containing Circulars. Our neural-search-enhanced RAG pipeline correctly retrieved 96.8% of redshift Circulars from the manually curated database. Our study demonstrates the potential of LLMs to automate and enhance astronomical text mining, and provides a foundation for future advances in transient alert analysis.
dc.description.sponsorship	We thank the anonymous referee for useful comments and suggestions on the manuscript. VS was sponsored by the National Aeronautics and Space Administration (NASA) through a cooperative agreement with the Center for Research and Exploration in Space Science and Technology II (CRESST II). The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the National Aeronautics and Space Administration (NASA) or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation herein. The GCN team acknowledges support from NASA’s Internal Scientist Funding Model (ISFM) program. This research has made use of data obtained through the General Coordinates Network (GCN) Service, provided by the NASA Goddard Space Flight Center (GSFC), in support of NASA’s High Energy Astrophysics Programs. The authors would also like to thank Daniela Huppenkothen for insightful discussions. RG was sponsored by the National Aeronautics and Space Administration (NASA) through a contract with ORAU. MWC acknowledges support from the National Science Foundation with grant numbers PHY-2409481, PHY-2308862 and PHY-2117997. NM acknowledges support from the National Science Foundation (NSF) under awards PHY-1764464 and PHY-2309200 to the LIGO Laboratory, under Cooperative Agreement PHY-2019786 (The NSF AI Institute for Artificial Intelligence and Fundamental Interactions, http://iaifi.org/), and from MathWorks, Inc.
dc.description.uri	http://arxiv.org/abs/2511.14858
dc.format.extent	61 pages
dc.genre	journal articles
dc.genre	postprints
dc.identifier	doi:10.13016/m215d4-lyg9
dc.identifier.uri	https://doi.org/10.48550/arXiv.2511.14858
dc.identifier.uri	http://hdl.handle.net/11603/41558
dc.language.iso	en
dc.relation.isAvailableAt	The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof	UMBC Center for Space Sciences and Technology (CSST) / Center for Research and Exploration in Space Sciences & Technology II (CRSST II)
dc.relation.ispartof	UMBC Faculty Collection
dc.rights	This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.
dc.subject	Astrophysics - Instrumentation and Methods for Astrophysics
dc.subject	Astrophysics - High Energy Astrophysical Phenomena
dc.title	Large Language Model Driven Analysis of General Coordinates Network (GCN) Circulars
dc.type	Text
dcterms.creator	https://orcid.org/0000-0002-4394-4138

Files

Original bundle

Name: 2511.14858v1.pdf
Size: 2.68 MB
Format: Adobe Portable Document Format
