Beyond Math: Stories as a Testbed for Memorization-Constrained Reasoning in LLMs

Jiang, Yuxuan; Ferraro, Francis

dc.contributor.author	Jiang, Yuxuan
dc.contributor.author	Ferraro, Francis
dc.date.accessioned	2026-04-06T18:35:32Z
dc.date.issued	2026-03
dc.description	The 19th Conference of the European Chapter of the Association for Computational Linguistics, Rabat, Morocco, 24-29 March, 2026
dc.description.abstract	Memorization has been shown to greatly inflate Large Language Models' (LLMs) performance on domains such as math and logic, where success should primarily rely on applying generalizable reasoning rules. In many real-world applications, however, memorization is not meant to be eliminated but selectively constrained—for example, in story understanding, where background knowledge must be integrated with narrative context. Drawing on the cognitive science distinction between “verbatim” (exact recall) and “gist” (semantic abstraction) memorization, we propose a two-tier framework for analyzing how LLMs reason under different degrees of memory access. The Inductive (prompt-guided) Setting softly steers models to reason through selective, context-relevant recall, while the Restrictive Setting imposes stronger constraints by limiting verbatim memory access. Evaluating GPT-4o, LLaMA3.3-70B, and DeepSeek V3 on six character-centric story understanding benchmarks, we find up to a 45.2% accuracy drop under the Restrictive Setting, revealing strong dependence on surface recall. By contrast, the Inductive Setting maintains performance, indicating that prompting can align LLMs toward memorization-constrained reasoning.
dc.description.sponsorship	Some experiments were conducted on the UMBC HPCF, supported by the National Science Foundation under Grant No. CNS-1920079. This material is also based on research that is in part supported by DARPA for the SciFy program under agreement number HR00112520301. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright notation thereon. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either express or implied, of DARPA or the U.S. Government.
dc.description.uri	https://aclanthology.org/2026.eacl-long.261/
dc.format.extent	18 pages
dc.genre	conference paper and proceedings
dc.identifier	doi:10.13016/m2z0fj-q9f6
dc.identifier.citation	Jiang, Yuxuan, and Francis Ferraro. “Beyond Math: Stories as a Testbed for Memorization-Constrained Reasoning in LLMs.” Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers), March 2026, 5590–607. https://doi.org/10.18653/v1/2026.eacl-long.261.
dc.identifier.uri	https://doi.org/10.18653/v1/2026.eacl-long.261
dc.identifier.uri	http://hdl.handle.net/11603/42399
dc.language.iso	en
dc.publisher	Association for Computational Linguistics
dc.relation.isAvailableAt	The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof	UMBC Computer Science and Electrical Engineering Department
dc.relation.ispartof	UMBC Student Collection
dc.relation.ispartof	UMBC Faculty Collection
dc.rights	Attribution 4.0 International
dc.rights.uri	https://creativecommons.org/licenses/by/4.0/deed.en
dc.subject	UMBC Interactive Robotics and Language Lab
dc.title	Beyond Math: Stories as a Testbed for Memorization-Constrained Reasoning in LLMs
dc.type	Text
dcterms.creator	https://orcid.org/0009-0007-8488-3056
dcterms.creator	https://orcid.org/0000-0003-2413-9368

Files

Original bundle

Name: 2026.eacllong.261.pdf
Size: 663.42 KB
Format: Adobe Portable Document Format

Collections

UMBC Computer Science and Electrical Engineering Department
UMBC Faculty Collection
UMBC Student Collection