SubstratumGraphEnv: Reinforcement Learning Environment (RLE) for Modeling System Attack Paths

dc.contributor.author: Adewunmi, Bahirah
dc.contributor.author: Raff, Edward
dc.contributor.author: Purushotham, Sanjay
dc.date.accessioned: 2026-03-26T14:26:47Z
dc.date.issued: 2026-03-02
dc.description: AI for Cyber Security Workshop at AAAI-26, Singapore, January 20 – January 27, 2026
dc.description.abstract: Automating network security analysis, particularly the identification of potential attack paths, presents significant challenges, due in part to the sequential, interconnected, and evolving nature of system events, which most artificial intelligence (AI) techniques struggle to model effectively. This paper proposes a Reinforcement Learning (RL) environment generation framework that simulates the sequence of processes executed on a Windows operating system, enabling dynamic modeling of malicious processes on a system. The methodology models operating system state and transitions using a graph representation derived from open-source System Monitor (Sysmon) logs. To address the variety in system event types, fields, and log formats, a mechanism was developed to capture and model parent-child processes from Sysmon logs. A Gymnasium environment (SubstratumGraphEnv) was constructed to establish the perceptible basis for an RL environment, and a customized PyTorch interface (SubstratumBridge) was built to translate Gymnasium graphs into Deep Reinforcement Learning (DRL) observations and discrete actions. Graph Convolutional Networks (GCNs) concretize the graph's local and global state, which feed the distinct policy and critic heads of an Advantage Actor-Critic (A2C) model. This work's central contribution lies in the design of a novel deep graphical RL environment that automates translation of sequential user and system events, furnishing crucial context for cybersecurity analysis. This work provides a foundation for future research into shaping training parameters and advanced reward shaping, while also offering insight into which system event attributes are critical to training autonomous RL agents.
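The abstract describes deriving a process graph from Sysmon parent-child process events. As a rough illustration of that idea (not the paper's actual implementation, which is not reproduced here), the sketch below builds a parent-to-children adjacency map from simplified Sysmon-style process-creation records; the field names mirror Sysmon Event ID 1 conventions (ProcessId, ParentProcessId, Image), but the flat record layout is an assumption for illustration.

```python
# Illustrative sketch only: building a parent-child process graph from
# Sysmon-style process-creation events. Field names follow Sysmon Event
# ID 1 (ProcessId, ParentProcessId, Image); the record layout is a
# simplified assumption, not the paper's log format.
from collections import defaultdict


def build_process_graph(events):
    """Return (labels, children): pid -> image, and parent pid -> child pids."""
    labels = {}                   # process id -> executable image path
    children = defaultdict(list)  # parent pid -> list of child pids
    for ev in events:
        pid, ppid = ev["ProcessId"], ev["ParentProcessId"]
        labels[pid] = ev["Image"]
        children[ppid].append(pid)
    return labels, dict(children)


events = [
    {"ProcessId": 4321, "ParentProcessId": 1000,
     "Image": r"C:\Windows\explorer.exe"},
    {"ProcessId": 5550, "ParentProcessId": 4321,
     "Image": r"C:\Windows\System32\cmd.exe"},
    {"ProcessId": 5562, "ParentProcessId": 5550,
     "Image": r"C:\Windows\System32\whoami.exe"},
]
labels, children = build_process_graph(events)
# explorer.exe spawns cmd.exe, which spawns whoami.exe: one candidate
# parent-child chain of the kind an RL agent could traverse as a graph.
```

A graph in this form (node labels plus adjacency) is the kind of state a Gymnasium environment could expose as an observation; translating it into tensors for a GCN is what the paper's SubstratumBridge interface is described as doing.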
dc.description.uri: http://arxiv.org/abs/2603.01340
dc.format.extent: 10 pages
dc.genre: conference papers and proceedings
dc.genre: preprints
dc.identifier: doi:10.13016/m2qybr-6fhc
dc.identifier.uri: https://doi.org/10.48550/arXiv.2603.01340
dc.identifier.uri: http://hdl.handle.net/11603/42272
dc.language.iso: en
dc.relation.isAvailableAt: The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof: UMBC Information Systems Department
dc.relation.ispartof: UMBC Faculty Collection
dc.relation.ispartof: UMBC Student Collection
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 International
dc.rights.uri: https://creativecommons.org/licenses/by-nc-nd/4.0/deed.en
dc.subject: Computer Science - Cryptography and Security
dc.subject: Computer Science - Artificial Intelligence
dc.subject: Computer Science - Machine Learning
dc.title: SubstratumGraphEnv: Reinforcement Learning Environment (RLE) for Modeling System Attack Paths
dc.type: Text
dcterms.creator: https://orcid.org/0009-0008-9248-4167

Files

Original bundle
Name: 2603.01340v1.pdf
Size: 863.39 KB
Format: Adobe Portable Document Format