QPRL: Learning Optimal Policies with Quasi-Potential Functions for Asymmetric Traversal

dc.contributor.author: Hossain, Jumman
dc.contributor.author: Roy, Nirmalya
dc.date.accessioned: 2025-07-30T19:22:17Z
dc.date.issued: 2025-07-15
dc.description: 42nd International Conference on Machine Learning (ICML), Vancouver, Canada, July 13-19, 2025
dc.description.abstract: Reinforcement learning (RL) in real-world tasks such as robotic navigation often encounters environments with asymmetric traversal costs, where actions like climbing uphill versus moving downhill incur distinctly different penalties, or where transitions may become irreversible. While recent quasimetric RL methods relax symmetry assumptions, they typically do not explicitly account for path-dependent costs or provide rigorous safety guarantees. We introduce Quasi-Potential Reinforcement Learning (QPRL), a novel framework that explicitly decomposes asymmetric traversal costs into a path-independent potential function (Φ) and a path-dependent residual (Ψ). This decomposition enables efficient learning and stable policy optimization via a Lyapunov-based safety mechanism. Theoretically, we prove that QPRL achieves convergence with an improved sample complexity of Õ(√T), surpassing prior quasimetric RL bounds of Õ(T). Empirically, our experiments demonstrate that QPRL attains state-of-the-art performance across various navigation and control tasks, reducing irreversible constraint violations by approximately 4× compared to baselines.
dc.description.sponsorship: This work has been partially supported by ONR Grant #N00014-23-1-2119, U.S. Army Grant #W911NF2120076, U.S. Army Grant #W911NF2410367, NSF REU Site Grant #2050999, NSF CNS EAGER Grant #2233879, and NSF CAREER Award #1750936.
dc.description.uri: https://openreview.net/pdf?id=eU8vAuMlpH
dc.format.extent: 18 pages
dc.genre: conference papers and proceedings
dc.identifier: doi:10.13016/m2h9jp-lro6
dc.identifier.citation: Hossain, Jumman, and Nirmalya Roy. “QPRL: Learning Optimal Policies with Quasi-Potential Functions for Asymmetric Traversal,” May 1, 2025. https://openreview.net/pdf?id=eU8vAuMlpH.
dc.identifier.uri: http://hdl.handle.net/11603/39522
dc.language.iso: en_US
dc.publisher: ICML 2025
dc.relation.isAvailableAt: The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof: UMBC Information Systems Department
dc.relation.ispartof: UMBC Faculty Collection
dc.relation.ispartof: UMBC Student Collection
dc.rights: Attribution 4.0 International
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.subject: UMBC Mobile, Pervasive and Sensor Computing Lab (MPSC Lab)
dc.title: QPRL: Learning Optimal Policies with Quasi-Potential Functions for Asymmetric Traversal
dc.type: Text
dcterms.creator: https://orcid.org/0009-0009-4461-7604
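
The abstract's central construction, decomposing an asymmetric traversal cost c(s, s') into a path-independent potential difference Φ(s') - Φ(s) plus a path-dependent residual Ψ(s, s'), can be illustrated in a toy tabular setting. The Python sketch below is a minimal illustration, not the paper's implementation: the 1-D "hill", the hand-coded edge costs, and the least-squares fit for Φ are hypothetical stand-ins for quantities QPRL learns during training.

```python
import numpy as np

# Toy 1-D "hill" with 5 states; moving right climbs, moving left descends.
n = 5
heights = np.array([0.0, 1.0, 2.5, 3.0, 1.5])

def edge_cost(i, j):
    """Asymmetric traversal cost: base effort plus an extra uphill penalty."""
    climb = heights[j] - heights[i]
    return 1.0 + 2.0 * max(climb, 0.0)

# Directed edges between adjacent states, in both directions.
edges = [(i, i + 1) for i in range(n - 1)] + [(i + 1, i) for i in range(n - 1)]
costs = {(i, j): edge_cost(i, j) for (i, j) in edges}

# Fit a potential Phi so that Phi(j) - Phi(i) explains as much of each edge
# cost as possible (least squares over all directed edges). The leftover
# Psi(i, j) = c(i, j) - (Phi(j) - Phi(i)) is the path-dependent residual.
A = np.zeros((len(edges), n))
b = np.zeros(len(edges))
for k, (i, j) in enumerate(edges):
    A[k, i], A[k, j] = -1.0, 1.0
    b[k] = costs[(i, j)]
phi, *_ = np.linalg.lstsq(A, b, rcond=None)
phi -= phi[0]  # potentials are defined only up to a constant; anchor state 0

residual = {(i, j): costs[(i, j)] - (phi[j] - phi[i]) for (i, j) in edges}

for (i, j) in sorted(edges):
    print(f"c({i}->{j}) = {costs[(i, j)]:.2f}   "
          f"dPhi = {phi[j] - phi[i]:+.2f}   Psi = {residual[(i, j)]:.2f}")
```

Running the sketch prints, for each directed edge, the raw cost, the potential difference, and the residual. Because potential differences telescope along any path, Φ captures the path-independent part of the cost, while the nonnegative Ψ absorbs the effort that must be paid in both directions.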

Files

Original bundle

Name: 4981_QPRL_Learning_Optimal_Pol.pdf
Size: 1.85 MB
Format: Adobe Portable Document Format