QPRL: Learning Optimal Policies with Quasi-Potential Functions for Asymmetric Traversal
Citation of Original Publication
Hossain, Jumman, and Nirmalya Roy. “QPRL: Learning Optimal Policies with Quasi-Potential Functions for Asymmetric Traversal,” May 1, 2025. https://openreview.net/pdf?id=eU8vAuMlpH.
Rights
Attribution 4.0 International
Abstract
Reinforcement learning (RL) in real-world tasks such as robotic navigation often encounters environments with asymmetric traversal costs, where actions like climbing uphill versus moving downhill incur distinctly different penalties, or transitions may become irreversible. While recent quasimetric RL methods relax symmetry assumptions, they typically do not explicitly account for path-dependent costs or provide rigorous safety guarantees. We introduce Quasi-Potential Reinforcement Learning (QPRL), a novel framework that explicitly decomposes asymmetric traversal costs into a path-independent potential function (Φ) and a path-dependent residual (Ψ). This decomposition enables efficient learning and stable policy optimization via a Lyapunov-based safety mechanism. Theoretically, we prove that QPRL converges with an improved sample complexity of Õ(√T), surpassing the prior quasimetric RL bound of Õ(T). Empirically, our experiments demonstrate that QPRL attains state-of-the-art performance across various navigation and control tasks, reducing irreversible constraint violations by approximately 4× compared to baselines.
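
To make the decomposition concrete, the sketch below (in PyTorch) shows one way the cost model described in the abstract could be parameterized: the traversal cost c(s, s') is approximated as Φ(s') − Φ(s) + Ψ(s, s'), with a state-only network for the path-independent potential Φ and a state-pair network for the path-dependent residual Ψ. The network sizes, the non-negativity constraint on Ψ, and the regression objective are illustrative assumptions, not details taken from the paper.

# Minimal sketch (not the authors' implementation) of the quasi-potential
# decomposition: an asymmetric traversal cost c(s, s') is modeled as a
# path-independent potential difference plus a path-dependent residual,
# c(s, s') ≈ Φ(s') − Φ(s) + Ψ(s, s').
import torch
import torch.nn as nn

class QuasiPotentialCost(nn.Module):
    def __init__(self, state_dim: int, hidden: int = 64):
        super().__init__()
        # Path-independent potential Φ(s): a scalar function of a single state.
        self.phi = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1)
        )
        # Path-dependent residual Ψ(s, s'): depends on the ordered state pair,
        # so it can capture asymmetry (e.g., uphill vs. downhill).
        self.psi = nn.Sequential(
            nn.Linear(2 * state_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1),
            nn.Softplus(),  # assumption: keep the residual non-negative
        )

    def forward(self, s: torch.Tensor, s_next: torch.Tensor) -> torch.Tensor:
        potential_term = self.phi(s_next) - self.phi(s)          # path-independent part
        residual_term = self.psi(torch.cat([s, s_next], dim=-1)) # path-dependent part
        return potential_term + residual_term

# Fit the decomposition to observed transition costs by simple regression
# (placeholder data; an actual agent would use logged environment transitions).
model = QuasiPotentialCost(state_dim=4)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
s, s_next = torch.randn(128, 4), torch.randn(128, 4)
observed_cost = torch.rand(128, 1)
loss = nn.functional.mse_loss(model(s, s_next), observed_cost)
opt.zero_grad()
loss.backward()
opt.step()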
