DRP: Distilled Reasoning Pruning with Skill-aware Step Decomposition for Efficient Large Reasoning Models

dc.contributor.author: Jiang, Yuxuan
dc.contributor.author: Li, Dawei
dc.contributor.author: Ferraro, Francis
dc.date.accessioned: 2025-06-17T14:45:32Z
dc.date.available: 2025-06-17T14:45:32Z
dc.date.issued: 2025-05-20
dc.description.abstract: While Large Reasoning Models (LRMs) have demonstrated success in complex reasoning tasks through long chain-of-thought (CoT) reasoning, their inference often involves excessively verbose reasoning traces, resulting in substantial inefficiency. To address this, we propose Distilled Reasoning Pruning (DRP), a hybrid framework that combines inference-time pruning with tuning-based distillation, two widely used strategies for efficient reasoning. DRP uses a teacher model to perform skill-aware step decomposition and content pruning, and then distills the pruned reasoning paths into a student model, enabling it to reason both efficiently and accurately. Across several challenging mathematical reasoning datasets, we find that models trained with DRP achieve substantial improvements in token efficiency without sacrificing accuracy. Specifically, DRP reduces average token usage on GSM8K from 917 to 328 while improving accuracy from 91.7% to 94.1%, and achieves a 43% token reduction on AIME with no performance drop. Further analysis shows that aligning the reasoning structure of training CoTs with the student's reasoning capacity is critical for effective knowledge transfer and performance gains.
dc.description.uri: http://arxiv.org/abs/2505.13975
dc.format.extent: 14 pages
dc.genre: journal articles
dc.genre: preprints
dc.identifier: doi:10.13016/m2g343-km2p
dc.identifier.uri: https://doi.org/10.48550/arXiv.2505.13975
dc.identifier.uri: http://hdl.handle.net/11603/38907
dc.language.iso: en_US
dc.relation.isAvailableAt: The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof: UMBC Faculty Collection
dc.relation.ispartof: UMBC Computer Science and Electrical Engineering Department
dc.relation.ispartof: UMBC Student Collection
dc.rights: Attribution 4.0 International
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.subject: Computer Science - Computation and Language
dc.subject: UMBC Interactive Robotics and Language Lab
dc.title: DRP: Distilled Reasoning Pruning with Skill-aware Step Decomposition for Efficient Large Reasoning Models
dc.type: Text
dcterms.creator: https://orcid.org/0009-0007-8488-3056
dcterms.creator: https://orcid.org/0000-0003-2413-9368
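
For orientation, the abstract's two-stage pipeline (teacher-side skill-aware step decomposition and pruning, followed by distillation into a student) can be pictured with a short sketch. This is a minimal illustration assuming a generic text-generation interface; the function names, prompt wording, and the generate method are hypothetical and are not taken from the authors' released code.

    from typing import Protocol

    class TextGenerator(Protocol):
        # Hypothetical interface: any model wrapper exposing generate(prompt) -> str.
        def generate(self, prompt: str) -> str: ...

    def decompose_into_skill_steps(cot: str, teacher: TextGenerator) -> list[str]:
        # Teacher splits a long chain-of-thought into discrete steps,
        # each labeled with the reasoning skill it exercises.
        prompt = ("Split this reasoning trace into numbered steps and "
                  "label each with the skill it uses:\n" + cot)
        return teacher.generate(prompt).splitlines()

    def prune_steps(steps: list[str], teacher: TextGenerator) -> str:
        # Teacher drops redundant or verbose steps, keeping only those
        # needed to reach the final answer.
        prompt = ("Keep only the steps essential to the final answer, "
                  "rewritten concisely:\n" + "\n".join(steps))
        return teacher.generate(prompt)

    def build_distillation_set(problems: list[str], long_cots: list[str],
                               teacher: TextGenerator) -> list[dict]:
        # Pair each problem with its pruned reasoning path; these pairs
        # are what the student model would be fine-tuned on.
        dataset = []
        for problem, cot in zip(problems, long_cots):
            steps = decompose_into_skill_steps(cot, teacher)
            pruned = prune_steps(steps, teacher)
            dataset.append({"prompt": problem, "completion": pruned})
        return dataset

The resulting (problem, pruned-CoT) pairs would then feed an ordinary supervised fine-tuning loop for the student model, which is the stage the abstract credits for the reported token savings (e.g., 917 to 328 average tokens on GSM8K).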

Files

Original bundle

Name: 2505.13975v2.pdf
Size: 2.17 MB
Format: Adobe Portable Document Format