Can LLMs Obfuscate Code? A Systematic Analysis of Large Language Models into Assembly Code Obfuscation

dc.contributor.author: Mohseni, Seyedreza
dc.contributor.author: Mohammadi, Ali
dc.contributor.author: Tilwani, Deepa
dc.contributor.author: Saxena, Yash
dc.contributor.author: Ndawula, Gerald
dc.contributor.author: Vema, Sriram
dc.contributor.author: Raff, Edward
dc.contributor.author: Gaur, Manas
dc.date.accessioned: 2025-01-31T18:24:11Z
dc.date.available: 2025-01-31T18:24:11Z
dc.date.issued: 2024-12-24
dc.description.abstract: Malware authors often employ code obfuscation to make their malware harder to detect. Existing tools for generating obfuscated code often require access to the original source code (e.g., C++ or Java), and adding new obfuscations is a non-trivial, labor-intensive process. In this study, we ask the following question: can Large Language Models (LLMs) generate new obfuscated assembly code? If so, this poses a risk to anti-virus engines and potentially increases the flexibility of attackers to create new obfuscation patterns. We answer this in the affirmative by developing the MetamorphASM benchmark, comprising the MetamorphASM Dataset (MAD) along with three code obfuscation techniques: dead code insertion, register substitution, and control flow change. MetamorphASM systematically evaluates the ability of LLMs to generate and analyze obfuscated code using MAD, which contains 328,200 obfuscated assembly code samples. We release this dataset and analyze the success rate of various LLMs (e.g., GPT-3.5/4, GPT-4o-mini, Starcoder, CodeGemma, CodeLlama, CodeT5, and LLaMA 3.1) in generating obfuscated assembly code. The evaluation was performed using established information-theoretic metrics and manual human review to ensure correctness and to provide a foundation for researchers to study and develop remediations for this risk. The source code is available on GitHub: https://github.com/mohammadi-ali/MetamorphASM.
dc.description.sponsorship: We acknowledge the support from UMBC Cybersecurity Leadership – Exploratory Grant Program. Any opinions, conclusions, or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of UMBC or Booz Allen Hamilton.
dc.description.uri: http://arxiv.org/abs/2412.16135
dc.format.extent: 9 pages
dc.genre: journal articles
dc.genre: postprints
dc.identifier: doi:10.13016/m2apdj-1lgs
dc.identifier.uri: https://doi.org/10.48550/arXiv.2412.16135
dc.identifier.uri: http://hdl.handle.net/11603/37563
dc.language.iso: en_US
dc.publisher: AAAI
dc.relation.isAvailableAt: The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof: UMBC Computer Science and Electrical Engineering Department
dc.relation.ispartof: UMBC Student Collection
dc.relation.ispartof: UMBC Faculty Collection
dc.rights: Attribution 4.0 International
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.subject: Computer Science - Artificial Intelligence
dc.subject: Computer Science - Cryptography and Security
dc.subject: Computer Science - Computation and Language
dc.subject: UMBC Ebiquity Research Group
dc.title: Can LLMs Obfuscate Code? A Systematic Analysis of Large Language Models into Assembly Code Obfuscation
dc.type: Text
dcterms.creator: https://orcid.org/0009-0006-6081-9896
dcterms.creator: https://orcid.org/0000-0002-5411-2230
dcterms.creator: https://orcid.org/0009-0001-2548-4705
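
For concreteness, below is a minimal, hypothetical Python sketch of two of the three obfuscation techniques named in the abstract (dead code insertion and register substitution), together with a simple byte-entropy measure of the kind an information-theoretic evaluation might use. This is an illustration under assumptions, not the authors' MetamorphASM pipeline: the toy x86-64 listing, the DEAD_CODE pool, REGISTER_MAP, and byte_entropy are all invented for this example. Control flow change, the third technique, is omitted because it requires label and jump rewriting that is hard to show safely in a few lines.

import math
import random
import re
from collections import Counter

# Dead code insertion: interleave semantics-preserving junk instructions.
DEAD_CODE = [
    "nop",                     # canonical no-op
    "xchg rax, rax",           # encoded as NOP on x86-64
    "push rbx\n    pop rbx",   # push/pop pair restores state, leaves flags intact
]

# Register substitution: consistently rename a register. This preserves
# semantics only if the replacement register is otherwise unused in the fragment.
REGISTER_MAP = {"rbx": "rcx"}

def substitute_registers(lines, mapping):
    pattern = re.compile(r"\b(?:" + "|".join(re.escape(r) for r in mapping) + r")\b")
    return [pattern.sub(lambda m: mapping[m.group(0)], ln) for ln in lines]

def insert_dead_code(lines, probability=0.5, rng=random):
    out = []
    for ln in lines:
        out.append(ln)
        if rng.random() < probability:
            out.append("    " + rng.choice(DEAD_CODE))
    return out

def byte_entropy(text):
    # Shannon entropy over bytes: one simple information-theoretic signal
    # for comparing an original listing with its obfuscated version.
    data = text.encode()
    n = len(data)
    return -sum(c / n * math.log2(c / n) for c in Counter(data).values())

original = [
    "    mov rbx, 5",
    "    add rbx, 10",
    "    mov rax, rbx",
]

obfuscated = insert_dead_code(substitute_registers(original, REGISTER_MAP))
orig_text = "\n".join(original)
obf_text = "\n".join(obfuscated)
print(obf_text)
print(f"entropy: {byte_entropy(orig_text):.3f} -> {byte_entropy(obf_text):.3f}")

Because both transformations preserve the program's observable behavior while changing its byte-level statistics, a signature-based detector tuned to the original listing can miss the obfuscated one, which is the risk the benchmark is designed to measure.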

Files

Original bundle

Name: 2412.16135v2.pdf
Size: 682.39 KB
Format: Adobe Portable Document Format