Is Function Similarity Over-Engineered? Building a Benchmark

dc.contributor.author: Saul, Rebecca
dc.contributor.author: Liu, Chang
dc.contributor.author: Fleischmann, Noah
dc.contributor.author: Zak, Richard
dc.contributor.author: Micinski, Kristopher
dc.contributor.author: Raff, Edward
dc.contributor.author: Holt, James
dc.date.accessioned: 2024-12-11T17:02:34Z
dc.date.available: 2024-12-11T17:02:34Z
dc.date.issued: 2024-10-30
dc.description: 38th Conference on Neural Information Processing Systems (NeurIPS 2024), Track on Datasets and Benchmarks, Dec 10-15, 2024
dc.description.abstract: Binary analysis is a core component of many critical security tasks, including reverse engineering, malware analysis, and vulnerability detection. Manual analysis is often time-consuming, but identifying commonly-used or previously-seen functions can reduce the time it takes to understand a new file. However, given the complexity of assembly, and the NP-hard nature of determining function equivalence, this task is extremely difficult. Common approaches often use sophisticated disassembly and decompilation tools, graph analysis, and other expensive pre-processing steps to perform function similarity searches over some corpus. In this work, we identify a number of discrepancies between the current research environment and the underlying application need. To remedy this, we build a new benchmark, REFuSE-Bench, for binary function similarity detection consisting of high-quality datasets and tests that better reflect real-world use cases. In doing so, we address issues like data duplication and accurate labeling, experiment with real malware, and perform the first serious evaluation of ML binary function similarity models on Windows data. Our benchmark reveals that a new, simple baseline, one which looks only at the raw bytes of a function and requires no disassembly or other pre-processing, is able to achieve state-of-the-art performance in multiple settings. Our findings challenge conventional assumptions that complex models with highly-engineered features are being used to their full potential, and demonstrate that simpler approaches can provide significant value.
dc.description.uri: http://arxiv.org/abs/2410.22677
dc.format.extent: 20 pages
dc.genre: conference papers and proceedings
dc.genre: postprints
dc.identifier: doi:10.13016/m2dxma-enco
dc.identifier.uri: https://doi.org/10.48550/arXiv.2410.22677
dc.identifier.uri: http://hdl.handle.net/11603/37083
dc.language.iso: en_US
dc.relation.isAvailableAt: The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof: UMBC Faculty Collection
dc.relation.ispartof: UMBC Computer Science and Electrical Engineering Department
dc.relation.ispartof: UMBC Data Science
dc.rights: This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.
dc.subject: Computer Science - Machine Learning
dc.subject: Computer Science - Cryptography and Security
dc.title: Is Function Similarity Over-Engineered? Building a Benchmark
dc.type: Text
dcterms.creator: https://orcid.org/0000-0002-9900-1972
dcterms.creator: https://orcid.org/0000-0003-4272-2565
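
To make the abstract's central claim concrete: a raw-bytes baseline scores function similarity directly over the bytes of each function body, with no disassembly or decompilation. Below is a minimal sketch of one such baseline, assuming a byte n-gram Jaccard score; this is an illustrative stand-in, not the paper's actual model, and the names byte_ngrams and raw_byte_similarity, the n-gram size, and the example byte strings are all assumptions made for the sketch.

from typing import Set


def byte_ngrams(func_bytes: bytes, n: int = 4) -> Set[bytes]:
    """Collect the set of overlapping n-byte windows from a function body."""
    return {func_bytes[i:i + n] for i in range(len(func_bytes) - n + 1)}


def raw_byte_similarity(a: bytes, b: bytes, n: int = 4) -> float:
    """Jaccard similarity over byte n-grams; 1.0 means identical n-gram sets.

    Operates on raw bytes only -- no disassembly or other pre-processing,
    mirroring the kind of simple baseline the abstract describes.
    """
    grams_a, grams_b = byte_ngrams(a, n), byte_ngrams(b, n)
    if not grams_a and not grams_b:
        return 1.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)


# Hypothetical example: two x86-64 function bodies differing by one byte.
f1 = bytes.fromhex("554889e54883ec10897dfc8b45fc5dc3")
f2 = bytes.fromhex("554889e54883ec20897dfc8b45fc5dc3")
print(f"similarity = {raw_byte_similarity(f1, f2):.3f}")

Ranking a query function's raw bytes against a corpus by this score is then a straightforward nearest-neighbor search; the appeal of such baselines is precisely that they skip the expensive disassembly and graph-analysis pipeline the abstract critiques.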

Files

Original bundle

Name: 2410.22677v1.pdf
Size: 491.98 KB
Format: Adobe Portable Document Format