Automatic Yara Rule Generation Using Biclustering

dc.contributor.author: Raff, Edward
dc.contributor.author: Zak, Richard
dc.contributor.author: Munoz, Gary Lopez
dc.contributor.author: Fleming, William
dc.contributor.author: Anderson, Hyrum S.
dc.contributor.author: Filar, Bobby
dc.contributor.author: Nicholas, Charles
dc.contributor.author: Holt, James
dc.date.accessioned: 2020-11-02T19:57:59Z
dc.date.available: 2020-11-02T19:57:59Z
dc.date.issued: 2020-09-06
dc.description: 13th ACM Workshop on Artificial Intelligence and Security (AISec)
dc.description.abstract: Yara rules are a ubiquitous tool among cybersecurity practitioners and analysts. Developing high-quality Yara rules to detect a malware family of interest can be labor- and time-intensive, even for expert users. Few tools exist and relatively little work has been done on how to automate the generation of Yara rules for specific families. In this paper, we leverage large n-grams (n≥8) combined with a new biclustering algorithm to construct simple Yara rules more effectively than currently available software. Our method, AutoYara, is fast, allowing for deployment on low-resource equipment for teams that deploy to remote networks. Our results demonstrate that AutoYara can help reduce analyst workload by producing rules with useful true-positive rates while maintaining low false-positive rates, sometimes matching or even outperforming human analysts. In addition, real-world testing by malware analysts indicates AutoYara could reduce analyst time spent constructing Yara rules by 44-86%, allowing them to spend their time on the more advanced malware that current tools can't handle. [en_US]
dc.description.uri: https://arxiv.org/abs/2009.03779 [en_US]
dc.format.extent: 12 pages [en_US]
dc.genre: conference papers and proceedings preprints [en_US]
dc.identifier: doi:10.13016/m2qtkw-vl04
dc.identifier.citation: Edward Raff, Richard Zak, Gary Lopez Munoz, William Fleming, Hyrum S. Anderson, Bobby Filar, Charles Nicholas and James Holt, Automatic Yara Rule Generation Using Biclustering, https://arxiv.org/abs/2009.03779 [en_US]
dc.identifier.uri: http://hdl.handle.net/11603/19996
dc.language.iso: en_US [en_US]
dc.publisher: ACM
dc.relation.isAvailableAt: The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof: UMBC Computer Science and Electrical Engineering Department Collection
dc.relation.ispartof: UMBC Faculty Collection
dc.rights: This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.
dc.rights: Public Domain Mark 1.0 [*]
dc.rights: This work was written as part of one of the author's official duties as an Employee of the United States Government and is therefore a work of the United States Government. In accordance with 17 U.S.C. 105, no copyright protection is available for such works under U.S. Law.
dc.rights.uri: http://creativecommons.org/publicdomain/mark/1.0/ [*]
dc.title: Automatic Yara Rule Generation Using Biclustering [en_US]
dc.type: Text [en_US]

Files

Original bundle
Name: 2009.03779.pdf
Size: 1.47 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 2.56 KB
Format: Item-specific license agreed upon to submission