A Quantum Algorithm To Locate Unknown Hashes For Known N-Grams Within A Large Malware Corpus

Date

2020-05-07

Citation of Original Publication

Nicholas R. Allgood and Charles K. Nicholas, A Quantum Algorithm To Locate Unknown Hashes For Known N-Grams Within A Large Malware Corpus, https://arxiv.org/abs/2005.02911

Rights

This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.

Abstract

Quantum computing has evolved quickly in recent years and is showing significant benefits in a variety of fields. Malware analysis is one field that could take advantage of it. Combining KiloGram, software that locates the most frequent hashes and n-grams across benign and malicious software, with a quantum search algorithm could be beneficial: loading the table of hashes and n-grams into a quantum computer would speed up the process of mapping n-grams to their hashes. The first phase is to use KiloGram to find the top-k hashes and n-grams for a large malware corpus. The resulting hash table is then loaded into a quantum machine. A quantum search algorithm is then used to search among every permutation of the entangled key and value pairs to find the desired hash value. This avoids re-computing hashes for a set of n-grams, which can take on average O(MN) time, whereas the quantum algorithm could take O(√N) table lookups to find the desired hash values.
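As a rough illustration of the O(√N) lookup count, the sketch below classically simulates a Grover-style amplitude amplification over a small table of (n-gram, hash) pairs. It is not the authors' implementation and does not use KiloGram or a quantum device: the function name `grover_lookup`, the toy table, and the hash values are all hypothetical, and the oracle is emulated by comparing keys directly.

```python
import math


def grover_lookup(table, ngram):
    """Classically simulate Grover's search for `ngram` in `table`.

    `table` is a list of (ngram, hash) pairs (a hypothetical stand-in for
    a KiloGram top-k table). Returns (hash, iterations, success_probability).
    """
    n = len(table)
    # Emulated oracle: the set of indices whose key matches the n-gram.
    marked = {i for i, (key, _) in enumerate(table) if key == ngram}
    if not marked:
        raise KeyError(ngram)

    # Start in the uniform superposition over all table indices.
    amps = [1.0 / math.sqrt(n)] * n

    # About (pi / 4) * sqrt(N / M) iterations, where M is the number of matches.
    iterations = max(1, math.floor(math.pi / 4 * math.sqrt(n / len(marked))))
    for _ in range(iterations):
        # Oracle step: flip the sign of the marked amplitudes.
        amps = [-a if i in marked else a for i, a in enumerate(amps)]
        # Diffusion step: reflect every amplitude about the mean.
        mean = sum(amps) / n
        amps = [2.0 * mean - a for a in amps]

    # "Measure" by taking the most probable index.
    best = max(range(n), key=lambda i: amps[i] ** 2)
    return table[best][1], iterations, amps[best] ** 2


# Toy table of 16 entries; keys and hash values are made up for illustration.
table = [(f"ngram{i}".encode(), (i * 7919) % 1000) for i in range(16)]
hash_value, iterations, probability = grover_lookup(table, b"ngram11")
print(hash_value, iterations, round(probability, 3))
```

With 16 entries and one match, the loop runs only a few times rather than scanning the whole table. A real quantum implementation would encode the table in qubits and build the oracle as a circuit, which this simulation does not attempt.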