A QUANTUM ALGORITHM TO LOCATE UNKNOWN HASHES FOR KNOWN N-GRAMS WITHIN A LARGE MALWARE CORPUS

Author/Creator

Author/Creator ORCID

Date

2020-01-01

Department

Computer Science and Electrical Engineering

Program

Computer Science

Citation of Original Publication

Rights

Access limited to the UMBC community. Item may possibly be obtained via Interlibrary Loan through a local library, pending author/copyright holder's permission.
This item may be protected under Title 17 of the U.S. Copyright Law. It is made available by UMBC for non-commercial research and education. For permission to publish or reproduce, please see http://aok.lib.umbc.edu/specoll/repro.php or contact Special Collections at speccoll(at)umbc.edu

Abstract

Quantum computing has evolved quickly in recent years and is showing significant benefits in many fields. Malware analysis is one field that could take advantage of it. KiloGram\cite{Kilograms_2019} is software that locates the most frequent hashes and $n$-grams across benign and malicious software. Combining KiloGram with a quantum search algorithm could offer an improvement: the table of hashes and $n$-grams is loaded into a quantum computer, which then looks up the unknown hash for a known $n$-gram. In the first phase, KiloGram\cite{Kilograms_2019} is run classically to find the top-$k$ hashes and $n$-grams for a large malware corpus. The resulting table is then loaded into a quantum machine, and a quantum search algorithm searches among every permutation of the entangled key and value pairs to find the unknown hash. This avoids recomputing hashes for a set of $n$-grams, which can take $O(MN)$ time on average, whereas the quantum algorithm could take $O(\sqrt{N})$ table lookups to find the unknown hash.
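
The lookup described in the abstract can be illustrated with a small classical simulation of Grover-style amplitude amplification over the indices of a table of ($n$-gram, hash) pairs. This is only a sketch under stated assumptions, not the thesis's implementation: the abstract does not name the search algorithm or the table encoding, so Grover search is an assumption here, and the table contents, the hash values, and the name `grover_lookup` are hypothetical. The simulation tracks one real amplitude per table row, so it shows the $O(\sqrt{N})$ iteration count but gains no speedup on classical hardware.

```python
import math

# Hypothetical top-k table of (n-gram, hash) pairs; all values are made up.
TABLE = [
    (b"\x4d\x5a\x90\x00", 0x1A2B),
    (b"\x50\x45\x00\x00", 0x0C11),
    (b"\x55\x8b\xec\x83", 0x3F07),
    (b"\xff\x15\x00\x10", 0x2290),
    (b"\xe8\x00\x00\x00", 0x0B5E),
    (b"\x6a\x00\x68\x00", 0x17D4),
    (b"\x8b\x45\x08\x50", 0x2C63),
    (b"\xc3\xcc\xcc\xcc", 0x09A8),
]


def grover_lookup(table, known_ngram):
    """Simulate Grover search for the row whose key equals known_ngram.

    Returns (hash, success probability, number of Grover iterations).
    """
    n = len(table)
    # Indices the oracle would mark: rows whose n-gram matches the query.
    marked = [i for i, (ngram, _) in enumerate(table) if ngram == known_ngram]
    if not marked:
        raise KeyError(known_ngram)

    # Start in the uniform superposition over all table indices.
    amps = [1.0 / math.sqrt(n)] * n

    # Standard iteration count: about (pi / 4) * sqrt(N / matches).
    iterations = max(1, math.floor(math.pi / 4 * math.sqrt(n / len(marked))))
    for _ in range(iterations):
        # Oracle step: flip the sign of the marked amplitudes.
        for i in marked:
            amps[i] = -amps[i]
        # Diffusion step: reflect every amplitude about the mean.
        mean = sum(amps) / n
        amps = [2 * mean - a for a in amps]

    # "Measure": take the most probable index and read off its hash.
    probs = [a * a for a in amps]
    best = max(range(n), key=probs.__getitem__)
    return table[best][1], probs[best], iterations


hash_value, prob, iters = grover_lookup(TABLE, b"\x55\x8b\xec\x83")
print(hex(hash_value), round(prob, 4), iters)
```

For this eight-row table with a single matching row, the sketch performs two iterations and the matching row is measured with probability of about 0.945, compared with 0.125 for a uniform guess. A real quantum implementation would also need a circuit that loads the table and implements the oracle, which this sketch does not model.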