Deep Convolutional Neural Networks for the Classification of the EMBER Malware Dataset

Author/Creator

Author/Creator ORCID

Date

2018-01-01

Department

Computer Science and Electrical Engineering

Program

Computer Science

Citation of Original Publication

Rights

Distribution Rights granted to UMBC by the author.
Access limited to the UMBC community. Item may possibly be obtained via Interlibrary Loan through a local library, pending author/copyright holder's permission.
This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.

Subjects

Abstract

As the number of computer users grows worldwide, security issues are growing rapidly, and new threats are appearing faster than companies can deliver solutions. In May 2017, more than 400,000 computer systems at Telefónica and the UK's National Health Service were attacked by the WannaCry malware. Attackers and malware developers use advanced techniques and operating system vulnerabilities to gain control of a victim's computer, and they keep devising new techniques and strategies to hide malicious code and infect their targets. Anti-virus scanners address malware detection to some extent, but they fail when presented with a new class of malware. A method of automating malware detection is therefore needed. To that end, we apply a machine learning technique, Convolutional Neural Networks (CNNs). Applying machine learning to malware data has drawn much attention in recent years, and researchers have previously used CNNs on malware binaries (Nataraj et al. 2011) and on malware Windows PE files. In this thesis, the CNN technique is applied to statistically extracted features from Windows malware PE files. We use the EMBER labeled benchmark dataset in this work. Results show that our model outperforms the LightGBM and MalConv models.
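To make the idea of running a CNN over extracted features concrete, the following is a minimal sketch and not the architecture used in the thesis, which the abstract does not describe. It treats a feature vector as a 1D signal and scores it with one convolutional layer, ReLU, global max pooling, and a logistic output. The function names, kernel sizes, random weights, and the vector length of 2351 are all illustrative assumptions; no real EMBER data is loaded and nothing is trained.

```python
import math
import random


def conv1d(x, kernel, bias=0.0):
    """Valid 1D cross-correlation of a feature vector with a single kernel."""
    k = len(kernel)
    return [sum(x[i + j] * kernel[j] for j in range(k)) + bias
            for i in range(len(x) - k + 1)]


def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


def sigmoid(z):
    """Logistic function mapping a real score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(features, kernels, weights, bias=0.0):
    """Conv -> ReLU -> global max pool per kernel -> logistic output."""
    pooled = [max(relu(conv1d(features, k))) for k in kernels]
    z = sum(w * p for w, p in zip(weights, pooled)) + bias
    return sigmoid(z)


rng = random.Random(0)

# Assumption: a 2351-dimensional vector stands in for one sample's
# extracted PE features; the values here are random placeholders.
N_FEATURES = 2351
features = [rng.random() for _ in range(N_FEATURES)]

# Hypothetical, untrained parameters: 4 kernels of width 5.
kernels = [[rng.uniform(-1, 1) for _ in range(5)] for _ in range(4)]
weights = [rng.uniform(-1, 1) for _ in range(4)]

# A value in (0, 1) that a trained model would read as P(malicious).
score = predict(features, kernels, weights)
```

In practice such a model would be built and trained with a deep learning framework; the pure-Python version only shows the data flow from feature vector to score.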