Descriptive Statistics of Malware Data

Author/Creator

Author/Creator ORCID

Date

2024-01-01

Department

Computer Science and Electrical Engineering

Program

Computer Science

Citation of Original Publication

Rights

This item may be protected under Title 17 of the U.S. Copyright Law. It is made available by UMBC for non-commercial research and education. For permission to publish or reproduce, please see http://aok.lib.umbc.edu/specoll/repro.php or contact Special Collections at speccoll(at)umbc.edu
Distribution Rights granted to UMBC by the author.

Abstract

Exploring and analysing a dataset reveals what kind of data it contains and how that data can be used. This is especially useful for malware datasets. Malware analysis is a growing field, driven in part by the adoption of machine learning techniques, and knowledge about a dataset may assist any future machine learning task performed on it. This work aims to gain statistical insights into a dataset of malware and to explore patterns across different malware families. These insights provide a starting point for categorizing malicious files based on their properties. One outcome of this work is the discovery of patterns showing how different attributes of a malware specimen can act as indicators of its maliciousness.