Browsing by Subject "Malware Classification"
Item: Evaluating Automatic Malware Classifiers in the Absence of Reference Labels (2020-01-01)
Joyce, Robert J; Nicholas, Charles; Computer Science and Electrical Engineering; Computer Science
The malware analysis community is completely devoid of a diverse, up-to-date reference dataset with ground truth labels. Consequently, it is typical for automatic malware classifiers to be evaluated using custom datasets with near ground truth labels. However, classifier evaluation using near ground truth labels can yield erroneous or biased results. We propose an alternative classifier evaluation framework that does not require reference labels. We introduce the concept of a ground truth refinement and propose potential methods for constructing an approximation of one from a malware dataset. We prove that, using a ground truth refinement, it is possible to compute lower bounds on precision and error rate as well as upper bounds on recall and accuracy without requiring ground truth reference labels. We perform a case study on the popular AVClass malware labeler using our proposed evaluation framework.

Item: Evaluating Machine Learning based Malware Classifiers (2020-01-01)
Gurram, Akash Reddy; Nicholas, Charles; Computer Science and Electrical Engineering; Computer Science
In recent years, there has been significant growth in the number of new malware specimens. This has led to novel malware classifiers that help identify them. Many of these classifiers claim to use some form of machine learning to identify malware. Our task is to evaluate to what extent the claims made for these machine learning based malware classifiers (MLMCs) are true, and to find out whether MLMCs are good at identifying new malware that has never been seen before. We have explored the idea of introducing diversity into the malware specimens, which are generated or compiled from source code collected from different resources such as theZoo and the malsource dataset. Different transformations are performed on the source code. Experiments were done with 1-2 compiled malware specimens, applying transformations at the source code level and observing VirusTotal's response to these transformations.