A Comprehensive Machine Learning Approach for Email and URL Threat Detection Using Feature Importance Analysis

Citation of Original Publication

Kharabsheh, Mohammad, Shadi AlZu’bi, Ali Alsarhan, Ala’a Mughaith, Nadera Aljawabreh, and Mohammad Alabdullatif. “A Comprehensive Machine Learning Approach for Email and URL Threat Detection Using Feature Importance Analysis.” International Journal of Advances in Soft Computing and Its Applications 17, no. 2 (2025). https://doi.org/10.15849/IJASCA.250730.16.

Rights

This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.

Abstract

Phishing is one of the most prevalent forms of cybercrime, in which victims are persuaded to disclose sensitive details such as account IDs, passwords, and banking information. These attacks are often initiated through emails, instant messaging, and phone calls, which makes the security of devices, computers, and software a primary concern. This study presents the development of a website that scans incoming emails and their attachments for viruses and other security threats. The website provides attachment validation and scanning, URL scanning, and IP address scanning, and it integrates with the VirusTotal database to assess the safety of websites. The study also applies machine learning algorithms to improve phishing detection and thereby reduce the risk and frequency of successful attacks. The dataset combines several sources containing both legitimate and phishing emails, along with numerous attributes for identifying malicious emails and harmful URLs, some of which are sourced from VirusTotal. The experimental results show promising accuracy in identifying phishing attacks, supporting machine learning as an important component of email security. The study also discusses the obstacles and limitations of the proposed models, noting that phishing strategies keep evolving and that the models therefore need continual adaptation.
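
To make the idea of URL attributes more concrete, the following is a minimal Python sketch of lexical feature extraction from a URL. It is an illustration only and not the authors' implementation: the function names `host_is_ip` and `url_features` and the specific features are assumptions, since this record does not list the paper's actual feature set or describe its VirusTotal integration.

```python
import ipaddress
from urllib.parse import urlparse


def host_is_ip(host: str) -> bool:
    """Return True when the host is a literal IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def url_features(url: str) -> dict:
    """Extract a few illustrative (hypothetical) lexical features from a URL."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    is_ip = host_is_ip(host)
    return {
        # Very long URLs are sometimes used to hide the real destination.
        "url_length": len(url),
        "uses_https": parsed.scheme == "https",
        # A raw IP address instead of a domain name is a common warning sign.
        "host_is_ip": is_ip,
        # Labels in front of the registered domain, assuming a one-label TLD.
        "num_subdomains": 0 if is_ip else max(host.count(".") - 1, 0),
        # An "@" anywhere in the URL can be used to mislead the reader.
        "has_at_symbol": "@" in url,
        "num_digits": sum(ch.isdigit() for ch in url),
    }
```

A feature dictionary like this could be computed for each URL in a labelled dataset and passed to a classifier, whose feature importance scores would then indicate which attributes contribute most to detection.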