Academic Paper

Malicious Short URLs Detection Technique
Document Type
Conference
Source
2023 22nd RoEduNet Conference: Networking in Education and Research (RoEduNet), pp. 1-6, Sep. 2023
Subject
Communication, Networking and Broadcast Technologies
Computing and Processing
Uniform resource locators
Training
Machine learning algorithms
Computer viruses
Games
Approximation algorithms
Classification algorithms
short URLs
malicious URLs
threat intelligence
machine learning
Language
ISSN
2247-5443
Abstract
The struggle between attackers and defenders is similar to a cat-and-mouse game. Attackers always try to find new ways to evade detection and remain undiscovered. One of the easiest tricks they employ to compromise victims is the use of malicious URLs that, once accessed, can lead to malware download or information theft. To hide the features of these URLs so that they cannot be recognized as suspicious, attackers use URL shortening services. In this paper, we propose an exhaustive system to detect malicious short URLs by leveraging threat intelligence data from popular platforms such as VirusTotal and PhishTank, and by employing various Machine Learning (ML) algorithms. Our system works for every URL, regardless of whether the shortening service used is public or custom. Moreover, for the ML classifier, we took a publicly available balanced dataset for training, improved its feature set, and obtained an accuracy of approximately 97%. The dataset contains 90 features that belong to three categories: the URL's lexical properties, external specifications, and website content. We have tested our proposed ML model against the JRip, PART, J48, and Random Forest algorithms, the last one being the most accurate. To showcase the effectiveness of our solution, we have implemented a system from scratch that detects malicious short URLs and notifies the user of their legitimacy.
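
As an illustration of the first feature category (the URL's lexical properties), the following is a minimal sketch of how such features might be computed from a URL string. The function name `lexical_features` and the specific features chosen are assumptions made for this example; the abstract does not list the 90 features of the dataset, so this should not be read as the authors' feature set.

```python
from urllib.parse import urlsplit


def lexical_features(url: str) -> dict:
    """Compute a few illustrative lexical features of a URL.

    Hypothetical feature set for demonstration only; the paper's
    dataset has 90 features that are not enumerated in the abstract.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    return {
        "url_length": len(url),
        "host_length": len(host),
        "path_length": len(parts.path),
        "num_dots": url.count("."),
        "num_digits": sum(ch.isdigit() for ch in url),
        # Count of a small, arbitrarily chosen set of special characters.
        "num_special": sum(ch in "-_@?=&%" for ch in url),
        "uses_https": parts.scheme == "https",
        # Rough check for a dotted-quad IPv4 host (no range validation).
        "host_is_ip": host.count(".") == 3 and host.replace(".", "").isdigit(),
    }


if __name__ == "__main__":
    for name, value in lexical_features("https://example.com/a1?b=2").items():
        print(f"{name}: {value}")
```

A short URL exposes very little lexical information, which is why a detection system would typically need to resolve it to its destination before features like these become informative.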