4th European International Conference on Industrial Engineering and Operations Management

Malicious URL Classification Using Extracted Features, Feature Selection Algorithm, and Machine Learning Techniques

JOEL DE GOMA, JO SIMON AMBATA, JOSE GAURANA & DAN JACINTO
Publisher: IEOM Society International
Track: Undergraduate Student Paper Competition
Abstract

Websites serve different purposes: some perform legitimate functions in the economy, while others are intended to harm users. Although various studies have addressed this problem, existing detection systems still leave plenty of room for improvement, specifically in their performance. This study is based on the approach recommended for future work by Cuzzocrea (2018), whose work uses 10 base features of a URL for classification. The recommended approach states that extending the base features with additional features increases detection accuracy. This paper compares the performance of three cases: the features proposed by Cuzzocrea (2018), an extended feature set, and a set to which a feature selection algorithm is applied. The researchers used machine learning algorithms to build models that classify URLs as legitimate or malicious. The study showed a directly proportional relationship between a model’s number of features and its performance: extending the number of features in the data set led to an increase in the performance of each model.
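
As a rough illustration of what features extracted from a URL can look like, the sketch below computes a handful of simple lexical features using only the Python standard library. The function name `extract_url_features` and every feature name in it are hypothetical choices made for this example; they are not the 10 base features of Cuzzocrea (2018), nor the extended feature set evaluated in this study.

```python
from urllib.parse import urlparse


def extract_url_features(url: str) -> dict:
    """Return a few simple lexical features of a URL.

    Illustrative only: these feature names are assumptions for this
    sketch, not the feature set used in the paper.
    """
    parsed = urlparse(url)
    host = parsed.hostname or ""  # hostname is None when the URL has no netloc
    return {
        "url_length": len(url),
        "host_length": len(host),
        "num_dots_in_host": host.count("."),
        "num_digits": sum(ch.isdigit() for ch in url),
        "has_at_symbol": "@" in url,
        "uses_https": parsed.scheme == "https",
        # Count non-empty path segments, e.g. "/a/b" has depth 2.
        "path_depth": len([seg for seg in parsed.path.split("/") if seg]),
    }


features = extract_url_features("http://example.com/a/b?id=1")
```

A feature dictionary of this kind could be turned into one row of a data set, with a larger or smaller set of keys standing in for the extended and selected feature sets that the study compares.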

Published in: 4th European International Conference on Industrial Engineering and Operations Management, Rome, Italy

Date of Conference: August 2-5, 2021

ISBN: 978-1-7923-6127-2
ISSN/E-ISSN: 2169-8767