Hybrid Machine Learning-Based Phishing Detection System Using URL Analysis
DOI:
https://doi.org/10.64751/
Abstract
Many forms of cybercrime are now carried out over the internet; this study
focuses on phishing attacks. Although phishing originated in 1996, it has
evolved into one of the most severe and dangerous online threats. Phishing uses email
manipulation and fake websites to deceive victims and obtain sensitive data. Many studies have
addressed prevention and detection, yet no complete and effective solution exists. Therefore,
machine learning plays a crucial role in defending against phishing-based cybercrimes. This
work uses a phishing URL-based dataset containing over 11,000 legitimate and phishing URLs
in vector format. After preprocessing, multiple ML algorithms were applied, including Decision
Tree, Naive Bayes (NB), Support Vector Machine (SVM), and XGBoost, with XGBoost proposed to enhance performance. The canopy
feature selection technique, cross-validation, and grid-search-based hyperparameter
optimization were employed to improve accuracy. Performance metrics such as accuracy,
precision, recall, F1-score, and specificity were used for evaluation. Comparative results show
that the proposed XGBoost model significantly outperforms existing models, providing higher
accuracy and efficiency in phishing URL detection.
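
The five evaluation metrics named above can all be derived from the confusion matrix of a binary classifier. The following is a minimal sketch in plain Python, not the authors' evaluation code: the function name binary_metrics, the label convention (1 = phishing, 0 = legitimate), and the choice to return 0.0 when a denominator is zero are all assumptions made for illustration.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, F1-score, and specificity.

    Assumed label convention (not stated in the source):
    1 = phishing (positive class), 0 = legitimate (negative class).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # phishing caught
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # legitimate passed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # legitimate flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # phishing missed

    def ratio(num, den):
        # Assumption: report 0.0 rather than raising when the denominator is zero.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "precision": precision,
        "recall": recall,
        "f1": ratio(2 * precision * recall, precision + recall),
        "specificity": ratio(tn, tn + fp),
    }
```

Calling binary_metrics(y_true, y_pred) with two equal-length lists of 0/1 labels returns a dictionary keyed by metric name. Specificity is reported alongside recall because it measures how often legitimate URLs are correctly passed, which recall alone does not capture.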
License

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.







