The code is written in Python 3.9. If you don't have Python installed, you can download it from python.org. If you are using an older version of Python, upgrade it before continuing, and make sure you have the latest version of pip. To install the required packages and libraries, run this command in the project directory after cloning the repository:
pip install -r requirements.txt
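Before installing, it can help to confirm that the interpreter meets the Python 3.9 requirement stated above. The snippet below is a minimal sketch; the helper name `meets_requirement` is hypothetical and not part of this repository.

```python
import sys

# Minimum version taken from this README; adjust if the project changes it.
REQUIRED = (3, 9)


def meets_requirement(version_info=sys.version_info, required=REQUIRED):
    """Return True when the (major, minor) version is at least `required`."""
    return tuple(version_info[:2]) >= tuple(required)


if __name__ == "__main__":
    if meets_requirement():
        print("Python version is sufficient.")
    else:
        print("Python 3.9 or newer is needed; found", sys.version.split()[0])
```

Saved to a file, this can be run with the same interpreter that will run `app.py`.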
├── static
│ ├── styles.css
├── templates
│ ├── index.html
├── README.md
├── app.py
├── feature.py
├── phishing.csv
├── requirements.txt
Accuracy of various models used for URL detection
Index | ML Model | Accuracy | f1_score | Recall | Precision |
---|---|---|---|---|---|
0 | Gradient Boosting Classifier | 0.974 | 0.977 | 0.994 | 0.986 |
1 | CatBoost Classifier | 0.972 | 0.975 | 0.994 | 0.989 |
2 | XGBoost Classifier | 0.969 | 0.973 | 0.993 | 0.984 |
3 | Multi-layer Perceptron | 0.969 | 0.973 | 0.995 | 0.981 |
4 | Random Forest | 0.967 | 0.971 | 0.993 | 0.990 |
5 | Support Vector Machine | 0.964 | 0.968 | 0.980 | 0.965 |
6 | Decision Tree | 0.960 | 0.964 | 0.991 | 0.993 |
7 | K-Nearest Neighbors | 0.956 | 0.961 | 0.991 | 0.989 |
8 | Logistic Regression | 0.934 | 0.941 | 0.943 | 0.927 |
9 | Naive Bayes Classifier | 0.605 | 0.454 | 0.292 | 0.997 |
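For readers unfamiliar with the column names, the sketch below shows how accuracy, precision, recall, and F1 are conventionally computed from confusion-matrix counts. It is not the project's evaluation code. The function name `classification_metrics` and the example counts are made up for illustration, and were chosen to show how a model can pair very high precision with low recall, the pattern in the Naive Bayes row.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute standard binary-classification metrics from confusion counts.

    tp/fp/fn/tn are true positives, false positives, false negatives,
    and true negatives. Undefined ratios (zero denominator) fall back to 0.0.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1_score": f1,
    }


# Hypothetical counts (not taken from phishing.csv): the model never raises
# a false alarm, but misses 7 of the 10 positive samples.
example = classification_metrics(tp=3, fp=0, fn=7, tn=10)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

Because F1 is the harmonic mean of precision and recall, it is pulled toward the lower of the two, which is why a model with near-perfect precision can still score poorly on F1.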
Feature importance for Malicious URL Detection
The Gradient Boosting Classifier correctly classifies URLs into their respective classes with up to 97.4% accuracy, and hence reduces the chance of malicious attachments. The final conclusion on the malicious URL dataset is that some features, such as "HTTPS", "AnchorURL", and "WebsiteTraffic", are more important than others for classifying whether a URL is malicious or not. The final takeaway from this project is to explore various machine learning models, perform Exploratory Data Analysis on the malicious URL dataset, and understand its features.