URLYZER is a website that classifies a URL as benign or malicious, using a Random Forest classifier and lexical features extracted from the URL. This project was presented as the final paper for a Bachelor's degree in Information Systems at Universidade Federal de Mato Grosso do Sul, Brazil 🇧🇷, in 2021.
This folder contains all the files for the website.
Make sure you have Python installed on your computer (this project used version 3.9.5). Then follow the steps below:
1. Create a folder for this project and put the urlyzer folder inside it.
2. Install the Python virtual environment tool on your computer, create a virtual environment, and then activate it. You can read more about it here.
3. With the virtual environment activated, and from inside the urlyzer folder, install all the requirements used in this project.
pip install -r requirements.txt
4. Edit the settings.py file inside the urlyzer folder and set the host(s) on which you want to run the server. The parameter to change is on line 28:
ALLOWED_HOSTS=['127.0.0.1']
5. Now you can run the Django server with the command:
python manage.py runserver --insecure
The --insecure parameter is needed because the DEBUG parameter is set to False. Without it, Django's development server does not serve static files when DEBUG is False.
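For reference, the two settings involved in steps 4 and 5 might look like the fragment below. This is a minimal sketch, not the project's full settings.py: the 'localhost' entry is a hypothetical extra host added only for illustration, and every other setting is omitted.

```python
# settings.py (fragment): only the two settings discussed in the steps above.

# With DEBUG set to False, Django's runserver does not serve static files
# on its own, which is why the --insecure flag is passed in step 5.
DEBUG = False

# Hosts allowed to serve the site. '127.0.0.1' comes from the project;
# 'localhost' is a hypothetical extra entry added for illustration.
ALLOWED_HOSTS = ['127.0.0.1', 'localhost']
```

If the host used in the browser is not listed in ALLOWED_HOSTS while DEBUG is False, Django rejects the request with a 400 Bad Request response.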
The site analyzes the URL string entered on the home page, classifies it as benign or malicious, and shows a web page that corresponds to the classification. From that page you can go back to the start or proceed to the website the URL points to.
If you choose to proceed, URLYZER will try to access a website using exactly the URL string you entered, so make sure it is in the right form.
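One way to check that a URL string is complete enough to be opened is sketched below. This is not URLYZER's own code: the helper name `looks_accessible` is hypothetical, and the rule it applies (an http or https scheme plus a host) is an assumption about what "the right form" means here.

```python
from urllib.parse import urlparse


def looks_accessible(url: str) -> bool:
    """Return True if the URL has an http(s) scheme and a host part."""
    parts = urlparse(url.strip())
    return parts.scheme in ("http", "https") and bool(parts.netloc)


# A full URL with scheme and host passes the check.
print(looks_accessible("https://www.example.com/index.html"))  # True

# Without a scheme, urlparse treats the whole string as a path,
# so there is no host to connect to.
print(looks_accessible("www.example.com/index.html"))  # False
```

In practice this means typing the URL with its scheme (for example `https://`) rather than only the domain name.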
This folder contains the Python modules used to build the classifier model, some datasets, the main Jupyter notebook covering the training, testing and results process, and the final classifier model used in the project.
Contains the main datasets used to train and test the classifier model.
Contains the main Jupyter notebook used in the project.
Contains the Python modules used during the project.
Contains the final classifier model used in the project.
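To give an idea of what lexical features of a URL can look like, the sketch below computes a few of them using only the standard library. The feature names and the function `lexical_features` are hypothetical and chosen for illustration. The project's actual feature set is defined in its own Python modules and notebook and may differ.

```python
from urllib.parse import urlparse


def lexical_features(url: str) -> dict:
    """Compute a few illustrative lexical features from a URL string."""
    parts = urlparse(url)
    host = parts.hostname or ""
    return {
        "url_length": len(url),
        "host_length": len(host),
        "path_length": len(parts.path),
        "digit_count": sum(ch.isdigit() for ch in url),
        "dot_count": url.count("."),
        "hyphen_count": url.count("-"),
        "has_at_symbol": "@" in url,
        "uses_https": parts.scheme == "https",
    }


# Example URL made up for this sketch; it is not taken from the datasets.
features = lexical_features("http://login-example.com/account/verify?id=123")
for name, value in features.items():
    print(f"{name}: {value}")
```

Features like these are computed from the URL string alone, without visiting the page. A vector of such values per URL is the kind of input a Random Forest classifier (for example scikit-learn's `RandomForestClassifier`) can be trained on.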
The technologies used to build the website are listed in the requirements.txt file. Besides those, some other related technologies are: