Website URL: https://main.d3ic9i6whelr8c.amplifyapp.com/
The Malicious URL Detection System detects malicious websites and helps prevent access to them using machine learning, deep learning, and natural language processing (NLP) techniques.
The system's primary goal is to classify URLs as safe or malicious, protecting users from web-based threats.
The project's frontend is developed using React, a popular JavaScript library for building user interfaces, while the backend is built using the Flask web framework.
The entire model pipeline, from data ingestion and preprocessing to model building, is implemented in Python, with extensive logging and custom exception handling to make failures easier to diagnose and the code easier to maintain.
The frontend is deployed on AWS Amplify and the backend on Azure.
Data ingestion involves collecting, importing, and processing raw data from various sources, such as public and private datasets of URLs, web scraping, and real-time data streams. This data is used to train and validate the machine learning and deep learning models.
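As a rough illustration of the ingestion step, the sketch below reads labeled URLs from CSV text using only the standard library. The column names `url` and `label`, the two label values, and the `load_labeled_urls` helper are assumptions made for this example, not the project's actual schema.

```python
import csv
import io


def load_labeled_urls(csv_text):
    """Parse CSV text with hypothetical `url` and `label` columns.

    Rows with a blank URL or an unrecognized label are skipped, and
    duplicate URLs are kept only once (first occurrence wins).
    """
    allowed = {"safe", "malicious"}
    seen = set()
    rows = []
    for record in csv.DictReader(io.StringIO(csv_text)):
        url = (record.get("url") or "").strip()
        label = (record.get("label") or "").strip().lower()
        if not url or label not in allowed or url in seen:
            continue
        seen.add(url)
        rows.append((url, label))
    return rows


# Small inline sample standing in for a real dataset file.
sample = """url,label
https://example.com/,safe
http://login.example.test/verify,Malicious
,safe
https://example.com/,safe
"""
print(load_labeled_urls(sample))
```

Filtering and de-duplicating at load time keeps obviously unusable rows out of the later preprocessing and training stages.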
The data preprocessing stage includes data cleaning, feature extraction, and feature engineering to transform the raw data into a format suitable for modeling. This stage involves handling missing or inconsistent data, tokenization, and extraction of relevant features from URLs, such as domain names, subdomains, and URL lengths.
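The kind of URL feature extraction described above can be sketched with the standard library's `urllib.parse`. The feature names and the `extract_url_features` helper are illustrative assumptions; the project's real feature set may differ.

```python
from urllib.parse import urlsplit


def extract_url_features(url):
    """Return a few simple lexical features for one URL."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    labels = host.split(".") if host else []
    # Naive split: treat the last two labels as the registered domain.
    # This is wrong for suffixes such as "co.uk"; a public-suffix list
    # would be needed for accurate results.
    domain = ".".join(labels[-2:]) if len(labels) >= 2 else host
    subdomains = labels[:-2] if len(labels) > 2 else []
    return {
        "url_length": len(url),
        "domain": domain,
        "subdomain_count": len(subdomains),
        "path_tokens": [t for t in parts.path.split("/") if t],
        "digit_count": sum(ch.isdigit() for ch in url),
        "uses_https": parts.scheme == "https",
    }


print(extract_url_features("https://mail.login.example.com/account/verify?id=42"))
```

Note that a URL without a scheme (for example `example.com/path`) has no parsed hostname, so such inputs would need to be normalized before this function is useful.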
The project utilizes machine learning, deep learning, and NLP techniques to build multiple models for malicious URL detection. These models include traditional ML algorithms (e.g., decision trees, SVM), deep learning models (e.g., CNN, RNN), and NLP-based models (e.g., transformers, word embeddings). The models are trained and validated using the preprocessed data and evaluated based on their accuracy, precision, recall, and F1-score.
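For reference, the four evaluation metrics named above can be computed from true and predicted labels as in the following sketch. It treats `malicious` as the positive class; the label strings and the `classification_metrics` helper are assumptions for illustration, and the project may well rely on a library for this instead.

```python
def classification_metrics(y_true, y_pred, positive="malicious"):
    """Compute accuracy, precision, recall, and F1 for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    total = len(pairs)
    # Guard every division so empty or one-sided inputs return 0.0.
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


y_true = ["malicious", "malicious", "malicious", "safe", "safe", "safe", "safe", "safe"]
y_pred = ["malicious", "malicious", "safe", "malicious", "safe", "safe", "safe", "safe"]
print(classification_metrics(y_true, y_pred))
```

For this task recall on the malicious class usually deserves the most attention, because a missed malicious URL (a false negative) is typically more costly than a false alarm.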
The selected models are integrated into the Flask backend, which serves as an API for the React frontend. The API receives URL inputs from users, processes them using the trained models, and returns the classification results (i.e., safe or malicious) to the frontend.
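The request handling described above might look roughly like the framework-independent sketch below. The `url` request field comes from the API reference in this README; the `prediction` response field, the `handle_predict` helper, and the placeholder classifier are assumptions, and the real backend would call its trained models instead.

```python
def placeholder_classifier(url):
    """Stand-in for a trained model; NOT a real detector."""
    return "malicious" if "@" in url else "safe"


def handle_predict(payload, classify=placeholder_classifier):
    """Validate a JSON-like payload and return (response_body, http_status)."""
    if not isinstance(payload, dict):
        return {"error": "request body must be a JSON object"}, 400
    url = payload.get("url")
    if not isinstance(url, str) or not url.strip():
        return {"error": "'url' must be a non-empty string"}, 400
    url = url.strip()
    return {"url": url, "prediction": classify(url)}, 200


# In Flask, a view function could pass the parsed JSON body to
# handle_predict and return the resulting body and status code.
print(handle_predict({"url": "https://example.com/"}))
print(handle_predict({}))
```

Keeping validation and classification in a plain function like this makes the logic testable without starting a web server.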
The system incorporates extensive logging and custom exception handling mechanisms to monitor the application's performance, detect issues, and ensure a seamless user experience. These mechanisms provide detailed information on errors, warnings, and system events, enabling developers to troubleshoot and improve the system continuously.
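One common way to combine logging with a custom exception in Python is sketched below. The `UrlDetectionError` class, the `url_detection` logger name, and the `parse_label` helper are hypothetical names chosen for this example, not the project's actual classes.

```python
import logging
import sys


class UrlDetectionError(Exception):
    """Hypothetical project-level exception that records the failing stage."""

    def __init__(self, stage, message):
        super().__init__(f"[{stage}] {message}")
        self.stage = stage


logging.basicConfig(
    stream=sys.stdout,
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(name)s: %(message)s",
)
logger = logging.getLogger("url_detection")


def parse_label(raw):
    """Convert a raw label to 0 (safe) or 1 (malicious), or raise."""
    mapping = {"safe": 0, "malicious": 1}
    try:
        return mapping[raw.strip().lower()]
    except (KeyError, AttributeError) as exc:
        # Chain the original exception so the traceback keeps the root cause.
        raise UrlDetectionError("preprocessing", f"unknown label: {raw!r}") from exc


try:
    parse_label("phishy")
except UrlDetectionError as err:
    logger.error("Pipeline step failed: %s", err)
```

Tagging each error with the pipeline stage lets a log reader tell at a glance whether a failure came from ingestion, preprocessing, or model building.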
The React frontend is deployed using AWS Amplify, a development platform for building and deploying secure and scalable web applications. Amplify provides a range of features, including authentication, storage, and serverless functions, allowing the frontend to be easily and securely hosted in the cloud.
The Flask backend is deployed using Azure, Microsoft's cloud computing platform. Azure offers a range of services for hosting, scaling, and managing web applications, ensuring that the backend can handle increasing traffic and user demands.
Frontend: React, Chakra UI, tsParticles
Server: Flask, Python, Machine Learning, Deep Learning, NLP, Text processing
- React.js
- Node.js
- Python 3
- Flask
- Azure account
- AWS Amplify account
Clone the repository
git clone https://github.com/Priyanshu9898/End-to-End-Malicious-URL-Detection.git
Change to the project's directory
cd End-to-End-Malicious-URL-Detection
Install the frontend dependencies
cd frontend
npm install
Install the backend dependencies
cd ../backend
pip install -r requirements.txt
Start the frontend development server
cd ../frontend
npm start
Start the backend development server (in a second terminal, from the project root)
cd backend
python app.py
Open your browser and visit http://localhost:3000 to access the frontend of the web application.
POST api/predict
| Parameter | Type | Description |
|---|---|---|
| url | string | URL to classify |
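A client call to this endpoint might be built as in the sketch below, using only the standard library. The base URL (Flask's default development port 5000), the JSON request body, and the `build_predict_request` helper are assumptions; this README does not state the port or the exact request encoding.

```python
import json
import urllib.request

# Assumed base URL for a local backend; the README does not state the
# port, so adjust this to wherever the Flask server is listening.
BASE_URL = "http://localhost:5000"


def build_predict_request(url_to_check, base_url=BASE_URL):
    """Build (but do not send) a POST request carrying the `url` parameter."""
    body = json.dumps({"url": url_to_check}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_predict_request("https://example.com/")
print(request.get_method(), request.full_url)
# To actually send it (requires the backend to be running):
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```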
To deploy this project run
npm run deploy