-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathreadme
56 lines (46 loc) · 2.57 KB
/
readme
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
# Intrusion Detection System (IDS) Web Application
This web application allows users to upload test files, process them using machine learning algorithms, and view predictions as well as download result files. The project leverages Django for web handling, file storage, and integrates machine learning using libraries like Pandas, Scikit-learn, XGBoost, and more.
## Key Features
- **File Upload and Handling**: Utilizes Django's `FileSystemStorage` for managing uploaded files.
- **One-Hot Encoding**: Encodes categorical features such as `protocol_type`, `service`, and `flag`.
- **Data Scaling**: Employs `RobustScaler` for feature normalization.
- **Voting Classifier Ensemble**: Combines Decision Tree, Random Forest, and Naive Bayes classifiers for predictions.
- **Visual Representation**: Displays predicted labels using Seaborn bar plots.
- **CSV Download**: Allows users to download prediction results as a CSV file.
## Views
### `home(request)`
Renders the home page of the application.
### `predict(request)`
Handles the following:
1. User file upload.
2. Data preprocessing (one-hot encoding and scaling).
3. Application of a Voting Classifier ensemble for predictions.
4. Visualization of results (bar plot of predicted labels).
5. Saving prediction results to a CSV file.
6. Rendering the results page, including the plot and download link for the CSV file.
### `download(request)`
Facilitates downloading predicted labels in CSV format.
## ML Model Pipeline
1. **Data Preprocessing**: One-hot encoding of categorical features.
2. **Data Scaling**: Utilizes `RobustScaler`.
3. **Model Training**: Employs a Voting Classifier combining:
- Decision Tree
- Random Forest
- Naive Bayes
4. **Prediction and Visualization**: Outputs predictions and generates visualizations.
5. **CSV Download**: Provides results in a downloadable CSV format.
## Libraries and Tools Used
- **Django**: For the web framework and file handling.
- **Pandas**: For data manipulation.
- **Scikit-learn**: For the machine learning pipeline (training, prediction, preprocessing).
- **XGBoost**: For boosted tree models.
- **Seaborn and Matplotlib**: For visualizations.
- **Joblib**: For model saving/loading.
- **NumPy**: For numerical operations.
## How to Use
1. Visit the home page of the application.
2. Upload a test file in CSV format using the provided form.
3. Click submit; the application will process the file using the machine learning model.
4. View the results, download the CSV file, and check the plot of predicted labels.
## File Download
Users can download prediction CSV files using the download endpoint.