This repository collects exercises and examples covering machine learning algorithms and data manipulation techniques in Python with scikit-learn. Each example demonstrates how to apply an algorithm or technique to a real-world dataset.
The project includes implementations of the following algorithms and techniques:
- Linear Regression
- Logistic Regression
- Polynomial Regression
- Support Vector Machines (SVM)
- Decision Trees
- Random Forests
- Naive Bayes
- Clustering Algorithms
- Isolation Forest
- Neural Networks
- Dataset splitting
- Data preparation and preprocessing
- Pipeline and transformer creation
- Model selection
- Feature selection
- Feature extraction with PCA
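
As a rough illustration of how several of the items above fit together (dataset splitting, preprocessing, a pipeline, PCA feature extraction, and logistic regression), here is a minimal sketch. It is not taken from the repository's examples: the synthetic data from `make_classification`, the step names, and all parameter values are assumptions chosen for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (an assumption; not one of the repository's datasets).
X, y = make_classification(
    n_samples=500, n_features=20, n_informative=5, random_state=42
)

# Dataset splitting: hold out 20% of the samples for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Pipeline: scaling, then PCA feature extraction, then logistic regression.
model = Pipeline([
    ("scale", StandardScaler()),
    ("pca", PCA(n_components=5)),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Fitting the pipeline fits each step on the training data only.
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)
print(f"Held-out accuracy: {accuracy:.3f}")
```

Wrapping the scaler and PCA in the pipeline means they are fitted on the training split alone, which avoids leaking information from the held-out samples.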
The repository includes two datasets used to test the algorithms listed above:
- Real email dataset: located in `/datasets/`
- NSL-KDD dataset (network packets): located in `/datasets/nsl-kdd/`
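
Since the individual file names are not listed here, one way to see what is available is to walk the dataset directory. The helper below is a hypothetical sketch, not part of the repository; it assumes the script is run from the repository root so that `datasets` resolves to the directory mentioned above.

```python
from pathlib import Path


def list_dataset_files(root="datasets"):
    """Return every file under `root`, sorted, as paths relative to `root`."""
    base = Path(root)
    return sorted(p.relative_to(base) for p in base.rglob("*") if p.is_file())


# Print one relative path per line, e.g. to locate the NSL-KDD files.
for path in list_dataset_files():
    print(path)
```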
To run this project, you need to have the following installed:
- Python 3.x
- The dependencies listed in `requirements.txt`
To install the dependencies, run the following command in your terminal:
`pip install -r requirements.txt`
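
To confirm the installation worked, a quick check along the following lines may help. This is a sketch under assumptions: `missing_modules` is a hypothetical helper, and the only module checked is `sklearn` (the import name of scikit-learn), because the contents of `requirements.txt` are not reproduced here. Extend the list to match the file.

```python
from importlib.util import find_spec


def missing_modules(module_names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in module_names if find_spec(name) is None]


# An empty list means every module in the list is importable.
print(missing_modules(["sklearn"]))
```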
## Running the Examples