This repository contains a Python project that performs K-Means clustering on the Iris dataset. The project involves finding the optimal number of clusters and visualizing the results.
The objective of this project is to apply the K-Means clustering algorithm to the Iris dataset to identify distinct clusters within the data. Key steps include:
- Loading the Dataset: The Iris dataset is loaded from a CSV file.
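- Determining Optimal Clusters:
  - Elbow Method: Plots inertia (the within-cluster sum of squared distances) against the number of clusters; the "elbow" where the curve flattens suggests a suitable cluster count.
  - Silhouette Score: Evaluates the quality of clustering for different numbers of clusters; higher scores indicate better-separated clusters.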
- Fitting the Model: The K-Means model is fitted with the optimal number of clusters.
- Visualizing Clusters: Clusters are visualized to interpret the results.
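The steps above can be sketched roughly as follows. This is a minimal illustration and not the contents of predict.py: it assumes scikit-learn's bundled copy of Iris as a stand-in for iris.csv, and the helper names `evaluate_k` and `plot_clusters`, the range of k values, the choice of the silhouette score to pick k, and the two plotted features are all illustrative choices.

```python
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.metrics import silhouette_score

# Assumption: the bundled Iris data stands in for the repository's iris.csv.
X = load_iris().data  # columns: sepal length, sepal width, petal length, petal width


def evaluate_k(X, k_values, random_state=0):
    """Fit K-Means for each k and record inertia and silhouette score."""
    inertias, silhouettes = {}, {}
    for k in k_values:
        model = KMeans(n_clusters=k, n_init=10, random_state=random_state).fit(X)
        inertias[k] = model.inertia_  # used for the elbow plot
        silhouettes[k] = silhouette_score(X, model.labels_)
    return inertias, silhouettes


def plot_clusters(X, labels, centers):
    """Scatter petal length against petal width, colored by cluster label."""
    fig, ax = plt.subplots()
    ax.scatter(X[:, 2], X[:, 3], c=labels, s=20)
    ax.scatter(centers[:, 2], centers[:, 3], marker="x", c="red", s=100)
    ax.set_xlabel("petal length (cm)")
    ax.set_ylabel("petal width (cm)")
    return fig


# The silhouette score is undefined for a single cluster, so start at k=2.
inertias, silhouettes = evaluate_k(X, range(2, 9))

# Illustrative rule: pick the k with the highest silhouette score.
best_k = max(silhouettes, key=silhouettes.get)

final_model = KMeans(n_clusters=best_k, n_init=10, random_state=0).fit(X)
labels = final_model.labels_
```

Plotting `inertias` against k gives the elbow curve, and `plot_clusters(X, labels, final_model.cluster_centers_)` followed by `plt.show()` displays the fitted clusters. The elbow and silhouette criteria need not agree on the same k, so it is worth inspecting both before fixing the final model.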
- iris.csv: The dataset used for clustering.
- predict.py: The main script for performing clustering, determining optimal clusters, and visualizing results.
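As a rough sketch of how the dataset file might be read before clustering, the snippet below loads a CSV with pandas and keeps only the numeric columns. The column layout of iris.csv is not documented here, so the helper `load_features` and its `drop_columns` parameter are hypothetical, and the actual loading code in predict.py may differ.

```python
import pandas as pd


def load_features(path="iris.csv", drop_columns=()):
    """Read a CSV and return its numeric columns as a 2-D NumPy array.

    Assumption: the species label is stored as text, so selecting numeric
    columns leaves only the measurements. Pass drop_columns to exclude
    numeric non-feature columns (for example a row-id column), if any.
    """
    df = pd.read_csv(path)
    df = df.drop(columns=list(drop_columns), errors="ignore")
    features = df.select_dtypes(include="number").dropna()
    return features.to_numpy()
```

A call such as `X = load_features("iris.csv")` would then produce the feature matrix passed to K-Means.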
To run this project, you need Python installed, along with the following packages:
pip install numpy pandas matplotlib scikit-learn