
Lab5

The dataset consists of two feature columns.

Load the data. Plot the experimental data as a scatter plot and use it to visually estimate the number of clusters, k.
Develop the k-means clustering algorithm and implement it programmatically in MATLAB.
Perform cluster analysis on the original data using the k-means method (see method parameters in Table 5.2). Determine the optimal number of clusters, k.
Calculate the centroids of the obtained clusters. Visualize the clusters found with a colored scatter plot.

The dataset was read from a .txt file and converted into a pd.DataFrame for convenience.
A scatter plot was constructed for the original dataset.
The elbow method was applied to determine the optimal number of clusters; for this dataset it indicated k = 4.
The k-means method was implemented from scratch, using the Euclidean distance metric and the intra-cluster sum of distances as the clustering quality metric.
Plots were generated for each step of the k-means algorithm.
A joint plot was created to visualize the final distribution of data across clusters.
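
The steps above can be illustrated with a minimal sketch. It is not the lab's actual code: the function names `kmeans` and `elbow_costs`, the random initialization from data points, the number of restarts, and the synthetic four-blob data standing in for the lab dataset are all assumptions made for illustration. It uses Euclidean distance for assignment and the intra-cluster sum of distances as the quality metric, as described above.

```python
import numpy as np


def kmeans(X, k, n_iter=100, seed=0):
    """Plain k-means with Euclidean distance.

    Returns (centroids, labels, cost), where cost is the sum of distances
    from each point to its assigned centroid.
    """
    rng = np.random.default_rng(seed)
    # Assumption: initialise centroids from k distinct data points.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Distances from every point to every centroid, shape (n_points, k).
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of its points;
        # keep the old centroid if a cluster ends up empty.
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    # Final assignment and quality metric for the returned centroids.
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    labels = dists.argmin(axis=1)
    cost = dists[np.arange(len(X)), labels].sum()
    return centroids, labels, cost


def elbow_costs(X, k_values, n_restarts=5):
    """Best cost over a few random restarts for each candidate k."""
    return [
        min(kmeans(X, k, seed=s)[2] for s in range(n_restarts))
        for k in k_values
    ]


# Synthetic stand-in for the lab data: four Gaussian blobs, two feature columns.
rng = np.random.default_rng(42)
centers = np.array([[0, 0], [10, 0], [0, 10], [10, 10]], dtype=float)
X = np.vstack([c + rng.normal(scale=0.5, size=(50, 2)) for c in centers])

k_values = list(range(1, 8))
costs = elbow_costs(X, k_values)          # plot costs against k_values to look for the elbow
centroids, labels, cost = kmeans(X, k=4)  # final clustering with the chosen k
```

Plotting `costs` against `k_values` gives the elbow curve, and a colored scatter plot of the result can be drawn with matplotlib via `plt.scatter(X[:, 0], X[:, 1], c=labels)`. A single run can settle in a local minimum, which is why the elbow helper keeps the best of several restarts.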