This project applies unsupervised and supervised learning techniques to Women’s Tennis Association (WTA) data to draw insights about players and match outcomes.
👉 Presentation: Provides an overview of the project and its findings, offering a high-level perspective.
👉 Report: Offers a detailed analysis of the methodologies, results, and interpretations.
👉 Notebook: Contains the R code used for analysis, allowing for transparency and reproducibility.
The unsupervised part clusters the top 30 players, while the supervised part builds predictive models for match outcomes. The dataset comprises player performance metrics, player attributes, and match characteristics.
- Unsupervised Learning: Utilizes methods such as principal component analysis (PCA), k-medoids, and hierarchical clustering to uncover player segments and patterns within the data.
- Supervised Learning: Employs logistic regression, classification tree, and random forest models to predict match outcomes based on player and match attributes.
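
To make the two parts above concrete, here is a minimal R sketch on synthetic data. It is not the notebook's code: every column name (`aces_per_match`, `rank_diff`, `surface`, and so on), the choice of `k = 3`, and the simulated outcomes are assumptions made purely for illustration. It uses only base R plus the `cluster` package for k-medoids, and shows logistic regression as the one supervised model.

```r
# Illustrative sketch only: synthetic data, hypothetical column names.
library(cluster)  # provides pam() for k-medoids

set.seed(42)

# --- Unsupervised part: 30 hypothetical players, 4 made-up metrics ---
players <- data.frame(
  aces_per_match     = rnorm(30, mean = 4,  sd = 1.5),
  first_serve_pct    = rnorm(30, mean = 62, sd = 4),
  break_points_saved = rnorm(30, mean = 55, sd = 6),
  return_points_won  = rnorm(30, mean = 44, sd = 3)
)
rownames(players) <- paste0("player_", 1:30)
players_scaled <- scale(players)  # standardize so no metric dominates

# PCA: keep the first two principal components for plotting/inspection
pca    <- prcomp(players, center = TRUE, scale. = TRUE)
scores <- pca$x[, 1:2]

# k-medoids with k = 3 (k is an arbitrary choice for this sketch)
kmed <- pam(players_scaled, k = 3)

# Hierarchical clustering with Ward linkage, cut into 3 clusters
hc          <- hclust(dist(players_scaled), method = "ward.D2")
hc_clusters <- cutree(hc, k = 3)

# Compare the two partitions side by side
print(table(kmedoids = kmed$clustering, hclust = hc_clusters))

# --- Supervised part: 500 simulated matches ---
n <- 500
matches <- data.frame(
  rank_diff = rnorm(n, mean = 0, sd = 20),  # player rank minus opponent rank
  surface   = factor(sample(c("clay", "grass", "hard"), n, replace = TRUE))
)
# Simulated outcome: a better (lower) rank raises the win probability
matches$win <- rbinom(n, size = 1, prob = plogis(-0.05 * matches$rank_diff))

# 80/20 train/test split
train_idx <- sample(n, size = 400)
fit  <- glm(win ~ rank_diff + surface,
            data = matches[train_idx, ], family = binomial)
pred <- predict(fit, newdata = matches[-train_idx, ], type = "response")

# Share of held-out matches classified correctly at a 0.5 threshold
accuracy <- mean((pred > 0.5) == matches$win[-train_idx])
print(accuracy)
```

A classification tree or random forest could be fitted on the same train/test split, for example with `rpart::rpart()` or `randomForest::randomForest()`, and compared on the same held-out accuracy.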
Women’s Tennis Association (WTA), Hierarchical Clustering, K-means, PCA, Classification Tree, Random Forest, Logistic Regression.