This is the project our team completed for the final assignment of the "Data Science" course at UWO, which I took in the Fall of 2020. We used a data set with more than 20 features to find the features and models that best label the voice recordings as male or female. Logistic regression was the best model, with the highest accuracy. As expected, frequency was the biggest contributor to the classification of voice. We also tried other feature subsets, other techniques such as PCA and K-means, and different bootstrapping sample sizes, and experimented with many other methods. In the end, we concluded that logistic regression was the best model for this data set. I worked on the logistic regression model, plotted the ROC curves and computed the AUC, wrote the report, and analyzed our results. I contributed one third of this project. The report is 13 pages long; please feel free to scroll down.
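For readers who want a feel for the approach before opening the report, here is a minimal sketch of logistic regression followed by an AUC calculation, in plain Python. It is not the code from our project: the data is synthetic, the two features (one informative, standing in for a frequency measure, and one pure noise) are hypothetical, and the helper names `fit_logistic` and `roc_auc` are made up for this illustration.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.1, epochs=500):
    # Batch gradient descent on the mean log-loss.
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def roc_auc(y_true, scores):
    # AUC as the probability that a random positive outranks a random
    # negative; ties count as one half.
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Synthetic data (an assumption, not our real data set): feature 0 separates
# the two classes, feature 1 is noise.
rng = random.Random(0)
data = []
for label in (0, 1):
    center = 1.0 if label == 1 else -1.0
    for _ in range(200):
        data.append(([rng.gauss(center, 1.0), rng.gauss(0.0, 1.0)], label))
rng.shuffle(data)
train, test = data[:300], data[300:]

w, b = fit_logistic([x for x, _ in train], [t for _, t in train])
scores = [sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) for x, _ in test]
auc = roc_auc([t for _, t in test], scores)

print("weights:", w)
print("held-out AUC:", auc)
```

In this toy setup the weight on the informative feature should dominate the weight on the noise feature, which mirrors the kind of comparison we used to judge feature contributions. In practice a library such as scikit-learn would replace the hand-written training loop.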