This project employs advanced machine learning methodologies to derive insights into student performance based on various sociodemographic indicators. Central to the project's efficiency and continuous integration is the integration with AWS CodePipeline, ensuring seamless updates and deployments. The aim is to predict students' math scores by analysing factors such as gender, ethnicity, parental education level, and more, thereby offering valuable insights into the determinants of academic achievement.
-
Data Ingestion: A robust data collection process lays the groundwork, providing a comprehensive view of relevant variables.
-
Data Transformation: Before analysis, data undergoes critical transformations, including handling outliers, normalization, and missing value treatments.
-
Model Training & Evaluation: Multiple algorithms were tested to pinpoint the model with optimal predictive accuracy. Post-training, a rigorous evaluation ensured the model's robustness and precision.
-
AWS CodePipeline Integration: The project leverages AWS CodePipeline for continuous integration and deployment. This ensures that updates, whether they're data changes or algorithm tweaks, are seamlessly integrated and deployed, keeping the project agile and up-to-date.
-
Deployment: The model is made accessible through AWS Elastic Beanstalk, facilitating real-time predictions for end-users.
The model's predictions hinge on the following independent variables:
- Gender: Male or Female.
- Ethnicity: Categories ranging from Group A to Group E.
- Parental Level of Education: Varying from high school to master's degree levels.
- Lunch Type: Standard or Free/Reduced.
- Test Preparation Course Status: Completed or None.
- Reading Score: A numerical score based on student assessments.
- Writing Score: Another numerical score reflecting student proficiency.
Through detailed analysis, the project reveals compelling insights into academic performance dynamics. Preliminary findings underscore the significant influence of factors such as parental education level and lunch type on a student's performance. These insights hold potential value for educators, policymakers, and parents, offering data-backed strategies to enhance student outcomes.