
This repository contains an EDA notebook analyzing regional and demographic factors contributing to educational disparities in the U.S., highlighting trends and key insights.


ClaudiaYang/EDA-Project-in-R


Exploratory Data Analysis (EDA) Code Notebook

Project: Analyzing the Impact of Regional and Demographic Factors on Education Inequality

Last Updated: December 11, 2024

Team Members and Affiliations

Kevin Xie, Fay Yan, Claudia Yang, Karl Zhou
QTM 302W, Technical Writing, Emory University

Objectives

The primary objectives of this project are:

  1. To analyze the impact of regional and demographic factors on educational disparities in the U.S.
  2. To identify patterns in SAT scores and GPAs across different socioeconomic backgrounds.
  3. To conduct descriptive statistical analysis of the dataset to summarize key trends.
  4. To visualize the data through appropriate plots and charts to uncover insights.
  5. To identify missing values, outliers, and data anomalies to ensure data quality.
  6. To document the analysis process for reproducibility and collaboration.

Project Description

This repository contains the Exploratory Data Analysis (EDA) Code Notebook, developed as part of our research to understand disparities in educational outcomes across the U.S. The analysis focuses on identifying patterns, trends, and relationships within the dataset to inform decision-making and highlight key insights.

Research Questions

What regional and demographic factors contribute most significantly to disparities in educational outcomes in the U.S.?
How do socioeconomic factors influence SAT scores and GPAs?

Methods Used

Quantitative Analysis: Statistical summaries and tests to uncover relationships between variables.
Visualization & Spatial Analysis: Graphical plots and maps to visualize trends across regions and demographics.
Statistical Testing: Hypothesis testing to validate findings.
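
As a rough illustration of the summary and testing steps listed above, the sketch below computes per-region descriptive statistics and runs a one-way ANOVA in base R. The data frame `scores`, its columns (`region`, `sat`), and every value in it are invented for this example. They are not the project's dataset, and the notebook itself may use different functions or tests.

```r
# Toy data, invented for illustration only (not the project's dataset).
scores <- data.frame(
  region = rep(c("urban", "suburban", "rural"), each = 4),
  sat    = c(1210, 1180, 1250, 1190,
             1200, 1230, 1170, 1220,
             1090, 1120, 1060, 1110)
)

# Descriptive summary: mean and standard deviation of SAT score per region.
region_summary <- aggregate(
  sat ~ region,
  data = scores,
  FUN = function(x) c(mean = mean(x), sd = sd(x))
)
print(region_summary)

# Hypothesis test: one-way ANOVA of SAT score on region.
fit <- aov(sat ~ region, data = scores)

# Extract the p-value for the region term from the ANOVA table.
p_value <- summary(fit)[[1]][["Pr(>F)"]][1]
print(p_value)
```

A small p-value here would only say that the toy group means differ; with real data, the choice of test should follow from the distribution and sample sizes actually observed.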

Results / Key Findings

Regional Variations: Significant differences in SAT scores and GPAs were observed across regions, with urban and suburban areas outperforming rural areas.
Socioeconomic Influence: Higher household incomes and parental education levels strongly correlate with improved SAT scores and GPAs.
Demographic Trends: Disparities exist across racial and ethnic groups, emphasizing systemic inequalities.

Future Directions

Comparison of pre- and post-COVID-19 data to assess the pandemic's impact on educational disparities.
Exploration of policy interventions aimed at reducing disparities.

How to Use the Repository

Prerequisites

R (for statistical analysis)
RStudio (integrated development environment)
Binder (for reproducibility and sharing)
Required libraries for EDA (e.g., ggplot2, dplyr, tidyr, summarytools).
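
One possible way to check for the packages named above before opening the notebook is sketched here. The variable names (`required`, `to_install`) are illustrative, and the list covers only the packages given as examples in this README; the notebook may load others.

```r
# Packages named as examples in the prerequisites list.
required <- c("ggplot2", "dplyr", "tidyr", "summarytools")

# Compare against the packages already installed in this R library.
installed <- rownames(installed.packages())
to_install <- setdiff(required, installed)

if (length(to_install) > 0) {
  message("Missing packages: ", paste(to_install, collapse = ", "))
  # Uncomment the next line to install them from CRAN:
  # install.packages(to_install)
} else {
  message("All listed packages are installed.")
}
```

The install call is left commented out so that running the check never changes the local library without an explicit decision.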

Instructions

  1. Clone this repository:
git clone https://github.com/ClaudiaYang/EDA-Project-in-R.git
cd EDA-Project-in-R
  2. Open the EDA_Codebook.Rmd file in RStudio, or open the EDA_Codebook.html file in a browser for a preview.
  3. To reproduce the results, execute the code blocks sequentially in RStudio.

Rendering the Notebook

To generate the HTML file from EDA_Codebook.Rmd, run the following command in R:

rmarkdown::render("EDA_Codebook.Rmd")
