Determinants of Earnings Analysis

A Python data science project from Angela Yu's course 100 Days of Code: Python on Udemy. This is a professional portfolio project to showcase what I learned from the 100-day challenge.

Project Details

Details
Link to Demo
Tools Used
What I learned

Details

This analysis explores the Determinants of Earnings dataset from Kaggle. It loads the data, cleans it by handling NaN or null values where necessary, and explores it through a series of questions.
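The cleaning step described above could look roughly like the sketch below. It is not the notebook's actual code: the toy rows and the column names (`EARNINGS`, `S`, `EXP`) are assumptions made for illustration, and dropping incomplete rows is only one of several reasonable ways to handle missing values.

```python
import numpy as np
import pandas as pd

# Hypothetical rows standing in for the Kaggle data; the column names are assumptions.
df = pd.DataFrame({
    "EARNINGS": [12.5, 18.0, np.nan, 21.0, 25.0],  # hourly earnings
    "S": [12, 16, 14, 16, 18],                     # years of schooling
    "EXP": [5.0, 8.5, 3.0, 7.0, np.nan],           # years of work experience
})

# Count missing values per column before deciding how to clean.
missing_per_column = df.isna().sum()
print(missing_per_column)

# Drop every row that contains at least one NaN, then renumber the index.
df_clean = df.dropna().reset_index(drop=True)
print(df_clean.shape)
```

Dropping rows keeps the sketch short; on a real dataset it is worth checking how many rows are lost before choosing between dropping and imputing.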

Link to Demo

Kaggle Notebook

Tools Used

Python
Excel
numpy
pandas
plotly
matplotlib
seaborn

What I learned

Below are some code snippets I'm proud of from this project:

Answers the question: How good a regression is also depends on the residuals, the differences between the model's predictions ($\hat y_i$) and the true values ($y_i$) in y_train. Are there any patterns in the distribution of the residuals?

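import matplotlib.pyplot as plt

# predicted_vals and residuals come from the regression fitted on the training data
plt.figure(dpi=100)
plt.scatter(x=predicted_vals,
            y=residuals,
            c='indigo',
            alpha=0.6)

plt.title('Residuals vs. Predicted Values (Simple Linear Regression)', fontsize=17)
# Raw string so that \hat is passed to mathtext instead of being read as an escape sequence
plt.xlabel(r'Predicted Earnings $\hat y _i$', fontsize=14)
plt.ylabel('Residuals', fontsize=14)

# Dashed reference line at zero residual
plt.axhline(y=0, color='r', ls='--')

plt.show()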
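The plotting code takes `predicted_vals` and `residuals` as given. As a minimal sketch of where such values might come from, the example below fits a one-feature linear regression with `numpy.polyfit` on made-up numbers. The data, the variable `x_train`, and the sign convention (actual minus predicted) are assumptions for illustration; the notebook may compute these differently, for example with scikit-learn.

```python
import numpy as np

# Hypothetical toy data: years of schooling (x) and hourly earnings (y).
x_train = np.array([10.0, 12.0, 12.0, 14.0, 16.0, 16.0, 18.0])
y_train = np.array([9.5, 12.0, 11.0, 15.5, 18.0, 21.0, 24.5])

# Least-squares fit of y = intercept + slope * x.
# np.polyfit returns coefficients from the highest degree down.
slope, intercept = np.polyfit(x_train, y_train, deg=1)

# Model predictions on the training data.
predicted_vals = intercept + slope * x_train

# Residuals as actual minus predicted (assumed convention).
residuals = y_train - predicted_vals

print(f"slope={slope:.3f}, intercept={intercept:.3f}")
print(f"mean residual={residuals.mean():.2e}")
```

For a least-squares fit with an intercept, the residuals average to zero on the training data, so the useful signal in a residual plot is the shape of the scatter (curvature, fanning out) rather than its overall level.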

Analyze the Estimated Values & Regression Residuals

# Actual vs. predicted earnings
plt.figure(dpi=100)
plt.scatter(x=y_train, y=predicted_vals, c='indigo', alpha=0.6)
# Reference line y = x: points on this line are predicted exactly
plt.plot(y_train, y_train, color='cyan')
plt.title(r'Actual vs Predicted Earnings: $y _i$ vs $\hat y_i$', fontsize=17)
plt.xlabel(r'Actual Earnings 000s $y _i$', fontsize=14)
plt.ylabel(r'Predicted Earnings 000s $\hat y _i$', fontsize=14)
plt.show()

# Residuals vs. predicted values
plt.figure(dpi=100)
plt.scatter(x=predicted_vals, y=residuals, c='indigo', alpha=0.6)
plt.title('Residuals vs Predicted Values', fontsize=17)
plt.xlabel(r'Predicted Earnings $\hat y _i$', fontsize=14)
plt.ylabel('Residuals', fontsize=14)
plt.show()
