This report employs the following methodologies to analyze and visualize data in the field of health insurance in the United States:
- Multivariate Analysis: Uses statistical tests such as the Z-test, the Chi-square test, and ANOVA to examine differences and associations among variables, including categorical variables.
- Linear Regression Modeling: Predicts the growth trend in the number of insured customers from historical data.
- Data Visualization: Bar charts, line charts, box plots, and map charts illustrate the data and the analysis results so that they are easy to understand.
(Note: The report itself is written entirely in Vietnamese.)
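As an illustration of the kind of test listed above, the following is a minimal sketch of a one-way ANOVA F statistic written with only the Python standard library. The function name `one_way_anova_f` and the sample charges per region are hypothetical and are not taken from the report; in practice a library routine such as `scipy.stats.f_oneway` would typically be used, and it also returns a p-value.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F statistic, df_between, df_within) for a one-way ANOVA.

    `groups` is a list of lists, one list of numeric observations per group.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean(x for g in groups for x in g)

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical annual charges (USD) for three regions, for illustration only.
charges_by_region = [
    [4200.0, 5100.0, 3900.0, 4800.0],
    [6100.0, 5800.0, 6400.0, 5900.0],
    [4500.0, 4700.0, 5200.0, 4400.0],
]

f_stat, df_b, df_w = one_way_anova_f(charges_by_region)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F value relative to the F distribution with the given degrees of freedom suggests that mean charges differ between at least two of the groups.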
To see detailed content, please visit: [Here]
The objective is to analyze the factors that influence the cost customers pay for individual health insurance.
Python is used for data analysis and visualization, including statistical methods, hypothesis testing, and data modeling.
- Check the effectiveness of changing data variables:
- Check for overfitting:
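One common way to check a linear trend model for overfitting is to fit it on earlier observations and compare its error on held-out later observations. The sketch below assumes this approach; it is not necessarily the procedure used in the report. The yearly customer counts and the helper names `fit_line` and `rmse` are hypothetical.

```python
from math import sqrt


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def rmse(xs, ys, slope, intercept):
    """Root mean squared error of the fitted line on (xs, ys)."""
    squared_errors = [(y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys)]
    return sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical number of insured customers per year, for illustration only.
years = [2015, 2016, 2017, 2018, 2019, 2020, 2021, 2022]
insured = [1200, 1260, 1330, 1390, 1450, 1525, 1580, 1650]

# Time-ordered split: fit on the first six years, hold out the last two.
train_x, test_x = years[:6], years[6:]
train_y, test_y = insured[:6], insured[6:]

slope, intercept = fit_line(train_x, train_y)
train_rmse = rmse(train_x, train_y, slope, intercept)
test_rmse = rmse(test_x, test_y, slope, intercept)

print(f"train RMSE: {train_rmse:.1f}, test RMSE: {test_rmse:.1f}")
```

A test error far above the training error would suggest that the model does not generalize to unseen years. The split is chronological rather than random because the goal is to forecast future values from past data.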
The analysis identified trends, relationships, and differences in health insurance costs among customer groups by age, gender, body mass index, number of children, smoking status, region, and state.
- Time series analysis:
- Area/Density analysis:
- Health & Family analysis:
- Meta-analysis:
Strategies and solutions are proposed to improve service quality, optimize business strategies, and ensure benefits for both customers and the insurance company.