This is a statistical analysis project for a statistics course. It takes a beginner's approach to the analysis.
This project analyses Haberman's survival dataset. The dataset contains cases from a study conducted between 1958 and 1970 at the University of Chicago's Billings Hospital on the survival of patients who had undergone surgery for breast cancer.
Number of Instances: 306
Number of Attributes: 4 (including the class attribute)
Attribute Information:
- Age of patient at time of operation (numerical)
- Patient's year of operation (year - 1900, numerical)
- Number of positive axillary nodes detected (numerical)
- Survival status (class attribute): 1 = the patient survived 5 years or longer; 2 = the patient died within 5 years
Dependent variable: Survival status
Independent variables: Age, year of operation, number of positive axillary nodes
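Because a chi-square test works on counts, the numerical attributes first have to be grouped into categories. The following is a minimal sketch of that step, assuming the data is stored as comma-separated rows in the attribute order listed above. The sample rows are made up for illustration and are not taken from the dataset, and the age cut-off of 50 and the helper names (`load_records`, `contingency_table`, `age_group`) are assumptions, not part of the original analysis.

```python
import csv
import io

# Illustrative rows in the four-column layout described above:
# age, year of operation (year - 1900), positive nodes, survival status.
# These values are invented for the sketch, not real dataset records.
SAMPLE = """34,60,0,1
45,62,3,2
52,65,1,1
61,59,8,2
38,66,0,1
70,63,2,1
"""

COLUMNS = ("age", "year", "nodes", "status")


def load_records(text):
    """Parse comma-separated rows into dictionaries of integers."""
    reader = csv.reader(io.StringIO(text))
    return [dict(zip(COLUMNS, map(int, row))) for row in reader if row]


def age_group(record):
    """Assign a record to an age category (cut-off of 50 is an assumption)."""
    return "under 50" if record["age"] < 50 else "50 and over"


def contingency_table(records, group_of):
    """Count records per group; each row is [status 1 count, status 2 count]."""
    counts = {}
    for record in records:
        row = counts.setdefault(group_of(record), [0, 0])
        row[record["status"] - 1] += 1
    return counts


records = load_records(SAMPLE)
table = contingency_table(records, age_group)
for group, (survived, died) in table.items():
    print(f"{group}: survived={survived}, died={died}")
```

The same `contingency_table` helper could be reused for year of operation or number of nodes by passing a different grouping function.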
Given this description of Haberman's survival dataset, we want to check whether the independent variables have an effect on the dependent variable. The hypotheses are:
H0: No independent variable affects the dependent variable.
Ha: At least one independent variable has a significant effect on the dependent variable.
We have only one sample, and the dataset contains categorical data, so I chose the chi-squared goodness-of-fit test. Because the population parameters (mean and standard deviation) are unknown, a z-test is not applicable. I used GraphPad software to calculate the chi-square statistic.
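To make the calculation concrete, here is a minimal sketch of a chi-square computation in plain Python. Note that the text names the goodness-of-fit test and GraphPad. This sketch instead uses the chi-square test of independence on a contingency table, which is the usual form for checking whether two categorical variables are associated. The observed counts are hypothetical and are not results from the Haberman data, and the function names are illustrative.

```python
import math


def chi2_sf(x, dof):
    """Upper-tail probability of the chi-square distribution.

    Uses the series expansion of the regularized lower incomplete gamma
    function, which is adequate for the moderate values seen in small tables.
    """
    if x <= 0:
        return 1.0
    a, half_x = dof / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-15 * total:
        term *= half_x / (a + n)
        total += term
        n += 1
    lower = total * math.exp(-half_x + a * math.log(half_x) - math.lgamma(a))
    return max(0.0, 1.0 - lower)


def chi_square_independence(table):
    """Return (statistic, degrees of freedom, p-value) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof, chi2_sf(statistic, dof)


# Hypothetical counts: rows are two groups, columns are the two survival classes.
observed = [[10, 20], [20, 10]]
statistic, dof, p_value = chi_square_independence(observed)
print(f"chi-square = {statistic:.3f}, dof = {dof}, p = {p_value:.4f}")
```

With a significance level of 0.05, the null hypothesis would be rejected when the p-value falls below 0.05.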
On the basis of the tests, we reject the null hypothesis. In this analysis, the dependent variable appears to be affected by two of the independent variables.
From the results, the chi-square test between age and survival rejects the null hypothesis, which suggests that age is associated with survival. The chi-square test between year of operation and survival also rejects the null hypothesis.