Jeya Balaji Balasubramanian edited this page Jan 16, 2023 · 11 revisions

Welcome to the py-icare wiki!

Example data

Example datasets are provided in the data/ directory of this repository. Users can use them to explore the features of iCARE and the output it generates.

  • validation_cohort_data.csv: a simulated dataset of a full cohort study of 50,000 women. This dataset helps illustrate the model validation capabilities of iCARE. The variables in this dataset are as follows.

| Variable name | Description | Value encoding |
| --- | --- | --- |
| id | Subject ID | A unique identifier for each individual. |
| famhist | Family history (of breast cancer among first degree relatives) | {0: "absence", 1: "presence"} |
| menarche_dec | Age at menarche (years) | {1: <=11 (reference), 2: 11-11.5, 3: 11.5-12, 5: 12-13, 8: 13-14, 9: 14-15, 10: >=15} |
| parity | Parity (number of full-term pregnancies) | {0: nulliparous (reference), 1: 1, 2: 2, 3: 3, 4: >=4} |
| birth_dec | Age at first child birth (years) | {1: <=19 (reference), 2: 19-22, 3: 22-23, 4: 23-25, 7: 25-27, 8: 27-30, 9: 30-34, 10: 34-38, 11: >=38} |
| agemeno_dec | Age at menopause (years) | {1: <=40 (reference), 2: 40-45, 3: 45-47, 4: 47-48, 5: 48-50, 6: 50-51, 7: 51-52, 8: 52-53, 9: 53-55, 10: >=55} |
| height_dec | Height (meters) | {1: <=1.55 (reference), 2: 1.55-1.57, 3: 1.57-1.60, 4: 1.60-1.61, 5: 1.61-1.63, 6: 1.63-1.65, 7: 1.65-1.66, 8: 1.66-1.68, 9: 1.68-1.71, 10: >=1.71} |
| bmi_dec | Body mass index (kg/m²) | {1: <=21.5 (reference), 2: 21.5-23, 3: 23-24.2, 4: 24.2-25.3, 5: 25.3-26.5, 6: 26.5-27.8, 7: 27.8-29.3, 8: 29.3-31.4, 9: 31.4-34.6, 10: >=34.6} |
| rd_menohrt | Use of Hormone Replacement Therapy (HRT) | {0: "pre-menopausal" (reference), 1: "post-menopausal and never HRT user", 2: "post-menopausal and ever HRT user"} |
| rd2_everhrt_e | Use of estrogen-only therapy | {0: "otherwise", 1: "post-menopausal and ever user of estrogen-only therapy"} |
| rd2_everhrt_c | Use of estrogen + progesterone combined therapy | {0: "otherwise", 1: "post-menopausal and ever user of combined therapy"} |
| rd2_currhrt | Current use of HRT | {0: "otherwise", 1: "post-menopausal and current HRT user"} |
| alcoholdweek_dec | Alcohol (drinks/week) | {1: "none" (reference), 4: 0-0.4, 5: 0.4-0.8, 6: 0.8-1.5, 7: 1.5-3.2, 8: 3.2-5.7, 9: 5.7-9.8, 10: >9.8} |
| ever_smoke | Smoking status | {0: "never", 1: "ever"} |

  • validation_nested_case_control_data.csv: a simulated dataset of a case-control study of 5,285, nested within a cohort study ()
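The value encodings in the variable table can be applied when inspecting the example files. The following is a minimal sketch using only the Python standard library, not part of py-icare itself. It assumes that the CSV column headers match the variable names in the table; the helper name `decode_rows`, the `ENCODINGS` dictionary (a subset of the table), and the sample rows are hypothetical and exist only for illustration.

```python
import csv
import io

# Value encodings copied from the variable table (subset of columns).
ENCODINGS = {
    "famhist": {0: "absence", 1: "presence"},
    "parity": {0: "nulliparous", 1: "1", 2: "2", 3: "3", 4: ">=4"},
    "rd_menohrt": {
        0: "pre-menopausal",
        1: "post-menopausal and never HRT user",
        2: "post-menopausal and ever HRT user",
    },
    "ever_smoke": {0: "never", 1: "ever"},
}


def decode_rows(csv_file):
    """Yield each row as a dict, replacing known codes with their labels."""
    for row in csv.DictReader(csv_file):
        decoded = dict(row)
        for column, labels in ENCODINGS.items():
            raw = row.get(column)
            if raw in (None, ""):
                continue  # column absent or value empty: leave untouched
            try:
                code = int(float(raw))
            except ValueError:
                continue  # non-numeric marker (for example "NA"): leave as-is
            decoded[column] = labels.get(code, raw)
        yield decoded


# Made-up rows for illustration only; they are not taken from the real files.
sample = io.StringIO(
    "id,famhist,parity,rd_menohrt,ever_smoke\n"
    "1,0,2,1,0\n"
    "2,1,4,2,1\n"
)
decoded_sample = list(decode_rows(sample))
print(decoded_sample[0])

# To read the repository file instead (path assumed relative to the repo root):
# with open("data/validation_cohort_data.csv", newline="") as f:
#     for row in decode_rows(f):
#         ...
```

Columns without an entry in `ENCODINGS`, such as `id`, pass through unchanged, so the same helper can be extended one column at a time.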
