diff --git a/README.md b/README.md
index 94544ea..6342a10 100644
--- a/README.md
+++ b/README.md
@@ -20,10 +20,10 @@ The R library (`FourCePhase2.1Data`) in this repository contains functions that
 1. **QC for Phase1.1**. We will check the following critera:
 + Demographics:
-    + if there is missing demographic groups (e.g., missing age group);
+    + if there is missing demographics groups (e.g., missing age group);
     + if there is negative patient numbres or counts;
     + if there is number of all patients less than the number of severe patients.
-    + if the sum of different demopgrahic groups is equal to the patient numbers for "all". Please note the QC only checks the groups without obfuscation (-99). For example, if the number of patients in age_group "0to25" is -99, then the QC will not check if the sum of all age_group is equal to the patient numbers for "all".
+    + if the sum of different demographics groups is equal to the patient numbers for "all". Please note the QC only checks the groups without obfuscation (-99). For example, if the number of patients in age_group "0to25" is -99, then the QC will not check if the sum of all age_group is equal to the patient numbers for "all".
 + Medications and Diagnoses:
     + if there is any code not belong to the list of medclass or ICD codes;
     + if there is negative patient numbers or counts;
@@ -33,24 +33,24 @@ The R library (`FourCePhase2.1Data`) in this repository contains functions that
     + if there is negative patients nubmers;
     + if there is negative measures in the original scale.
 + Cross over comparison:
-    + if the number of patients in DailyCounts, ClincalCourse, and Demographic are consistent;
-    + if the number of severe patients in DailyCounts, ClinialCourse, and Demographic are consistent;
+    + if the number of patients in DailyCounts, ClincalCourse, and Demographics are consistent;
+    + if the number of severe patients in DailyCounts, ClinialCourse, and Demographics are consistent;
     + if in any of Medications, Diagnoses or Labs, the number of patients with the code is greater than number of all patients.
 + Lab units: check if the lab measures are consistently outside the confidence range.
 2. **QC for Phase2.1**. We will check if the summary statistics obtained from Phase 2.1 patient-level data is consistent with Phase1.1 aggregated data:
-+ Demographcis:
-    + if there is duplicated row for the same demographic group;
++ Demographics:
+    + if there is duplicated row for the same demographics group;
     + if n_all_before, n_all_since, n_severe_before, n_severe_since are different between phase2.1 and phase1.1;
 + Labs:
     + if there is duplicated row for the same lab on the same day but with different counts/measures;
-    + if n_all, mean_all, stdev_all, n_severe, mean_severe, stdev_severe at day0 are different between phase2.1 and phase1.1. Because the definitions of the date at admission are slightly different between Phase1.1 and Phase2.1, an error of less than 1% is allowed. For example, n_all from phase1.1 should be within the range (0.99n_all,1.01n_all) from phase2.1
+    + if n_all, mean_all, stdev_all, n_severe, mean_severe, stdev_severe at day0 are different between phase2.1 and phase1.1. Because the definitions of the date at admission are slightly different between Phase1.1 and Phase2.1, an error of less than 1% is allowed. For example, n_all from phase1.1 should be within the range (0.99n_all,1.01n_all) from phase2.1.
 + Medications:
     + if there is duplicated row for the same MEDCLASS on the same day but with different counts;
-    + if n_all_before, n_all_since, n_severe_before, n_severe_since are different between phase2.1 and phase1.1. Because the definitions of the date at admission are slightly different between Phase1.1 and Phase2.1, an one day leevay is allowed. For example, n_all_since from phase1.1 should be within the range (n_all for day>0, n_all for day>=0) from phase2.1.
+    + if n_all_before, n_all_since, n_severe_before, n_severe_since are different between phase2.1 and phase1.1. Because the definitions of the date at admission are slightly different between Phase1.1 and Phase2.1, an one day leeway is allowed. For example, n_all_since from phase1.1 should be within the range (n_all for day>0, n_all for day>=0) from phase2.1.
 + Diagnoses:
     + if there is duplicated row for the same ICD code on the same day but with different counts;
-    + if n_all_before, n_all_since, n_severe_before, n_severe_since are different between phase2.1 and phase1.1. Because the definitions of the date at admission are slightly different between Phase1.1 and Phase2.1, an one day leevay is allowed. For example, n_all_since from phase1.1 should be within the range (n_all for day>0, n_all for day>=0) from phase2.1.
+    + if n_all_before, n_all_since, n_severe_before, n_severe_since are different between phase2.1 and phase1.1. Because the definitions of the date at admission are slightly different between Phase1.1 and Phase2.1, an one day leeway is allowed. For example, n_all_since from phase1.1 should be within the range (n_all for day>0, n_all for day>=0) from phase2.1.
 + ClinicalCourse:
     + if there is duplicated row for days_since_admission but with different counts;
     + if num_patients_all_still_in_hospital, num_patients_ever_severe_still_in_hospital are different between phase2.1 and phase1.1;
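
The tolerance rules described in this README (the skipped sum check for obfuscated groups, the 1% range for Labs, and the one-day leeway for Medications and Diagnoses) can be sketched as below. This is only an illustration in Python, not the package's R implementation. The function names and the `phase2_days` input (one days-since-admission value per patient) are hypothetical, the 1% range is read as a strict open interval, and the one-day leeway bounds are assumed to be inclusive.

```python
OBFUSCATED = -99  # obfuscation marker named in the README


def sum_matches_all(group_counts, n_all):
    """Compare the sum of the group counts with the count for "all".

    Returns None when any group is obfuscated, because the README says
    the check is skipped in that case.
    """
    if any(count == OBFUSCATED for count in group_counts):
        return None
    return sum(group_counts) == n_all


def within_one_percent(phase1_value, phase2_value):
    """True if the Phase1.1 value lies strictly inside
    (0.99 * phase2_value, 1.01 * phase2_value)."""
    # sorted() keeps the bounds in order if the statistic is negative.
    low, high = sorted((0.99 * phase2_value, 1.01 * phase2_value))
    return low < phase1_value < high


def within_one_day_leeway(phase1_n_since, phase2_days):
    """True if the Phase1.1 "since admission" count lies between the
    Phase2.1 counts for day > 0 and day >= 0 (bounds assumed inclusive)."""
    lower = sum(1 for day in phase2_days if day > 0)
    upper = sum(1 for day in phase2_days if day >= 0)
    return lower <= phase1_n_since <= upper
```

With `phase2_days = [-3, 0, 0, 1, 5]`, two patients have day > 0 and four have day >= 0, so a Phase1.1 `n_all_since` of 2, 3, or 4 would pass under these assumptions, while 1 or 5 would be flagged.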