Run a job that checks through the raw data files on S3 and the output files on RDS to answer the following questions:
Question checks (requires resolution of raw data S3 Issue)
Return a list of questions present in the raw data but not in the output data
Return a list of questions present in the output data but not in the raw data
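The two question checks above amount to a set difference in each direction. A minimal sketch, assuming the question variables have already been loaded into two plain lists (the function name and the sample values are hypothetical, not part of the existing code base):

```python
def compare_questions(raw_questions, output_questions):
    """Return the questions found on only one side, each as a sorted list."""
    raw, out = set(raw_questions), set(output_questions)
    return {
        "raw_only": sorted(raw - out),      # in raw data, missing from output
        "output_only": sorted(out - raw),   # in output data, missing from raw
    }


# Hypothetical question variables; real ones would come from S3 and RDS.
question_diff = compare_questions(["q1", "q2", "q3"], ["q2", "q3", "q4"])
print(question_diff)
```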
Response checks (requires resolution of raw data S3 Issue)
Return a list of responses present in the raw data but not in the output data
Return a list of responses present in the output data but not in the raw data
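For responses, one possible approach is to compare (question, response) pairs rather than bare response labels, so that a response missing under one question is not masked by the same label appearing under another. This is a sketch under that assumption; the pair layout and sample values are hypothetical:

```python
def compare_responses(raw_pairs, output_pairs):
    """Compare (question, response) pairs from the raw and output data."""
    raw, out = set(raw_pairs), set(output_pairs)
    return {"raw_only": sorted(raw - out), "output_only": sorted(out - raw)}


# Hypothetical pairs; real ones would be extracted from S3 and RDS.
raw_pairs = [("q1", "Yes"), ("q1", "No"), ("q2", "Often")]
output_pairs = [("q1", "Yes"), ("q2", "Often"), ("q2", "Never")]
response_diff = compare_responses(raw_pairs, output_pairs)
print(response_diff)
```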
Time series continuity checks
Return question occurrences that show gaps in the time series, along with the start point and end point of each series. The result should be a table with these columns: question variable, week, reported_in_survey (a new column with a flag indicating whether the question is reported).
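A rough sketch of how the continuity table could be built, assuming weeks are integers and that the observed (variable, week) pairs have already been read from the output data. The function names, the input layout, and the sample data are illustrative assumptions:

```python
def continuity_table(observed, all_weeks):
    """Build one row per (variable, week) with a reported_in_survey flag.

    observed: iterable of (variable, week) pairs that appear in the data.
    all_weeks: ordered list of every survey week.
    """
    seen = set(observed)
    variables = sorted({variable for variable, _ in seen})
    return [
        {"variable": v, "week": w, "reported_in_survey": (v, w) in seen}
        for v in variables
        for w in all_weeks
    ]


def summarize_series(rows):
    """Per variable: first week, last week, and unreported weeks in between."""
    summary = {}
    for variable in sorted({r["variable"] for r in rows}):
        series = [r for r in rows if r["variable"] == variable]
        reported = [r["week"] for r in series if r["reported_in_survey"]]
        start, end = min(reported), max(reported)
        gaps = [
            r["week"]
            for r in series
            if not r["reported_in_survey"] and start < r["week"] < end
        ]
        summary[variable] = {"start": start, "end": end, "gaps": gaps}
    return summary


# Hypothetical observations: q1 skips week 3, q2 runs only in weeks 2 and 3.
observed = [("q1", 1), ("q1", 2), ("q1", 4), ("q2", 2), ("q2", 3)]
table = continuity_table(observed, all_weeks=[1, 2, 3, 4])
summary = summarize_series(table)
print(summary)
```

Weeks before a series starts or after it ends are flagged as unreported in the table but are not counted as gaps in the summary.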
Proportion checks
Report the questions and time periods for which the responses do not sum to 100%. When question_type == "Select one" or "Yes / No", summing the response value proportions by variable question ID and crosstab subgroup should give 100%.
When question_type == "Select all", each response item should equal 100% when summed with its -99 response values.
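The proportion rules above could be checked by grouping rows and summing. The sketch below assumes each row is a dict with hypothetical keys `variable`, `week`, `subgroup`, `question_type`, `item`, and `proportion` (in percent), and that for "Select all" questions each response item has one row for the selected share and one row for the -99 share. The real schema may differ.

```python
import math
from collections import defaultdict


def proportion_failures(rows, tolerance=0.5):
    """Return (group key, total) for groups whose proportions do not sum to 100."""
    totals = defaultdict(float)
    for row in rows:
        key = (row["variable"], row["week"], row["subgroup"])
        if row["question_type"] == "Select all":
            # Assumed layout: each item is checked on its own, selected + -99.
            key += (row["item"],)
        totals[key] += row["proportion"]
    return sorted(
        (key, total)
        for key, total in totals.items()
        if not math.isclose(total, 100.0, abs_tol=tolerance)
    )


def make_row(variable, week, question_type, proportion, item=None):
    """Helper for the hypothetical sample data below."""
    return {
        "variable": variable, "week": week, "subgroup": "all",
        "question_type": question_type, "item": item, "proportion": proportion,
    }


rows = [
    make_row("q1", 1, "Select one", 60.0), make_row("q1", 1, "Select one", 30.0),
    make_row("q1", 2, "Select one", 55.0), make_row("q1", 2, "Select one", 45.0),
    make_row("q2", 1, "Select all", 40.0, "a"), make_row("q2", 1, "Select all", 60.0, "a"),
    make_row("q2", 1, "Select all", 30.0, "b"), make_row("q2", 1, "Select all", 60.0, "b"),
]
failures = proportion_failures(rows)
print(failures)
```

The `tolerance` parameter is an assumption added to absorb rounding in the stored proportions; set it to a value that matches how the output data is rounded.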
Unexpected value checks
Check for missing or placeholder values such as NaN, 0, -99, -88, etc. Return the questions and dates where these occur.
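One way to flag these values is a small predicate applied to every row. In this sketch the row keys (`variable`, `date`, `value`) are hypothetical, and treating `None` and empty strings as unexpected is an assumption that goes beyond the values listed above:

```python
import math

# Placeholder codes named in the issue; NaN is handled separately because
# NaN never compares equal to itself.
SENTINELS = {0, -99, -88}


def is_unexpected(value):
    """Return True for NaN, a sentinel code, None, or an empty string."""
    if value is None or value == "":
        return True  # assumption: blanks count as missing
    if isinstance(value, float) and math.isnan(value):
        return True
    return value in SENTINELS


def unexpected_values(rows):
    """Return (variable, date, value) for every row with an unexpected value."""
    return [
        (r["variable"], r["date"], r["value"])
        for r in rows
        if is_unexpected(r["value"])
    ]


# Hypothetical rows for illustration only.
rows = [
    {"variable": "q1", "date": "2021-01-04", "value": 12.5},
    {"variable": "q1", "date": "2021-01-11", "value": float("nan")},
    {"variable": "q2", "date": "2021-01-04", "value": -99},
    {"variable": "q2", "date": "2021-01-11", "value": 0},
]
flagged = [(variable, date) for variable, date, _ in unexpected_values(rows)]
print(flagged)
```

Whether 0 is truly unexpected probably depends on the question, so the sentinel set may need to vary by question type.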
The output of this job can be a log or txt file containing descriptive information written to S3 or emailed in an attachment.
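As an illustration of the text output, the findings from each check could be collected into one plain-text report. This sketch writes to a local temporary file only; uploading to S3 or emailing the file is left out, and the function name, section names, and file name are all hypothetical:

```python
import tempfile
from pathlib import Path


def write_report(sections, path):
    """Write findings to a text file.

    sections: dict mapping a check name to a list of finding strings.
    """
    lines = []
    for name, findings in sections.items():
        lines.append(f"== {name} ({len(findings)} findings) ==")
        lines.extend(f"  {item}" for item in (findings or ["none"]))
        lines.append("")
    Path(path).write_text("\n".join(lines), encoding="utf-8")
    return Path(path)


# Hypothetical findings for illustration only.
report_path = write_report(
    {
        "Proportion checks": ["q1, week 1, subgroup all: sums to 90.0"],
        "Unexpected value checks": [],
    },
    Path(tempfile.gettempdir()) / "data_quality_report.txt",
)
print(report_path.read_text(encoding="utf-8"))
```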
Start with the time series continuity checks, proportion checks, and unexpected value checks. These require a read_pulse-like function, as described in "download data from SQL DB" #27.