
Data Manipulation

Jana Gombitova edited this page Nov 9, 2016 · 1 revision

Recompute summaries

To recompute all surveyQuestionSummary objects, use the URL below. This runs on a backend, so there is no risk of a timeout.

http://flowaglimmerofhope.appspot.com/webapp/testharness?action=rebuildQuestionSummary&bypassBackend=false

For a single survey: http://flowaglimmerofhope.appspot.com/webapp/testharness?action=rebuildQuestionSummary&surveyId=507048&bypassBackend=true
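The two URLs above differ only in their query parameters, so they can be built programmatically. The following is a minimal sketch, not part of the FLOW codebase: the function name `rebuild_summary_url` is hypothetical, and the pairing of `surveyId` with `bypassBackend=true` simply mirrors the two examples above.

```python
from urllib.parse import urlencode


def rebuild_summary_url(host, survey_id=None):
    """Build a rebuildQuestionSummary URL (hypothetical helper).

    Without survey_id, the URL targets all summaries and uses the backend.
    With survey_id, it mirrors the single-survey example and bypasses it.
    """
    params = {"action": "rebuildQuestionSummary"}
    if survey_id is not None:
        params["surveyId"] = str(survey_id)
        params["bypassBackend"] = "true"
    else:
        params["bypassBackend"] = "false"
    return "http://{}/webapp/testharness?{}".format(host, urlencode(params))


if __name__ == "__main__":
    host = "flowaglimmerofhope.appspot.com"
    print(rebuild_summary_url(host))
    print(rebuild_summary_url(host, survey_id=507048))
```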

fixOptions2Values

This action fixes incorrect types in questionAnswerStore objects. Usage: http://FLOWINSTANCE_HOST/rest/actions?action=fixOptions2Values

The problem this action addresses is that when cleaned data is uploaded from an Excel file, the type of each answer is set according to the type of the question, whereas the device sets the type according to a different convention.

The action runs through all the questionAnswerStore objects in the data store and makes the following substitutions in the type field:

  • PHOTO => IMAGE
  • VIDEO => VIDEO
  • FREE_TEXT => VALUE
  • OPTION => VALUE
  • GEO => GEO
  • DATE => DATE
  • NUMBER => VALUE
  • SCAN => VALUE

The action handles 500 items in one call, and invokes new tasks as necessary if there are more items.
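The substitution table and the 500-item batching can be illustrated as follows. This is only a sketch in Python under stated assumptions: the real action runs inside the FLOW instance against the data store, whereas here the answers are plain dictionaries in a list, and the names `TYPE_SUBSTITUTIONS` and `fix_types` are hypothetical.

```python
# Substitutions listed above; identity entries are kept for completeness.
TYPE_SUBSTITUTIONS = {
    "PHOTO": "IMAGE",
    "VIDEO": "VIDEO",
    "FREE_TEXT": "VALUE",
    "OPTION": "VALUE",
    "GEO": "GEO",
    "DATE": "DATE",
    "NUMBER": "VALUE",
    "SCAN": "VALUE",
}

BATCH_SIZE = 500  # items handled per call, as described above


def fix_types(answers, start=0, batch_size=BATCH_SIZE):
    """Fix one batch in place.

    Returns the index where the next call should start, or None when
    every item has been processed. Unknown types are left unchanged.
    """
    end = min(start + batch_size, len(answers))
    for answer in answers[start:end]:
        answer["type"] = TYPE_SUBSTITUTIONS.get(answer["type"], answer["type"])
    return end if end < len(answers) else None


if __name__ == "__main__":
    answers = [{"type": "PHOTO"} for _ in range(1200)]
    cursor = 0
    while cursor is not None:  # stands in for scheduling follow-up tasks
        cursor = fix_types(answers, cursor)
    print(all(a["type"] == "IMAGE" for a in answers))
```

The loop in the usage example plays the role of the follow-up tasks: each call processes at most one batch and reports where the next one should resume.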

PDF generation scripts

For PDF generation work, we are experimenting with Python to manipulate the data and create PDFs using the ReportLab library.

The script is called like this: python create_report.py data.csv

  1. The script uses the ReportLab PDF building library, which needs to be installed: http://www.reportlab.com/software/opensource/

  2. The csv file is created from the raw data report. It needs to be in the same directory as the Python script.

  3. The data mapping is based on the columns of the verification survey. This needs to be tested: I am not sure if the right columns are displayed, and if the words are correct.

  4. If a report needs to be made for other types of surveys, you will need to adapt the mapping to the rows in the csv, and the wording of the columns.

  5. Before downloading an image from S3, the script tests whether it exists at that location. If it doesn't, the script substitutes a placeholder image named image.jpg, which it expects to find in the same directory. It is probably better to write an adapted script which first checks whether any images are missing, so the enumerators / field managers can try to get them off the phones before the report is generated.
