An FME- and Python based workflow to import ZTRAX into a set of SQLite databases, and from there into county-level CSV files.
Johannes H. Uhl, Earthlab, Cooperative Institute for Research in Environmental Sciences (CIRES) & Institute of Behavioral Science (IBS), University of Colorado Boulder, December 2021.
Requirements: Python 3.7 & Safe Software Feature Manpulation Engine (tested for v2017 and v2019).
- Extract ZTRAX ZIP into subfolder.
- Extract each state zip into a state subfolder (eg all csvs in 01.zip must be in \01*.csv).7-ZIP can do that using the Extract to * command.
- Copy these subfolders in a separate folder “CSV”. Check the scripts and create all subfolders specified, and adjust paths.
- Prepare script 01_generate_SQLite_databases_separate.py: Set zipfolder path and dbpath. Also, place the metadata files in the same directory as the script.
- Open the Layout.xlsx and copy the ZAsmt tab into a csv called Layout_ZAsmt.csv in the scripts folder – likewise for zTrans. Also copy the
- Run 01_generate_SQLite_databases_separate.py. This will generate empty SQLITE databases in the db_path folder.
- Run 02_append_headers_to_csv_files.py. Set paths in script prior to that. This will append the column headers to each csv file.
- In the first version imported in 2017, some “NULL” characters in specific csv files caused FME to crash. The script 03_remove_NUL_characters.py will search for these characters and replace them by “0”. If these errors occur in 10), this script can be used to clean the csv files.
- Copy the two .fmw FME workbench files into a subfolder .\TEMPLATES. Then adjust paths in the script 04_generate_batch_workspaces.py. this script will take the FMW files in the TEMPLATE folder, and adjust the paths for each state-level database. It also creates a batchfile in .\BATCH_all.bat.
- Run the _all.bat. This will import the data into SQLite databases (XX_ZTrans_cont_SQLite.db and XX_ZAsmt_cont_SQLite.db, for each state ID XX).
- Run 05_ZTRAX_set_index.py. This will set indices on database columns to speed up subsequent extractions.
- Run 06_ZTRAX_ZAsmt_relevant_attributes_SQLite2CSV.py to generate county-level extractions of ZTRAX attributes of interest. Implemented only for ZAsmt and for a set of around 50 specific attributes from different ZAsmt tables. There will be one csv files per county, at the building area level (UID: RowID + BuildingOrImprovementNumber + BuildingAreaStndCode). From there, users can aggregate the data to the building and property level.