Process for adding new DSRA scenarios

William Chow edited this page Nov 30, 2023 · 18 revisions

Status: Work-in-progress early draft, last updated on June 8, 2023

1. Design a new DSRA scenario and run OpenQuake

This is usually done by our Seismic Risk Research Scientist Tiegan E. Hobbs and helpers (e.g. Earthquake Scenarios Analyst Jeremy Rimando).

  1. Design new DSRA scenarios.

  2. Run OpenQuake (and related scripts) to generate scenario Hazard, Damage, Consequence, and Risk outputs using the Canadian datasets and functions produced by NRCan and partners.

    This is done by running scripts/run_OQStandard.sh which calls OpenQuake (currently v3.11) and other post-processing Python scripts (consequences? weighted average?) to complete calculations resulting in HDF5 DataStores (in oqdata) and CSV files.

    This can be run on any compatible machine, whether your local computer (e.g. personal MacBook, physical Linux installation, Linux in WSL2, Linux in VirtualBox, etc.) or in an AWS EC2 instance.

  3. Upload (git push) the results to the FINISHED directory in the OpenDRR/earthquake-scenarios repository.

Please refer to DraftOF_RunningOQCanada.pdf (Hobbs, T.E., Journeay, J.M., Rotheram, D., 2022?. A Mini Manual for Running OpenQuake Canada, Draft Open File) for more information. The following table is extracted from page 5 of that manual, with minor modifications.

Table 1: An example set of the files used to run OpenQuake.

Repository: OpenDRR/earthquake-scenarios (public)

  Run Script             scripts/run_OQStandard.sh
  Hazard Initialization  initializations/s_Hazard_SIM9p0_CascadiaInterfaceBestFault.ini
  Damage Initialization  initializations/s_Damage_SIM9p0_CascadiaInterfaceBestFault_b0_b.ini
                         initializations/s_Damage_SIM9p0_CascadiaInterfaceBestFault_r1_b.ini
  Risk Initialization    initializations/s_Risk_SIM9p0_CascadiaInterfaceBestFault_b0_b.ini
                         initializations/s_Risk_SIM9p0_CascadiaInterfaceBestFault_r1_b.ini
  Rupture                ruptures/rupture_SIM9p0_CascadiaInterfaceBestFault.xml
  Consequence            scripts/consequences-v3.10.0.py
                         scripts/Hazus_Consequence_Parameters.xlsx

Repository: OpenDRR/openquake-inputs (public)

  Sites                  earthquake/sites/regions/site-vgrid_BC.csv
  Fragility              earthquake/vulnerability/structural_fragility_CAN.xml
  Vulnerability          earthquake/vulnerability/vulnerability_contents_CAN.xml
                         earthquake/vulnerability/vulnerability_nonstructural_CAN.xml
                         earthquake/vulnerability/vulnerability_occupants_CAN.xml
                         earthquake/vulnerability/vulnerability_structural_CAN.xml
  Mapping                earthquake/vulnerability/CanSRM1_TaxMap_b0.csv
                         earthquake/vulnerability/CanSRM1_TaxMap_r1.csv
  Exposure               exposure/general-building-stock/oqBldgExp_BC.xml
                         exposure/general-building-stock/oqBldgExp_BC.csv

Repository: OpenDRR/CanadaSHM6 (currently in private trial)

  GMPE                   OpenQuake_model_files/gmms/LogicTree/OQ_classes_NGASa0p3weights_interface.xml

2. Fix-up and create PR for new DSRA scenarios (e.g. by Jeremy)

Background

  • Anthony simply duplicated Jeremy Rimando’s Google Drive structure and made no attempt to make it fit OpenDRR/earthquake-scenarios
  • Damon created OpenDRR/DSRA-processing to automate the fix-up and addition of these scenarios to OpenDRR/earthquake-scenarios
  • Tiegan checked the validity of each scenario, and determined that 17 of the scenarios are good for publishing right now.
  • 4 of the scenarios were added in October (?) 2022; 3 of them were subsequently renamed (awaiting PR review in late March 2023):
    1. ACM7p4_BurwashLanding → ACM7p4_DenaliFault (No. 3)
    2. ACM4p9_Capilano5 → ACM4p9_GeorgiaStraitFault (No. 4)
    3. SCM5p0_Montreal (No. 7)
    4. SCM5p5_Ottawa → SCM5p5_ConstanceBay (No. 9)
  • 13 are being added (March – April 2023), see https://github.com/OpenDRR/earthquake-scenarios/issues/74#issuecomment-1410705741

DSRA-processing

For processing of new scenarios before moving them to earthquake-scenarios repo

Steps:

  1. Create a new branch named in the format "scenarios-Y.M.D.Time", e.g. scenarios-2022.09.13.1126

  2. Add scenario files to their respective directories:

    • 7 files (.csv or .xz) per scenario to /FINISHED: s_consequences b0 and r1, s_dmgbyasset b0 and r1, s_lossesbyasset b0 and r1, and s_shakemap
    • 5 files (.ini) per scenario to /initializations: s_Damage b0 and r1, s_Hazard, s_Risk b0 and r1
    • One rupture_ .xml file per scenario to /ruptures
  3. Delete the README.md files in those three directories.

  4. Push the new branch. A GitHub Action will process the files and open a PR to merge the new scenarios into the earthquake-scenarios repo.

  5. Branch can then be used for stack build to generate data for added scenarios.

What Anthony learned from Damon’s observations and insights:

  • "One rupture_ .xml file per scenario to /ruptures", so that's why Tiegan uses the rupture file name (and not the initialization files) to refer to certain events!

Brainstorming: How Anthony deviates from Damon’s procedure (or to elaborate on the procedure)

  • Instead of scenarios-Y.M.D.Time, I tend to use what I call the "codename" of the event.
  • Try to create a branch for each scenario (like Damon does), but will see if I can "stack" (rebase) them on top of one another to create one big PR
  • If there are any large files, may need to commit them over several commits... (maybe not needed here because we are using Git LFS?) to be confirmed
  • TODO

How did Damon do it for the 5 new scenarios added in October 2022?

Summary:

@DamonU2 added five new scenarios in PR OpenDRR/earthquake-scenarios#69 and removed "Duncan" in OpenDRR/earthquake-scenarios#70 circa October 6 - 13, 2022.

Details (Retracing):

At OpenDRR/earthquake-scenarios:

How is Anthony doing it for the 13 new scenarios

(Note: It is a bit manual and ad hoc (?), but we may be able to streamline/automate the process in the future.)

  1. In the ~/NRCan/OpenDRR/earthquake-scenarios-2022-jrimando-google-drive directory, run csv-to-yaml.py (which was created by telling ChatGPT to “Write a Python script that converts CSV with rows containing specific value to YAML using ruamel.yaml” and making minor fixes and modifications):

    #!/usr/bin/python3
    
    import csv
    import ruamel.yaml
    
    # Open the CSV file
    with open('ScenarioReview_TH_Oct2022-2023-02-22.csv', newline='') as csvfile:
        # Read the CSV data into a list of dictionaries
        reader = csv.DictReader(csvfile)
    
        # Filter out already committed scenarios and non-approved scenarios
        data = []
        for row in reader:
            if row['Scenario Num'] not in ['3', '4', '7', '9', '']:
                data.append(row)
    
    # Convert the data to YAML format using ruamel.yaml
    yaml = ruamel.yaml.YAML()
    yaml.indent(mapping=2, sequence=4, offset=2)
    
    # Write the YAML data to a file
    with open('output.yaml', 'w') as yamlfile:
        yaml.dump(data, yamlfile)

    Then, do some quick sanity checks:

    $ grep EVENT: output.yaml
      - EVENT: rupture_ACM5p2_Abbotsford_syst.xml
      - EVENT: rupture_ACM5p0_Burnaby_syst.xml
      - EVENT: rupture_ACM5p7_Ladysmith_syst.xml
      - EVENT: rupture_ACM4p9_MatsquiMain2_syst.xml
      - EVENT: rupture_SCM5p9_Montreal_syst.xml
      - EVENT: rupture_SCM5p6_Ottawa_syst.xml
      - EVENT: rupture_ACM5p2_PortAlberni_syst.xml
      - EVENT: rupture_ACM7p7_QueenCharlotte_syst.xml
      - EVENT: rupture_ACM5p0_Richmond_syst.xml
      - EVENT: rupture_ACM8p0_SkeenaQueenCharlotteE_syst.xml
      - EVENT: rupture_SCM5p0_Toronto_BTSZ_syst.xml
      - EVENT: rupture_SCM5p0_Toronto_CMMBZ_syst.xml
      - EVENT: rupture_ACM5p5_Tsussie6_syst.xml
    $ grep EVENT: output.yaml | wc
         13      39     607

Get the actual “old” codenames named by Jeremy:

$ grep EVENT: output.yaml | cut -f2- -d_ | sed 's/_syst.xml$//' > 13-scenarios-jeremy-codename.txt
$ cat 13-scenarios-jeremy-codename.txt
ACM5p2_Abbotsford
ACM5p0_Burnaby
ACM5p7_Ladysmith
ACM4p9_MatsquiMain2
SCM5p9_Montreal
SCM5p6_Ottawa
ACM5p2_PortAlberni
ACM7p7_QueenCharlotte
ACM5p0_Richmond
ACM8p0_SkeenaQueenCharlotteE
SCM5p0_Toronto_BTSZ
SCM5p0_Toronto_CMMBZ
ACM5p5_Tsussie6

More checks: To see if we have an identical number of files related to each scenario:

$ for i in $(cat 13-scenarios-jeremy-codename.txt); do printf "%-28s" "$i"; find . -name "*${i}*" -type f | wc; done
ACM5p2_Abbotsford                19      19    2197
ACM5p0_Burnaby                   19      19    2050
ACM5p7_Ladysmith                 19      19    2142
ACM4p9_MatsquiMain2              19      79    2385
SCM5p9_Montreal                  19      19    2088
SCM5p6_Ottawa                    19      19    1995
ACM5p2_PortAlberni               19      37    2279
ACM7p7_QueenCharlotte            19      37    2429
ACM5p0_Richmond                  19      19    2096
ACM8p0_SkeenaQueenCharlotteE     19      91    2898
SCM5p0_Toronto_BTSZ              19      19    2227
SCM5p0_Toronto_CMMBZ             19      19    2272
ACM5p5_Tsussie6                  19      61    2147

Check the files found for one scenario:

$ find . -name '*ACM5p2_Abbotsford*' -type f
./outputs/Systematic/Abbotsford-Systematic/ACM5p2_Abbotsford_syst/s_sitemesh_ACM5p2_Abbotsford_syst_1.csv.xz
./outputs/Systematic/Abbotsford-Systematic/ACM5p2_Abbotsford_syst/s_gmfdata_ACM5p2_Abbotsford_syst_1.csv.xz
./outputs/Systematic/Abbotsford-Systematic/ACM5p2_Abbotsford_syst/s_lossesbyasset_ACM5p2_Abbotsford_syst_r1_5_b.csv.xz
./outputs/Systematic/Abbotsford-Systematic/ACM5p2_Abbotsford_syst/s_Damage_ACM5p2_Abbotsford_syst_b0r1_b.log
./outputs/Systematic/Abbotsford-Systematic/ACM5p2_Abbotsford_syst/s_dmgbyasset_ACM5p2_Abbotsford_syst_r1_3_b.csv.xz
./outputs/Systematic/Abbotsford-Systematic/ACM5p2_Abbotsford_syst/s_consequences_ACM5p2_Abbotsford_syst_b0_2_b.csv.xz
./outputs/Systematic/Abbotsford-Systematic/ACM5p2_Abbotsford_syst/s_Risk_ACM5p2_Abbotsford_syst_b0r1_b.log
./outputs/Systematic/Abbotsford-Systematic/ACM5p2_Abbotsford_syst/s_Hazard_ACM5p2_Abbotsford_syst.log
./outputs/Systematic/Abbotsford-Systematic/ACM5p2_Abbotsford_syst/s_consequences_ACM5p2_Abbotsford_syst_r1_3_b.csv.xz
./outputs/Systematic/Abbotsford-Systematic/ACM5p2_Abbotsford_syst/s_shakemap_ACM5p2_Abbotsford_syst_1.csv.xz
./outputs/Systematic/Abbotsford-Systematic/ACM5p2_Abbotsford_syst/s_dmgbyasset_ACM5p2_Abbotsford_syst_b0_2_b.csv.xz
./outputs/Systematic/Abbotsford-Systematic/ACM5p2_Abbotsford_syst/s_lossesbyasset_ACM5p2_Abbotsford_syst_b0_4_b.csv.xz
./initializations/Systematic/rupture_ACM5p2_Abbotsford_syst.csv
./initializations/Systematic/Abbotsford_ScenarioIni_Files_Systematic/ACM5p2_Abbotsford_syst/s_Damage_ACM5p2_Abbotsford_syst_b0_b.ini
./initializations/Systematic/Abbotsford_ScenarioIni_Files_Systematic/ACM5p2_Abbotsford_syst/s_Risk_ACM5p2_Abbotsford_syst_b0_b.ini
./initializations/Systematic/Abbotsford_ScenarioIni_Files_Systematic/ACM5p2_Abbotsford_syst/rupture_ACM5p2_Abbotsford_syst.xml
./initializations/Systematic/Abbotsford_ScenarioIni_Files_Systematic/ACM5p2_Abbotsford_syst/s_Risk_ACM5p2_Abbotsford_syst_r1_b.ini
./initializations/Systematic/Abbotsford_ScenarioIni_Files_Systematic/ACM5p2_Abbotsford_syst/s_Damage_ACM5p2_Abbotsford_syst_r1_b.ini
./initializations/Systematic/Abbotsford_ScenarioIni_Files_Systematic/ACM5p2_Abbotsford_syst/s_Hazard_ACM5p2_Abbotsford_syst.ini

Copy the content of 13-scenarios-jeremy-codename.txt into the SCENARIOS_JEREMY_CODENAME SCENARIOS array definition in ~/NRCan/OpenDRR/DSRA-processing/copy-new-scenarios-from-jrimando-repo.sh.

Disable the git commit and push stuff. We’ll be doing that manually.

After fixing up copy-new-scenarios-from-jrimando-repo.sh to my satisfaction, I ran it, and it copied the files for all 13 scenarios.

Thanks to Damon’s helpful tips in README.md, we know that we should have:

  • 7 × 13 = 91 files in FINISHED/ directory
  • 5 × 13 = 65 files in initializations/ directory
  • 1 × 13 = 13 files in ruptures/ directory
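
The per-directory counts above can also be checked per scenario. The following is a minimal sketch, not part of DSRA-processing: the function name find_miscounts is hypothetical, and it assumes that every file name contains the scenario codename preceded by an underscore and followed by an underscore or a dot, as in the file listing earlier on this page.

```python
import re
from pathlib import Path

# Expected number of files per scenario in each directory (from Damon's README).
EXPECTED = {"FINISHED": 7, "initializations": 5, "ruptures": 1}


def find_miscounts(root, scenarios):
    """Return (directory, scenario, found, expected) for every count mismatch."""
    problems = []
    for directory, expected in EXPECTED.items():
        names = [p.name for p in (Path(root) / directory).iterdir() if p.is_file()]
        for scenario in scenarios:
            # Require "_" before and "_" or "." after the codename, so that
            # ACM7p7_QueenCharlotte does not also match ACM7p7_QueenCharlotteFault.
            pattern = re.compile(rf"_{re.escape(scenario)}[_.]")
            found = sum(1 for name in names if pattern.search(name))
            if found != expected:
                problems.append((directory, scenario, found, expected))
    return problems
```

Calling find_miscounts(".", codenames), with codenames read from 13-scenarios-jeremy-codename.txt, should return an empty list when all 13 scenarios were copied completely.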

I then run:

xz -dv FINISHED/*.xz

to decompress the *.csv.xz files to *.csv.

Prepare for commits

Create our new branch add-13-scenarios-apr2023 on top of the rename-and-organize branch which is awaiting review and approval.

git switch rename-and-organize
git switch -c add-13-new-scenarios

Move everything over to OpenDRR/earthquake-scenarios:

cd ../DSRA-processing
mv -vi FINISHED/*.csv ../earthquake-scenarios/FINISHED/
mv -vi initializations/*.ini ../earthquake-scenarios/initializations/
mv -vi ruptures/*.xml ../earthquake-scenarios/ruptures/

Back at OpenDRR/earthquake-scenarios, check if the files are copied correctly:

(7 + 5 + 1) files/scenario × 13 scenarios
= 13 files/scenario × 13 scenarios
= 169 files

$ git status | egrep 'FINISHED|initializations|ruptures' | wc
    171     171    9735
(I had two other unrelated files lying around)

git add

git add FINISHED/*.csv initializations/*.ini ruptures/*.xml

This will take some time (almost 7 minutes on my computer over NFS) due to git-lfs copying the CSV files into .git/lfs/objects. The directory will grow quite a bit in size, 13GiB perhaps?

Check number of files again:

$ git status | grep "new file:" | wc
    169     507   11714

git commit

Prepare for commit message:

ACM5p2_Abbotsford            → ACM5p2_VedderFault
ACM5p0_Burnaby               → ACM5p0_MysteryLake
ACM5p7_Ladysmith             → ACM5p7_SoutheyPoint
ACM4p9_MatsquiMain2          → ACM4p9_VedderFault
SCM5p9_Montreal              → SCM5p9_MillesIlesFault
SCM5p6_Ottawa                → SCM5p6_GloucesterFault
ACM5p2_PortAlberni           → ACM5p2_BeaufortFault
ACM7p7_QueenCharlotte        → ACM7p7_QueenCharlotteFault
ACM5p0_Richmond              → ACM5p0_GeorgiaStraitFault
ACM8p0_SkeenaQueenCharlotteE → ACM8p0_QueenCharlotteFault
SCM5p0_Toronto_BTSZ          → SCM5p0_BurlingtonTorontoStructuralZone
SCM5p0_Toronto_CMMBZ         → SCM5p0_RougeBeach
ACM5p5_Tsussie6              → ACM5p5_SoutheyPoint

Actual rename

(very ugly renaming by Anthony (me); should automate more but too sleepy-head to do so...)

sed -E 's/ACM5p2_Abbotsford(_syst)?/ACM5p2_VedderFault/;
        s/ACM5p0_Burnaby(_syst)?/ACM5p0_MysteryLake/;
        s/ACM5p7_Ladysmith(_syst)?/ACM5p7_SoutheyPoint/;
        s/ACM4p9_MatsquiMain2(_syst)?/ACM4p9_VedderFault/;
        s/SCM5p9_Montreal(_syst)?/SCM5p9_MillesIlesFault/;
        s/SCM5p6_Ottawa(_syst)?/SCM5p6_GloucesterFault/;
        s/ACM5p2_PortAlberni(_syst)?/ACM5p2_BeaufortFault/;
        s/ACM7p7_QueenCharlotte(_syst)?/ACM7p7_QueenCharlotteFault/;
        s/ACM5p0_Richmond(_syst)?/ACM5p0_GeorgiaStraitFault/;
        s/ACM8p0_SkeenaQueenCharlotteE(_syst)?/ACM8p0_QueenCharlotteFault/;
        s/SCM5p0_Toronto_BTSZ(_syst)?/SCM5p0_BurlingtonTorontoStructuralZone/;
        s/SCM5p0_Toronto_CMMBZ(_syst)?/SCM5p0_RougeBeach/;
        s/ACM5p5_Tsussie6(_syst)?/ACM5p5_SoutheyPoint/;' \
     -i initializations/*_ACM5p2_Abbotsford_*.ini \
        initializations/*_ACM5p0_Burnaby_*.ini \
        initializations/*_ACM5p7_Ladysmith_*.ini \
        initializations/*_ACM4p9_MatsquiMain2_*.ini \
        initializations/*_SCM5p9_Montreal_*.ini \
        initializations/*_SCM5p6_Ottawa_*.ini \
        initializations/*_ACM5p2_PortAlberni_*.ini \
        initializations/*_ACM7p7_QueenCharlotte_*.ini \
        initializations/*_ACM5p0_Richmond_*.ini \
        initializations/*_ACM8p0_SkeenaQueenCharlotteE_*.ini \
        initializations/*_SCM5p0_Toronto_BTSZ_*.ini \
        initializations/*_SCM5p0_Toronto_CMMBZ_*.ini \
        initializations/*_ACM5p5_Tsussie6_*.ini

Copied name_change.py over to OpenDRR/earthquake-scenarios temporarily and ran:

./name_change.py ACM5p2_Abbotsford      ACM5p2_VedderFault
./name_change.py ACM5p0_Burnaby ACM5p0_MysteryLake
./name_change.py ACM5p7_Ladysmith       ACM5p7_SoutheyPoint
./name_change.py ACM4p9_MatsquiMain2    ACM4p9_VedderFault
./name_change.py SCM5p9_Montreal        SCM5p9_MillesIlesFault
./name_change.py SCM5p6_Ottawa  SCM5p6_GloucesterFault
./name_change.py ACM5p2_PortAlberni     ACM5p2_BeaufortFault
./name_change.py ACM7p7_QueenCharlotte  ACM7p7_QueenCharlotteFault
./name_change.py ACM5p0_Richmond        ACM5p0_GeorgiaStraitFault
./name_change.py ACM8p0_SkeenaQueenCharlotteE   ACM8p0_QueenCharlotteFault
./name_change.py SCM5p0_Toronto_BTSZ    SCM5p0_BurlingtonTorontoStructuralZone
./name_change.py SCM5p0_Toronto_CMMBZ   SCM5p0_RougeBeach
./name_change.py ACM5p5_Tsussie6        ACM5p5_SoutheyPoint

Probably better if I changed name_change.py to run git mv directly, but it is easily remedied by running git add to let git know of the deleted and added files, which git will figure out to be renames:

git add FINISHED/*.csv initializations/*.ini ruptures/*.xml

(wrong command, shouldn’t have used wildcards, and may need to add -A for --no-ignore-removal too, see below)

This took a while again as git-lfs is likely verifying all the large CSV files. Fortunately, as the contents of the CSV files are unchanged, no extra disk space is needed.

Sorry, the command above was not entirely correct; the deleted files were not accounted for. Fixed with:

git add -A FINISHED initializations ruptures

I personally had to run the following command to unstage a random file that I had lying around:

git restore --staged FINISHED/geopackages/list.txt

Check number of changed files (should be 13×13=169):

$ git status | egrep ':    (FINISHED|initializations|ruptures)/' | wc
    169     676   22492
$ du -csh .git/lfs/objects/
31G	.git/lfs/objects/
31G	total

Sanity check:

ls -l ruptures/rupture_ACM5p0_GeorgiaStraitFault.xml

Uh oh, file not found! Turns out there is instead ruptures/rupture_ACM5p0_GeorgiaStraitFault_syst.xml... (uh oh...) I forgot to remove _syst from the path (because I either forgot or decided not to use process_scenarios.py).

Quick-and-dirty fix:

for i in $(find ruptures/ FINISHED/ initializations/ -name '*_syst*'); do
    git mv "$i" "${i/_syst/}"
done

which also took 7 minutes over NFS on my computer.

Next, we run process_scenarios.py (written by Damon). This script:

  1. Decompresses FINISHED/*.csv.xz files (if any exist; I did it manually with xz -dv FINISHED/*.xz)
  2. Cleans up file names:
    1. Removes _syst from file names.
    2. (For the next step) Populates the scenarios array
  3. Runs “snapshot” on each scenario, see utilities/snapshot.py

To list the old and new codenames side by side from output.yaml:

yq -r '.[] | ( .EVENT | sub("rupture_"; "") | sub("_syst.xml"; "") ) + "\t" + ."Alternate Names"' output.yaml

gives:

ACM5p2_Abbotsford	ACM5p2_VedderFault
ACM5p0_Burnaby	ACM5p0_MysteryLake
ACM5p7_Ladysmith	ACM5p7_SoutheyPoint
ACM4p9_MatsquiMain2	ACM4p9_VedderFault
SCM5p9_Montreal	SCM5p9_MillesIlesFault
SCM5p6_Ottawa	SCM5p6_GloucesterFault
ACM5p2_PortAlberni	ACM5p2_BeaufortFault
ACM7p7_QueenCharlotte	ACM7p7_QueenCharlotteFault
ACM5p0_Richmond	ACM5p0_GeorgiaStraitFault
ACM8p0_SkeenaQueenCharlotteE	ACM8p0_QueenCharlotteFault
SCM5p0_Toronto_BTSZ	SCM5p0_BurlingtonTorontoStructuralZone
SCM5p0_Toronto_CMMBZ	SCM5p0_RougeBeach
ACM5p5_Tsussie6	ACM5p5_SoutheyPoint
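
The tab-separated listing above is enough to drive the rename without retyping each pair. The following is a minimal sketch under stated assumptions: parse_mapping and rename are hypothetical helper names (not part of name_change.py or process_scenarios.py), and the pattern assumes that a codename in a file name or .ini line is delimited by non-alphanumeric characters such as "_" or ".".

```python
import re


def parse_mapping(tsv_text):
    """Parse 'old<TAB>new' lines (the yq output format) into a dict."""
    pairs = (line.split("\t", 1) for line in tsv_text.splitlines() if line.strip())
    return {old.strip(): new.strip() for old, new in pairs}


def rename(text, mapping):
    """Replace each old codename (plus an optional _syst suffix) with its new codename."""
    if not mapping:
        return text
    # Try longer codenames first so a shorter codename cannot shadow a longer one.
    alternatives = "|".join(
        re.escape(old) for old in sorted(mapping, key=len, reverse=True)
    )
    # The lookarounds keep the match on whole codenames only, which also makes
    # the function safe to run twice (QueenCharlotteFault is left alone).
    pattern = re.compile(rf"(?<![A-Za-z0-9])({alternatives})(_syst)?(?![A-Za-z0-9])")
    return pattern.sub(lambda match: mapping[match.group(1)], text)
```

Because the same function works on any string, it could be applied both to file names (to build git mv commands) and to the contents of the .ini files, which is what the sed command and name_change.py do separately above.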

3. Generate outputs and summaries for GitHub Pages?

Goals

Generate the necessary summary, map preview, etc. for:

  1. Data and Download (?) page for earthquake-scenarios, see https://opendrr.github.io/earthquake-scenarios/en/index.html
  2. For data needed by stack build (TODO: which files exactly?)
  3. For RiskProfiler at https://www.riskprofiler.ca, e.g. FINISHED/scenarios.csv (TODO: check to see how HabitatSeven is using that file, and whether there are other files.)

Steps

  1. Run ../scripts/TakeSnapshot.py for each new scenario, using values from the .csv filenames

    This is to generate the map preview image (.png) and summary file (.md). See commit e79aed6 for an example.

    1. cd into FINISHED directory

    2. run python3 ../scripts/TakeSnapshot.py {scenario name} {EXPO} {HAZ} {DMGb0} {DMGr1} {LOSSb0} {LOSSr1} where:

      • EXPO (b or s) from last letter of scenario .csv file names
      • HAZ from s_shakemap .csv filename
      • DMGb0 from s_dmgbyasset .csv filename following b0_
      • DMGr1 from s_dmgbyasset .csv filename following r1_
      • LOSSb0 from s_lossesbyasset .csv filename following b0_
      • LOSSr1 from s_lossesbyasset .csv filename following r1_

      For example:

      ../scripts/TakeSnapshot.py ACM4p9_GeorgiaStraitFault b 1 2 3 4 5
      ../scripts/TakeSnapshot.py ACM7p4_DenaliFault b 181 182 183 184 185
      ../scripts/TakeSnapshot.py SCM5p5_ConstanceBay b 118 119 120 121 122
      

      Update: As of 2023-04-20, there is now a faster and more automatic method using ../scripts/run-TakeShapshot. Edit the scenarios array therein, and run the script, and commit the new or updated FINISHED/*.{md,png} files.

  2. Run ../scripts/FinishedMap.sh from FINISHED directory

    which updates FinishedScenarios{,Fr}.geojson, FinishedScenarios.md and scenarios.csv files.

  3. Add French translations for added scenarios

    The descriptions are currently translated using DeepL Translator (deepl.com) with the following adjustments:

    • Quotation marks are replaced with guillemets « »
    • U+202F NARROW NO-BREAK SPACE (NNBSP) (espace fine insécable) is used before % and », and after «.
    • Typewriter apostrophes (') are replaced with typographic apostrophes (’)
  4. Calculate or find the map extents, and fill them into appropriate documents:

    • There are 3 sets of map extents for each earthquake scenario:
      • sauid extent
      • csd extent
      • 5km extent
    • Will (@wkhchow) often calculates them and saves them in the Google Docs named Earthquake_Scenario_Extents
    • Update https://github.com/OpenDRR/riskprofiler/wiki/RiskProfiler-Datasets (started by Joost, updated by Drew, and updated and maintained by Damon)
      • for Anthony: cd ~/NRCan/OpenDRR/riskprofiler.wiki and git log -p RiskProfiler-Datasets.md to see past changes
    • Edit docs/_pages/{en,fr}/dsra_scenario_map.md to fix map extents
    • TODO: Create a new wiki page to document map extents in more detail.
  5. Commit GeoPackage files in FINISHED/geopackages/ directory. Don’t forget shakemap_scenario_extents.zip, which is found in Will’s OneDrive, e.g.:

    • indicators/3._earthquake_scenarios/scenario_info/shakemap_scenario_extents.zip
    • indicators/3._earthquake_scenarios/13_new_scenarios_Apr2023/scenario_info/shakemap_scenario_extents.zip

    They are automatically uploaded as release assets (.github/workflows/generate_assets.yml) when a new release is published, and made available for download from GitHub Pages.

  6. Run Jekyll locally to preview the GitHub Pages

    Anthony’s port 4000 is taken up by NoMachine, so he runs the following instead:

    bundle exec jekyll serve --trace --baseurl '' --port 4001
    
  7. Upload/push the resulting changes to GitHub. You can push to your own fork (e.g. https://anthonyfok.github.io/earthquake-scenarios/)

  8. Create a pull request

  9. After the pull request has been reviewed and merged, create/tag a new release. This will trigger the upload of release assets which are referenced by the final GitHub Pages

  10. Check https://opendrr.github.io/earthquake-scenarios/ to make sure everything is OK.

    • Check if the GeoPackage files are downloadable (make sure the release version, say, v1.4.4, is in the URL)
    • Click on each map and check if it is showing the correct location (map extent) with correct shakemap data (vector tiles)
    • TODO: more checks?
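
For step 1 of the Steps above, the arguments to TakeSnapshot.py can be read off the FINISHED/ file names rather than typed by hand. The following is a minimal sketch, not the logic of run-TakeShapshot: snapshot_args is a hypothetical helper, and the file name patterns are assumed from the file listing in section 2 (after _syst has been removed).

```python
import re


def snapshot_args(scenario, filenames):
    """Return [scenario, EXPO, HAZ, DMGb0, DMGr1, LOSSb0, LOSSr1] from file names."""
    name = re.escape(scenario)
    hazard = re.compile(rf"^s_shakemap_{name}_(\d+)\.csv$")
    calc = re.compile(rf"^s_(dmgbyasset|lossesbyasset)_{name}_(b0|r1)_(\d+)_([bs])\.csv$")
    found = {}
    for filename in filenames:
        if (m := hazard.match(filename)):
            found["HAZ"] = m.group(1)
        elif (m := calc.match(filename)):
            kind = "DMG" if m.group(1) == "dmgbyasset" else "LOSS"
            found[kind + m.group(2)] = m.group(3)  # e.g. DMGb0, LOSSr1
            found["EXPO"] = m.group(4)             # last letter: b or s
    order = ["EXPO", "HAZ", "DMGb0", "DMGr1", "LOSSb0", "LOSSr1"]
    missing = [key for key in order if key not in found]
    if missing:
        raise ValueError(f"{scenario}: could not determine {missing}")
    return [scenario] + [found[key] for key in order]
```

The returned list can be appended to ["python3", "../scripts/TakeSnapshot.py"] and passed to subprocess.run from within the FINISHED directory. Files that match neither pattern, such as s_consequences_*.csv, are ignored.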

4. “Merge” Pull Request, upload release assets, check GitHub Pages

  • Create/update CHANGELOG.md especially when there are renames that would introduce breaking changes.
  • “Merge” pull request
    • Anthony personally prefers doing it from the command-line to avoid merge bubbles, e.g. git push origin feature-branch:master
  • Draft a new release, creating a new tag (e.g. v1.2.1), and publish it.
  • Check to make sure the release assets are uploaded correctly by the “Upload release assets” GitHub Actions workflow.
  • Check number of assets
    • e.g. Why are there 197 assets for v1.2.0 and v1.2.1?
      • 9 scenarios, each normally with 7 CSV files and 14 zipped GeoPackage files: 9×(7+14) = 189
      • 2 extra (test?) ShakeMap MMI (Modified Mercalli Intensity) CSV files, see scripts/convert_to_MMI.py:
        • s_shakemap_ACM7p0_GeorgiaStraitFault_124_MMI.csv
        • s_shakemap_ACM7p3_LeechRiverFullFault_107_MMI.csv
      • 2 Excel spreadsheet files:
        • dsra_attributes_en.xlsx
        • dsra_attributes_fr.xlsx
      • 1 extra CSV file: scenarios.csv
      • 1 extra ZIP file: shakemap_scenario_extents.zip
      • 2 Source code archives (.tar.gz, .zip)
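
The asset tally above can be written as a small calculation, which is handy for predicting the asset count of a future release. This is only a sketch of the arithmetic in the list above; the dictionary keys are descriptive labels, and the extras are assumed to stay the same from release to release.

```python
# Per-scenario assets and one-off extras, as itemized in the list above.
PER_SCENARIO = {"CSV files": 7, "zipped GeoPackage files": 14}
EXTRAS = {
    "ShakeMap MMI CSV files": 2,
    "Excel attribute spreadsheets": 2,
    "scenarios.csv": 1,
    "shakemap_scenario_extents.zip": 1,
    "source code archives": 2,
}


def expected_assets(n_scenarios):
    """Expected number of release assets for a given number of scenarios."""
    return n_scenarios * sum(PER_SCENARIO.values()) + sum(EXTRAS.values())


print(expected_assets(9))  # 9 * 21 + 8 = 197, matching v1.2.0 and v1.2.1
```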

5. Do the “Stack build” at OpenDRR/opendrr-api

TODO

  1. python/build_exposure_ancillary.sh
  2. python/add_data.sh

Where necessary:

  • Update scripts in OpenDRR/model-factory
    • Create/modify SQL and Python scripts in the OpenDRR/model-factory repository where necessary for the new DSRA scenarios.
  • Update/generate pygeoapi config file

7. Import to Elasticsearch

TODO

This is part of the “stack build”

8. Export Elasticsearch indices using Elasticdump

TODO

9. Generate GeoPackage files

Currently, the GeoPackages are generated after a successful stack build, in a Windows environment, using a batch file. The batch file uses ogr2ogr to convert the tables from the PostgreSQL database into GeoPackages, and 7-Zip to compress the GeoPackages afterward, so both must be installed on the system. The GeoPackages are named after the corresponding PostgreSQL tables.

A sample .bat script for the latest 13 scenario DSRA outputs can be found here.

Sample snippet from the script above for indicators_s GeoPackages:

REM Geopackage dsra, _indicators_s
FOR %%x IN (dsra_acm4p9_vedderfault, ^
dsra_acm5p0_georgiastraitfault, ^
dsra_acm5p0_mysterylake, ^
dsra_acm5p2_beaufortfault, ^
dsra_acm5p2_vedderfault, ^
dsra_acm5p5_southeypoint, ^
dsra_acm5p7_southeypoint, ^
dsra_acm7p7_queencharlottefault, ^
dsra_acm8p0_queencharlottefault, ^
dsra_scm5p0_burlingtontorontostructuralzone, ^
dsra_scm5p0_rougebeach, ^
dsra_scm5p6_gloucesterfault, ^
dsra_scm5p9_millesilesfault) DO ogr2ogr -f "gpkg" 	D:\Workspace\data\view_outputs\all_indicators\earthquake_scenarios\%%x_indicators_s.gpkg PG:"host=localhost user=postgres dbname=opendrr password=admin port=5432" -sql "SELECT * FROM results_%%x.%%x_indicators_s" -nln %%x_indicators_s

REM earthquake scenarios
CD /D "D:\Workspace\data\view_outputs\all_indicators\earthquake_scenarios\" && FOR %%i IN (*.*)	DO 7z.exe a "%%~ni.zip" "%%i"
DEL *.gpkg

The user can adapt the script by changing which DSRA scenarios are exported ({eqScenario}), where the GeoPackages are written ({Directory Path}, a full path including the drive letter), and the database connection details (user=postgres dbname=opendrr password=admin port=5432):

REM Geopackage dsra, _indicators_s
FOR %%x IN ({eqScenario}) DO ogr2ogr -f "gpkg" {Directory Path}\%%x_indicators_s.gpkg PG:"host=localhost user=postgres dbname=opendrr password=admin port=5432" -sql "SELECT * FROM results_%%x.%%x_indicators_s" -nln %%x_indicators_s

REM earthquake scenarios
REM CD /D changes the current drive as well as the directory
CD /D "{Directory Path}\" && FOR %%i IN (*.*) DO 7z.exe a "%%~ni.zip" "%%i"
DEL *.gpkg

In the script, the following outputs are generated per scenario:

  • {eqScenario}_indicators_b
    • building level indicators
  • {eqScenario}_indicators_s
    • sauid level indicators
  • {eqScenario}_indicators_csd
    • census subdivision level indicators
  • {eqScenario}_indicators_shakemap
    • shakemap points
  • {eqScenario}_indicators_shakemap_hexgrid_1km
    • shakemap point max values attributed to a EPSG 3857 1km hexgrid with shorelines clipped
  • {eqScenario}_indicators_shakemap_hexgrid_1km_uc
    • shakemap point max values attributed to a EPSG 3857 1km hexgrid with shorelines unclipped
  • {eqScenario}_indicators_shakemap_hexgrid_5km
    • shakemap point max values attributed to a EPSG 3857 5km hexgrid with shorelines clipped
  • {eqScenario}_indicators_shakemap_hexgrid_5km_uc
    • shakemap point max values attributed to a EPSG 3857 5km hexgrid with shorelines unclipped
  • {eqScenario}_indicators_shakemap_hexgrid_10km
    • shakemap point max values attributed to a EPSG 3857 10km hexgrid with shorelines clipped
  • {eqScenario}_indicators_shakemap_hexgrid_10km_uc
    • shakemap point max values attributed to a EPSG 3857 10km hexgrid with shorelines unclipped
  • {eqScenario}_indicators_shakemap_hexgrid_25km
    • shakemap point max values attributed to a EPSG 3857 25km hexgrid with shorelines clipped
  • {eqScenario}_indicators_shakemap_hexgrid_25km_uc
    • shakemap point max values attributed to a EPSG 3857 25km hexgrid with shorelines unclipped
  • {eqScenario}_indicators_shakemap_hexgrid_50km_uc
    • shakemap point max values attributed to a EPSG 3857 50km hexgrid with shorelines unclipped
  • {eqScenario}_indicators_shakemap_hexgrid_100km_uc
    • shakemap point max values attributed to a EPSG 3857 100km hexgrid with shorelines unclipped
  • shakemap_scenario_extents
    • table showing extents (polygon) of each scenario in the PostgreSQL database
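
The per-scenario outputs listed above follow a regular naming pattern, so the ogr2ogr calls can be generated rather than written out by hand. The following is a minimal sketch and not the batch file itself: layer_suffixes and ogr2ogr_command are hypothetical helpers, and it assumes that every table lives in the results_{eqScenario} schema, which the sample script only shows for indicators_s.

```python
def layer_suffixes():
    """The 14 per-scenario GeoPackage suffixes listed above."""
    suffixes = ["indicators_b", "indicators_s", "indicators_csd", "indicators_shakemap"]
    for size in ("1km", "5km", "10km", "25km"):
        suffixes.append(f"indicators_shakemap_hexgrid_{size}")      # shorelines clipped
        suffixes.append(f"indicators_shakemap_hexgrid_{size}_uc")   # unclipped
    for size in ("50km", "100km"):
        suffixes.append(f"indicators_shakemap_hexgrid_{size}_uc")   # unclipped only
    return suffixes


def ogr2ogr_command(scenario, suffix, out_dir, pg_conn):
    """Build the ogr2ogr argument list for one table, e.g. for subprocess.run."""
    table = f"{scenario}_{suffix}"
    return [
        "ogr2ogr", "-f", "GPKG",
        f"{out_dir}/{table}.gpkg",
        f"PG:{pg_conn}",
        # Assumption: all tables are in the results_<scenario> schema.
        "-sql", f"SELECT * FROM results_{scenario}.{table}",
        "-nln", table,
    ]
```

For example, ogr2ogr_command("dsra_acm4p9_vedderfault", "indicators_s", out_dir, pg_conn) reproduces the arguments used in the sample snippet, with pg_conn holding the "host=... user=... dbname=..." connection string. The 14 suffixes agree with the 14 zipped GeoPackage files per scenario counted in section 4; shakemap_scenario_extents is a single shared table and is not included here.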

10. Generate vector tiles

The vector tiles are generated from the GeoPackages using GeoServer with the Vector Tiles extension (see the steps below).

For RiskProfiler purposes only the following GeoPackages are required to create the vector tiles:

  • {eqScenario}_indicators_s
  • {eqScenario}_indicators_csd
  • {eqScenario}_indicators_shakemap_hexgrid_1km
  • {eqScenario}_indicators_shakemap_hexgrid_5km
  • {eqScenario}_indicators_shakemap_hexgrid_25km

Each GeoPackage needs 2 sets of vector tiles, one in EPSG 4326 and one in EPSG 900913, so each {eqScenario} requires 5 × 2 = 10 vector tile datasets in total.
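
The ten datasets per scenario can be enumerated as a checklist of seeding jobs. This is a sketch only: tile_jobs is a hypothetical helper, and the zoom stops are taken from the Notes further down this page (14 for _s, _csd, _1km and _5km; 12 for _25km).

```python
from itertools import product

# GeoPackage suffix -> zoom stop used when seeding (values from the Notes below).
TILE_LAYERS = {
    "indicators_s": 14,
    "indicators_csd": 14,
    "indicators_shakemap_hexgrid_1km": 14,
    "indicators_shakemap_hexgrid_5km": 14,
    "indicators_shakemap_hexgrid_25km": 12,
}
GRIDSETS = ["EPSG:4326", "EPSG:900913"]


def tile_jobs(scenario):
    """Return (layer name, gridset, zoom stop) for every tile set a scenario needs."""
    return [
        (f"{scenario}_{suffix}", gridset, TILE_LAYERS[suffix])
        for suffix, gridset in product(TILE_LAYERS, GRIDSETS)
    ]


for job in tile_jobs("dsra_acm5p0_mysterylake"):
    print(job)
```

Ticking off each printed tuple while working through the GeoServer steps helps confirm that no layer and gridset combination was missed.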

Steps for running GeoServer

  1. Install GeoServer and Vector Tiles extension (see Notes).

  2. Start GeoServer and open the GeoServer Web Portal. Log in to GeoServer (default credentials are admin/geoserver).

  3. In Tile Caching section => Blobstores => Add new

    • Type of BlobStore: File BlobStore, Identifier: opendrr
    • Make sure Enabled/Default are checked
    • Base Directory: whichever directory you want the tiles to be stored in, e.g. C:\GeoServer_Tiles\
    • Tiles directory layout: SLIPPY
  4. In Data section => Stores => Add new Store

    • Vector Data Sources => GeoPackage
    • Make sure Workspace is set to the default BlobStore created in step 3. In Connection Parameters, browse to and select the GeoPackage. Under Data Source Name, enter the name of the GeoPackage, e.g. {eqScenario}_indicators_s, and save.
  5. In New Layer, hit Publish on the Layer name that was created in step 4. Scroll down to Bounding Boxes, click Compute from data and Compute from native bounds to get the bounding box extents. Scroll back up to the top and click on Tile Caching tab, ensure metatiling factors are set to 1 tiles wide by 1 tiles high. In Tile Image Format section, check application/vnd.mapbox-vector-tile and uncheck image/jpeg and image/png. Leave all other items default and click Save.

  6. In Tile Caching section => Tile Layers, select Seed/Truncate for the layer you want to generate tiles for

    • Number of tasks to use: Select the number of threads you want this process to use (see Notes)
    • EPSG: 4326
    • Format: application/vnd.mapbox-vector-tile
    • Zoom stop: (see notes)
    • Repeat the step for EPSG 900913
  7. Repeat steps 4 to 6 for all the GeoPackages.

  8. Once complete, all the tiles will be located in the default path specified in step 3 i.e. C:\GeoServer_Tiles. The tiles can then be compressed and uploaded.

  9. Check with Anthony on the steps after.

Notes

  • Step 1

    • The extension installation instructions say to copy all extracted contents to WEB-INF/lib, which can be hard to find. The default path on Windows is C:\Program Files\GeoServer\webapps\geoserver\WEB-INF\lib

  • Step 6

    • Generating the vector tiles requires writing millions of files and, depending on the size of the scenario, can take hours. A faster computer with adequate CPU, RAM, and storage is recommended. Currently the tiles are generated on an AWS EC2 instance (c6i.4xlarge tier: 16 vCPUs, 32 GB RAM, with 200 GB storage).
    • Intel CPUs have better performance than AMD
    • GeoServer supports up to 16 threads at a time, e.g. you can run two processes with 8 threads each and both will run. If you add another process, it will queue until one of the running processes completes.
    • Zoom stop levels for each GeoPackage:
      • _s, _csd, _1km, _5km = Zoom stop: 14
      • _25km = Zoom stop: 12
  • General

    • While the tiles are being generated, the GeoServer log file (default location C:\Program Files\GeoServer\logs) grows quite large. It can be safely deleted once all the tiles have finished processing; the log file is recreated the next time GeoServer starts.
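To get a feel for why Step 6 writes so many files, the number of tiles touched by a scenario's bounding box can be estimated before seeding. The sketch below assumes the standard global EPSG:4326 grid, where zoom 0 is 2 tiles wide by 1 tile high and each further zoom level halves the tile size. It counts grid cells intersecting the box, so it is an upper-bound estimate rather than an exact file count, and the bounding box in the usage example is made up.

```python
import math


def tiles_in_bbox_4326(min_lon, min_lat, max_lon, max_lat, zoom):
    """Count EPSG:4326 grid tiles intersecting a bounding box at one zoom."""
    size = 180.0 / (2 ** zoom)  # tile edge in degrees (zoom 0: 180 degrees)
    cols = (math.floor((max_lon + 180.0) / size)
            - math.floor((min_lon + 180.0) / size) + 1)
    rows = (math.floor((max_lat + 90.0) / size)
            - math.floor((min_lat + 90.0) / size) + 1)
    return cols * rows


def total_tiles(bbox, zoom_stop):
    """Sum the per-zoom counts from zoom 0 up to and including zoom_stop."""
    return sum(tiles_in_bbox_4326(*bbox, zoom) for zoom in range(zoom_stop + 1))


if __name__ == "__main__":
    bbox = (-124.0, 48.0, -122.0, 50.0)  # hypothetical 2 x 2 degree extent
    for stop in (12, 14):                # the zoom stops listed in the notes
        print(f"zoom stop {stop}: about {total_tiles(bbox, stop)} tiles")
```

Each extra zoom level roughly quadruples the count, which is why the coarser 25 km grid can stop two levels earlier at much lower cost.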

11. Add records to FGP/open.canada.ca

Every generated scenario needs a metadata record on the FGP (Federal Geospatial Platform), which is internal to the GoC. Once approved by a reviewer, the record is released to the public and becomes searchable through the Open Canada and GEO.ca sites.

Steps:

  • Log in to FGP or create an account if you don't have one.
  • In the search bar, enter keywords such as 'Canada's National Earthquake Scenario Catalogue' and click on the first record. This will take you to the scenario catalog page, where each earthquake scenario is linked.
  • Choose any scenario to load into the record. For convenience, it's best to use an existing scenario, save it as a new record, and then edit the new record to add the details of the new scenario. To do this, select 'Save as' from the dropdown menu on the right and click Apply. Edit the record with the new scenario details in English/French where applicable.
    • Obtain Geographic Bounding Box coordinates by loading the sauid indicators in QGIS and identifying the extents.
    • For thumbnails, take a snapshot of the scenario from the ArcGIS online map viewer once the ESRI REST services are set up and available for the scenario.
    • Update Map and Data Resources as needed, similar to other scenarios. This usually includes ESRI REST, WMS, GeoPackage (points, polygons, shakemap), data dictionary, Openfile, and GitHub Repository links.
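As a cross-check on the extents read in QGIS, the bounding box recorded inside a GeoPackage can also be read directly, since a GeoPackage is an SQLite database whose `gpkg_contents` table stores `min_x`, `min_y`, `max_x`, `max_y`, and `srs_id` for each layer. This is a minimal sketch using only the Python standard library. Note two assumptions: the stored extent is in the layer's own coordinate system (it is only a geographic bounding box if `srs_id` is 4326), and the writing software may not have kept it up to date. The file name in the usage example is hypothetical.

```python
import sqlite3
from contextlib import closing


def gpkg_layer_extents(gpkg_path):
    """Return {table_name: (min_x, min_y, max_x, max_y, srs_id)}.

    Only vector ('features') layers listed in gpkg_contents are returned.
    """
    query = (
        "SELECT table_name, min_x, min_y, max_x, max_y, srs_id "
        "FROM gpkg_contents WHERE data_type = 'features'"
    )
    with closing(sqlite3.connect(gpkg_path)) as conn:
        rows = conn.execute(query).fetchall()
    return {row[0]: tuple(row[1:]) for row in rows}


if __name__ == "__main__":
    # Hypothetical file name, for illustration only.
    for name, extent in gpkg_layer_extents("example_indicators_s.gpkg").items():
        print(name, extent)
```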

ESRI REST services need to be set up by individuals at FGP, requiring some preparation before initiating the process. Templates have been created and are available on GCDocs to assist with the preparation.

What's included:

  • dsra_prep.tbx:

    • FGP prefers data in file geodatabase format, projected to their standard (NAD_1983_Canada_Atlas_Lambert). This tool converts the GeoPackage outputs (shakemap hexgrid at 1km, 5km, and 25km) into a new file geodatabase and reprojects the data. Follow the toolbox instructions for successful execution.
  • dsra_en_10p8_template.mxd and dsra_fr_10p8_template.mxd:

    • ArcMap 10.8 scenario templates with all necessary layers and pre-defined symbology. Replace the placeholders in layer and description names with the correct magnitude (M{#}), scenario name ({Scenario}), and province/territory ({P/T}). The French MXD contains the official French translations.
    • The layers' data sources need to be repaired and pointed to the correct source. Once done, save the map documents in 10.7 format, as FGP operates on ArcMap 10.7. (File -> Save A Copy -> Save as type -> ArcMap 10.7 Document)
  • Scenario Shakemap Hexbin 1km, 5km, and 25km layer files:

    • These contain symbology for the shakemap and are preloaded into the MXD templates.

Once all preparations are complete, upload the 10.7 MXDs (en, fr) and associated data to SharePoint (internal), then contact FGP to initiate the process.

12. Add the new DSRA scenarios to WordPress (by HabitatSeven initially)

TODO

13. Deploy to AWS for RiskProfiler, Kibana dashboard, etc. (Arash)

TODO

See "Riskprofiler Deployment Process - v2.docx" by Arash Malekzadeh (GCDocs)


Questions/TODO

  1. What are all the different types of GeoPackage files, where are they, and what are they for?

    • OpenDRR/boundaries: *.7z, hexgrid_3857/*.7z, hexgrid_4326/*.7z
    • OpenDRR/national-human-settlement: physical-exposure/data/*.zip, social-fabric/data/*.zip
    • OpenDRR/seismic-risk-model: data/national/psra_*.zip, data/province/??/psra_*.zip
    • OpenDRR/earthquake-scenarios: FINISHED/geopackages/*.zip

    In Will’s “indicators” backup:

    • 1._national_human_settlement_layers
    • 2._seismic_risk
    • 3._earthquake_scenario_risk
  2. Measure how much RAM is needed by 7-Zip to decompress some of the larger *.7z files. (Anthony’s computer started swapping heavily when trying to test)

  3. List actual commits as examples.
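For question 2, one possible way to measure peak memory is to run the decompression as a child process and read its peak resident set size afterwards. This is only a sketch under several assumptions: it relies on Python's Unix-only `resource` module, `ru_maxrss` is reported in kilobytes on Linux but in bytes on macOS, and the value is the high-water mark across all child processes waited for so far, so it is best run in a fresh interpreter. The archive name in the usage comment is hypothetical.

```python
import resource
import subprocess
import sys


def peak_child_rss(command):
    """Run a command to completion and return ru_maxrss for child processes.

    The unit is platform-dependent (kilobytes on Linux, bytes on macOS).
    """
    subprocess.run(command, check=True, stdout=subprocess.DEVNULL)
    return resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss


if __name__ == "__main__":
    # Harmless stand-in command so the sketch runs anywhere on Unix.
    print(peak_child_rss([sys.executable, "-c", "pass"]))
    # For a real measurement, something like the following could be used;
    # "7z t" tests (decompresses) an archive without writing files:
    # print(peak_child_rss(["7z", "t", "example.7z"]))
```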

Why this document

This idea was first proposed as an OpenDRR administrative task during Sprint 59 on May 30, 2022, as we prepare for an anticipated tenfold increase in the number of DSRA scenarios.

References