This template is a scaffold for an EpiModelHIV applied project. Before reading on, make sure you have read the getting started wiki page and completed all the relevant steps.
From now on, we refer to the applied project as applied_proj and to your custom EpiModelHIV-p branch as EpiModelHIV-p@applied_proj.
At this point, we assume that you have your applied project cloned on your local computer and your EpiModelHIV-p branch checked out as well.
This template is divided into several steps, each in its own sub-directory under the R/ folder. Each sub-directory contains a README.md file. We describe each step below.
Each script in this project is structured in the same way to simplify reading. Because these applied projects are fairly complex, you are advised to read the code and make sure you understand what it does before running it. Comments are often present to guide you along the way.
By top level scripts we mean the scripts that are executed directly by the user.
They are all the scripts whose names start with a number (e.g. 1-estimation.R) or with workflow (e.g. workflow-networks.R).
These scripts are meant to be run in a clean R session. It is advised to restart R before running them. This can be done in RStudio by pressing Ctrl+Shift+F10 on Windows or Cmd+Shift+0 on macOS. (Note: .rs.restartR() is NOT the same at all.)
All other scripts are utilities. They provide variables or functions to the top level ones and should not be run on their own.
Each step contains a z-context.R file. It sets specific parameters differently depending on the execution context.
The two possible contexts are local and hpc. Local means your own computer and hpc is the High Performance Computing cluster where you will run your large scale simulations.
You will probably not need to edit these files.
Simply know that the context switching is done by setting the following variable before these scripts are sourced:
hpc_context <- TRUE
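The switch described above can be sketched as follows. This is only an illustration of the pattern, not the content of the actual z-context.R files: the function `resolve_context()` and the `ncores` values are hypothetical, and only the `hpc_context` variable comes from this template.

```r
# Hypothetical helper: pick settings based on whether `hpc_context`
# was set to TRUE before this code runs. The returned values are
# placeholders chosen for illustration.
resolve_context <- function(env = globalenv()) {
  is_hpc <- exists("hpc_context", envir = env, inherits = FALSE) &&
    isTRUE(get("hpc_context", envir = env))

  if (is_hpc) {
    list(context = "hpc", ncores = 32)
  } else {
    list(context = "local", ncores = 2)
  }
}

# Usage: without `hpc_context` defined, the local settings are used
demo_env <- new.env()
local_settings <- resolve_context(demo_env)

# After setting the variable, the hpc settings are used
assign("hpc_context", TRUE, envir = demo_env)
hpc_settings <- resolve_context(demo_env)
```

Defaulting to the local context when the variable is absent means a script sourced without any preparation never assumes it is on the cluster.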
When creating a new top level script you should adhere to the global structure used all over this repo. These projects of ours are very complex, with many moving pieces. Keeping them as clean as possible helps a lot in not getting lost.
The data used and produced by the project are stored in the data directory.
It contains 3 sub-directories:
- input/: data required to build the project from scratch.
- output/: data used for publication (tables, plots).
- run/: temporary data produced by the project that become useless once the paper is published (raw simulations, estimated models).
Only the data/input/ directory is managed with git. The rest is ignored and should never end up on GitHub.
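As a rough sketch of this layout, the code below builds the three sub-directories in a temporary location and writes ignore rules for the two that are not tracked. The `proj_root` path and the exact .gitignore entries are assumptions made for illustration; the template's own ignore file may be written differently.

```r
# Illustrative only: recreate the data/ layout in a temporary directory
proj_root <- file.path(tempdir(), "applied_proj")
data_dirs <- file.path(proj_root, "data", c("input", "output", "run"))

for (d in data_dirs) {
  dir.create(d, recursive = TRUE, showWarnings = FALSE)
}

# Hypothetical ignore rules: keep data/input/ in git, ignore the rest
writeLines(
  c("data/output/", "data/run/"),
  file.path(proj_root, ".gitignore")
)

# List the sub-directories that were created
list.files(file.path(proj_root, "data"))
```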
First off, some scripts will need a few modifications to fit your project.
This file contains the generic configuration for the project. It is sourced by every top level script.
Open it now and modify the following variables to fit your project:
EMHIVp_branch <- "applied_proj" # the name of your project
EMHIVp_dir <- "~/../Desktop/GitHub/EpiModelHIV-p" # local clone of EpiModelHIV-p
time_unit <- 7 # number of days in a time step (7 for weekly)
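To make the meaning of `time_unit` concrete, here is a small sketch that converts durations in years into model time steps. The helper `years_to_steps()` and the use of 365 days per year are assumptions for illustration and are not taken from the template.

```r
time_unit <- 7  # days per time step, as in the settings above

# Hypothetical helper: number of time steps covering a number of years,
# assuming 365 days per year
steps_per_year <- 365 / time_unit
years_to_steps <- function(years) {
  round(years * steps_per_year)
}

one_year <- years_to_steps(1)    # 52 weekly steps
ten_years <- years_to_steps(10)  # 521 weekly steps
```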
This script contains the configuration for running things on the HPC.
Open it now and modify the mail_user variable to reflect your own e-mail address. It will be used to notify you of the progress of your jobs on the HPC.
The current_git_branch variable should be left as main most of the time. Modify it only if you created a new branch and want to run code from it on the HPC.
If you are working with the RSPH HPC, leave the rest unchanged. Otherwise modify the code accordingly.
This script defines the default settings for netsim. The default values are probably correct. Modify the init variable to be init <- init_msm() if you need to disable the STIs in your model.
This file is for you to test code without making another script dirty. As a reminder, you should always write code in a file and not in the R console, even if it's just for a quick test.
Go to the 00-setup.R file.
Run the renv::init(bare = TRUE) line and restart the R session before carrying on.
Run the rest of the script. It will install all the necessary packages.
At this point you can go to the README.md file for step A-networks.
Below is a list of all the steps with a quick description.
- A-networks: Estimate and diagnose the network models.
- B-netsim_explore: Get familiar with running network models with netsim
- C-netsim_scenarios: Run network models with the scenario API
- D-restart_point: Mandatory non-interactive step
- E-intervention_explore: Get familiar with restarting network models with netsim
- F-intervention_scenarios: Run intervention scenarios and process results for publication
- Z-calibration: advanced step addressing calibration
Here is a list of commonly made mistakes to help you avoid them as you go:
As you start working with the HPC, you will create workflow directories. Do not forget to delete them locally AND on the HPC before making a new one with the same name.
On the HPC this can be done with:
rm -rf workflows/<the name of your workflow>
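The local copy can be removed from R in a similar way. The sketch below uses a throwaway directory under `tempdir()` so that it is safe to run as written; the workflow name is hypothetical, and it is an assumption that your local workflow directories live under workflows/ in the project root as they do on the HPC.

```r
# Hypothetical workflow name, for illustration only
workflow_name <- "example_workflow"

# In a real project this would likely be file.path("workflows", workflow_name)
workflow_dir <- file.path(tempdir(), "workflows", workflow_name)

# Create a dummy workflow directory so the example is self-contained
dir.create(workflow_dir, recursive = TRUE, showWarnings = FALSE)
file.create(file.path(workflow_dir, "job.sh"))

# Delete the directory and everything in it, if it exists
if (dir.exists(workflow_dir)) {
  unlink(workflow_dir, recursive = TRUE)
}

still_there <- dir.exists(workflow_dir)
```

Checking `dir.exists()` afterwards is a cheap way to confirm the old workflow is really gone before creating a new one with the same name.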