Merge pip modularization changes to "end of may work in progress" #16

Open: wants to merge 31 commits into base: jbloedow/end_of_may_wip

Commits (31):
4554daf
Converting to pip installable package.
Jun 19, 2024
b5298c1
Tidy up data files that were moved into the module src dir. Moved the…
Jun 19, 2024
7b97db0
Don't want large data files in repo; can get from idm-data.
Jun 19, 2024
8d71f4e
Added some startup checks for existence of settings files and useful …
Jun 20, 2024
1ec45fe
Added build_template_workspace script to new package and fixed issues…
Jun 26, 2024
09190d5
Adding README for pip package.
Jul 1, 2024
b4a6471
Added section to README to show all the code references for an exampl…
Jul 1, 2024
1dfd8d5
Tweaking README for pip package.
Jul 1, 2024
c2aaf1b
Removed some placeholder doc string from cli.py. Should really add mo…
Jul 2, 2024
adf8c4f
Removed some placeholder doc string from __main__.py. Should really a…
Jul 2, 2024
2eae9c5
Fixes and changes to support running in COMPS.
Jul 2, 2024
5553728
Adding script and SIF asset collection id to run a version of laser o…
Jul 3, 2024
be3e19a
Added ability to get pre-created fits_ew.npy from idm-data.
Jul 4, 2024
7ec9e1e
Got pip dev install working.
Jul 5, 2024
23c55cb
Make fits.npy file optional. User can model without eulas if they wan…
Jul 6, 2024
074cc37
1) Fixed bug in how incubators were being counted; 2) Removed current…
Jul 9, 2024
adaa9fb
Relatively significant change. Counting exposed/incubating in collect…
Jul 9, 2024
07ed7fc
Tweaked default base_infectivity and seasonal_multiplier after recali…
Jul 9, 2024
66d7d7a
Rev to 0.0.7.
Jul 10, 2024
a028a48
Added missing MANIFEST.in file
Jul 10, 2024
28c9619
Code wasn't use dont_import_after setting. And the import_cases setti…
Jul 10, 2024
0404380
Removed duplicate get_beta_samples function from sir_numpy.py. Using …
Jul 10, 2024
9ce6c12
Removed unused code from the all sql model. And an unused function in…
Jul 10, 2024
4e9c3eb
Added capabliity to set incubation_period in settings.py. This is use…
Jul 10, 2024
102ec92
Adding scripts & files to run idmlaser on COMPS.
Jul 13, 2024
a8f3e03
Some cleanup.
Jul 13, 2024
19bba72
Cleaned up some doc strings with ChatGPT.
Jul 13, 2024
f598b87
Don't treat settings.py not being in INPUT_ROOT directory as an error…
Jul 16, 2024
fe95682
Was using wrong datatype.
Jul 22, 2024
bebf4e7
Removed more currently unused code from update_ages.cpp and revved ve…
Jul 22, 2024
dd5c258
Got .so building and installing working with both wheel and develop i…
Jul 24, 2024
6 changes: 6 additions & 0 deletions jb/MANIFEST.in
@@ -0,0 +1,6 @@
include src/idmlaser/model_numpy/*.py
include src/idmlaser/model_sql/*.py
include src/idmlaser/update_ages.cpp
include src/idmlaser/makefile
include src/idmlaser/examples/QuickStart.ipynb
include src/idmlaser/schema.json
166 changes: 166 additions & 0 deletions jb/README.md
@@ -0,0 +1,166 @@
# Welcome to IDMLaser

## Installation
```
pip3 install idmlaser
```

## QuickStart Setup

```
python3 -m idmlaser.utils.build_template_workspace
```

This will prompt you to specify a path to a sandbox directory to create a new workspace.
```
Enter the sandbox directory path (default: /var/tmp/sandbox):
```
If the sandbox directory already exists, it will also ask for confirmation, because the directory is wiped clean if you consent. E.g.,
```
The directory '/var/tmp/sbox' already exists. Do you want to overwrite it? (yes/no):
```
It will then ask whether you want to run the pre-configured England & Wales (E&W) spatial scenario or a "Critical Community Size" (CCS) synthetic scenario. You will see some console output as files are copied and/or downloaded and some initialization is done. Either scenario should just work. The CCS scenario is the simplest and fastest, and is recommended for first-time users.

You can then change directory to the sandbox or workspace (use ```cd``` or ```pushd```) and run the model.

### Run Model

At the most basic level, to run the model, you do:

```
python3 -m idmlaser.measles
```

You should see some console output such as "T=<N>" for all the timesteps up to 7300 (20 years) as well as sparkline output for each timestep.

*Output*
The simulation should produce a ```simulation_output.csv``` report file. It consists of the following columns:
```
Timestep,Node,Susceptible,Infected,New_Infections,Recovered,Births
```

You can use any tool of your choice to plot the Infected, Susceptible, Recovered or New_Infections channels by Node over time. If you're running the CCS demo, there should be only 1 node (node id=0). If you're running the E&W demo, it's most instructive at first to plot output for node 507 (London) or node 99 (Birmingham).
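
As one option, here is a minimal standard-library sketch for pulling a single node's channel out of the report, assuming the column layout shown above. The helper name `channel_by_node` and the sample rows are hypothetical and not part of idmlaser.

```python
import csv
import io


def channel_by_node(csv_file, node, channel="Infected"):
    """Return (timesteps, values) for one node and one channel of the report."""
    timesteps, values = [], []
    for row in csv.DictReader(csv_file):
        if int(row["Node"]) == node:
            timesteps.append(int(row["Timestep"]))
            values.append(float(row[channel]))
    return timesteps, values


# Tiny made-up sample in the report's column layout (not real model output).
sample = io.StringIO(
    "Timestep,Node,Susceptible,Infected,New_Infections,Recovered,Births\n"
    "0,0,1000,5,0,0,0\n"
    "0,1,2000,0,0,0,0\n"
    "1,0,997,8,3,0,1\n"
)
t, infected = channel_by_node(sample, node=0)
print(t, infected)
```

For a real run, pass an open file instead of the sample, e.g. `open("simulation_output.csv", newline="")`, and hand the two returned lists to the plotting library of your choice.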

## Examples

### Example 1: England & Wales

The E&W example is the classic "Measles in England and Wales during the post-war Period" dataset. It consists of 954 locations with location-specific birth rates. The model input files are downloaded during setup and do not need to be regenerated. We put everyone over age 5 into an initial EULA bucket and model only the under 5s. The input modeled population (agents) file contains 495,644 agents.

### Example 2: CCS (1 node)

The CCS 1 node example is the simplest example since it's not spatial. The total initial population is 2.4 million, again with the EULA age set at 5. This results in 118,722 agents initially being modeled. The duration is also 20 years. Birth rate is set at 17.5 (CBR).

### Example 3: CCS (100 nodes)

This example is achieved by starting with CCS 1 node and editing the demographics_settings.py file to change the total population to something like 1e7 and set the num_nodes to 100. Then type ```make``` to regenerate the model input files.
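
A sketch of what the edited demographics_settings.py might look like for this case. The key names follow the parameters described in this README, and the filenames are assumed to mirror the R settings file in this PR; the shipped example file may differ.

```python
# demographics_settings.py (illustrative sketch, not the shipped file)

# Population and nodes used when building the population from parameters.
pop = int(1e7)                  # total population across all nodes
num_nodes = 100                 # number of spatial nodes
nodes = list(range(num_nodes))  # node ids 0..99

# EULA age threshold in years; older agents are not modeled individually.
eula_age = 5

# Filenames used when loading the population from file (assumed names).
pop_file = "modeled_pop.csv.gz"
eula_pop_fits = "fits.npy"
cbr_file = "cbrs.csv"
```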

## Workflow

- Create Input Files

Modify demographics_settings.py. Set the pop(ulation) and num_nodes, and maybe eula_age.

```
make
```

- Edit Settings

See below section on parameters in settings.py for what you can change here.

- Run Model
```
python3 -m idmlaser.measles
```


## Input Files

The main disease model will look for at least two files:
- settings.py
- demographics_settings.py

Both are simple files of Python-compatible key-value pairs. Let's start with demographics_settings.py. There is an example in examples/demographics_settings.py.
- pop_file: filename of compressed csv with all the agents to be modeled. Columns are attributes. Rows are agents.
- eula_pop_fits: filename of npy file which is the slope and intercept of the eula population over time by node.
- cbr_file: filename of csv file with the crude birth rates of each node by year.

- nodes: an array (list) with all the node ids. Could be as simple as '[0]'
- num_nodes: The total number of nodes. (Could be inferred from pop_file or even nodes.)
- eula_age: The EULA age threshold. EULA=Epidemiologically Uninteresting Light Agents. There is nobody older than this in the original modeled population. This is used as an input for the model pre-proc step.


The settings.py file consists of:
- duration: Simulation duration in days.
- base_infectivity
- seasonal_multiplier: Scalar to apply to annual seasonality multiplier curve
- infectivity_multiplier: Array of multipliers representing seasonality.

*Reporting*
- report_filename: Defaults to "simulation_output.csv"
- report_start: When to start reporting

*Burnin*
- burnin_delay: Delay from start to wait before injecting cases
- import_cases: not used
- dont_import_after: Time after which to stop importing


*Runtime Demographics*
- cbr: Crude Birth Rate. Set this to -1 to use cbr by year and node via cbr_file.
- mortality_interval: Timesteps between applying natural mortality (can remove)
- fertility_interval: Timesteps between adding babies (can probably remove)

*Migration*
- attraction_probs_file: csv file with probabilities of agent traveling from node A to node B.
- migration_fraction: Fraction of infected people to migrate to another node each migration.
- migration_interval: Days between migrations


*Interventions (experimental)*
- campaign_day: Day to launch test SIA campaign
- campaign_coverage: Coverage for test SIA campaign
- campaign_node: Node for test SIA campaign
- ria_interval: Days between RIA distributions
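
Putting the parameters above together, a settings.py might look roughly like the sketch below. All values are illustrative placeholders rather than the package defaults, and the attraction_probs_file name is hypothetical.

```python
# settings.py (illustrative values only; not the package defaults)

duration = 7300                 # simulation length in days (20 years)
base_infectivity = 0.5          # placeholder value
seasonal_multiplier = 1.2       # scalar applied to the seasonality curve
infectivity_multiplier = [1.0, 1.1, 1.2, 1.1, 1.0, 0.9, 0.8, 0.9]  # made-up curve

# Reporting
report_filename = "simulation_output.csv"
report_start = 0                # first timestep to report

# Burnin
burnin_delay = 30               # days to wait before injecting cases
dont_import_after = 3650        # stop importing cases after this day

# Runtime demographics
cbr = 17.5                      # or -1 to use per-node, per-year rates from cbr_file

# Migration
attraction_probs_file = "attraction_probs.csv"  # hypothetical filename
migration_fraction = 0.1        # fraction of infected agents that migrate
migration_interval = 7          # days between migrations
```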

## Model Behavior

### Default Model Behavior

The model behavior is essentially defined by the properties and step functions. The properties that are currently hardcoded in this package are:

```
['id', 'node', 'age', 'infected', 'infection_timer', 'incubation_timer', 'immunity', 'immunity_timer', 'expected_lifespan' ]
```

These are ultimately controlled by the code in the idmlaser.utils.create_pop_as_csv tool and appear as the columns of the modeled_pop.csv.gz files. For example, in this particular implementation, each agent has an "age", which is in units of years. The code in "update_ages.cpp" assumes the existence of an age column: most obviously the "update_ages" function itself, which ages people by 1 day each day, but other step functions in update_ages.cpp may also check the agent's age.

Each agent has an infected boolean flag and an immunity flag, each with its own countdown timer. The immunity_timer can be set to -1 for permanent immunity. A positive countdown timer is decremented by 1 each timestep, and the flag is set to False/0 when the timer reaches 0. That's not a fundamental decision of this design, just what is implemented in the code right now.
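
The flag-and-timer rule described above can be paraphrased in a few lines of Python. This is only a sketch of the stated behavior; the actual logic lives in update_ages.cpp, and the function name `step_flag_timer` is hypothetical.

```python
def step_flag_timer(flag, timer):
    """Advance one flag/timer pair by a single timestep.

    A timer of -1 means 'permanent' and is left untouched; a positive timer
    counts down by 1 and clears the flag when it reaches 0.
    """
    if timer > 0:
        timer -= 1
        if timer == 0:
            flag = False
    return flag, timer


# An agent with 2 days of immunity left, stepped twice.
state = (True, 2)
for _ in range(2):
    state = step_flag_timer(*state)
print(state)

# An agent with permanent immunity is unchanged by a step.
print(step_flag_timer(True, -1))
```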

### New Model Behavior
One can completely redesign the behavior of the model. To modify the model behavior you can:
- Add/remove properties.
- Add/remove/modify code in update_ages.cpp (and recompile).
- Add/remove/modify glue code in sir_numpy_c.

If, say, you wanted to model agents' ages but use a fixed date-of-birth (dob) and calculate age on-the-fly by comparing to "now", you would need to create the model with a dob column (and assign values during initialization and at birth) and also modify the step functions.
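
As a rough illustration of the dob approach, age could be derived like this. It assumes dob and "now" are both expressed in simulation days and that a year is 365 days; the names `age_in_years` and `DAYS_PER_YEAR` are hypothetical.

```python
DAYS_PER_YEAR = 365.0


def age_in_years(dob_day, now_day):
    """Age in years computed on the fly from a fixed date of birth.

    Both arguments are in simulation days; dob_day is negative for agents
    born before the simulation started.
    """
    return (now_day - dob_day) / DAYS_PER_YEAR


# A newborn created on day 100, evaluated one year later on day 465.
print(age_in_years(dob_day=100, now_day=465))

# An agent who is 3 years old at t=0 would be initialized with dob = -3 * 365.
print(age_in_years(dob_day=-3 * 365, now_day=0))
```

With this design no per-timestep aging pass is needed, at the cost of a subtraction wherever a step function checks an agent's age.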

I have made no attempt up to this point to create an infrastructure that is 100% agnostic or dynamic on model attributes (columns). Making the code more generic and abstracted will also make it a bit more complex. "There are no solutions, only tradeoffs".

Let's consider all the places that the code currently "knows" that there is an "age" column, i.e., where it's hardcoded and would need to be changed if age was done differently:

Init:
- [Model dataframe initialization](https://github.com/InstituteforDiseaseModeling/laser/blob/jb_modulify/jb/src/idmlaser/utils/create_pop_as_csv.py#L27)
- [Loading model dataframe into np array](https://github.com/InstituteforDiseaseModeling/laser/blob/jb_modulify/jb/src/idmlaser/sir_numpy.py#L49)
- [Adding 'expansion slots'](https://github.com/InstituteforDiseaseModeling/laser/blob/jb_modulify/jb/src/idmlaser/sir_numpy_c.py#L200)

Stepwise:
- Age everyone (already born): [py](https://github.com/InstituteforDiseaseModeling/laser/blob/jb_modulify/jb/src/idmlaser/sir_numpy_c.py#L300) and [C](https://github.com/InstituteforDiseaseModeling/laser/blob/jb_modulify/jb/src/idmlaser/update_ages.cpp#L55)
- Make newborns: [py](https://github.com/InstituteforDiseaseModeling/laser/blob/jb_modulify/jb/src/idmlaser/sir_numpy_c.py#L330) and [C](https://github.com/InstituteforDiseaseModeling/laser/blob/jb_modulify/jb/src/idmlaser/update_ages.cpp#L529)
- RIA: [py](https://github.com/InstituteforDiseaseModeling/laser/blob/jb_modulify/jb/src/idmlaser/sir_numpy_c.py#L524) and [C](https://github.com/InstituteforDiseaseModeling/laser/blob/jb_modulify/jb/src/idmlaser/update_ages.cpp#L473)
- SIA: [py](https://github.com/InstituteforDiseaseModeling/laser/blob/jb_modulify/jb/src/idmlaser/sir_numpy_c.py#L537) and [C](https://github.com/InstituteforDiseaseModeling/laser/blob/jb_modulify/jb/src/idmlaser/update_ages.cpp#L439)
- Collect Report: [py](https://github.com/InstituteforDiseaseModeling/laser/blob/jb_modulify/jb/src/idmlaser/sir_numpy_c.py#L264) and [C](https://github.com/InstituteforDiseaseModeling/laser/blob/jb_modulify/jb/src/idmlaser/update_ages.cpp#L295)

Each of the "stepwise" functions also has an argtype declaration at the top of sir_numpy_c.py which is aware of the age column.

I shall not repeat that for each of the other attributes/properties (e.g., infected, incubation_timer).
##
30 changes: 30 additions & 0 deletions jb/R_scripts/client.R
@@ -0,0 +1,30 @@
library(httr)
library(jsonlite)

# Define the URL and the payload
url <- "http://172.18.0.2:5000/submit"
payload <- list(
  base_infectivity = 0.5,
  migration_fraction = 0.1,
  seasonal_multiplier = 1.2,
  duration = 2
)

# Convert the payload to JSON
json_payload <- toJSON(payload, auto_unbox = TRUE)

# Perform the POST request
response <- POST(
  url,
  add_headers("Content-Type" = "application/json"),
  body = json_payload,
  encode = "json"
)

# Check the response
if (status_code(response) == 200) {
  print(content(response, "text"))
} else {
  print(paste("Request failed with status:", status_code(response)))
}

File renamed without changes.
60 changes: 60 additions & 0 deletions jb/R_scripts/create_pop_as_csv_lean.R
@@ -0,0 +1,60 @@
library(RSQLite)
library(dplyr)
library(data.table)

settings <- new.env()
source("demographics_settings.R", local = settings)
pop_below_eula <- as.integer(0.05*settings$pop)

# Print the number of agents below the EULA age (5% of settings$pop)
print(pop_below_eula)

source( "utils/get_rand_lifespan.R" )
get_rand_lifespan()

#node_assignments <- sample(0:(settings$num_nodes - 1), settings$pop, replace = TRUE)

expected_lifespans <- numeric(pop_below_eula)

# Populate the expected lifespans vector using get_rand_lifespan() for each agent.
# seq_len() avoids iterating over c(1, 0) when pop_below_eula is 0.
for (i in seq_len(pop_below_eula)) {
  expected_lifespans[i] <- get_rand_lifespan()
}

get_node_ids <- function() {
  # Generate the array based on the specified conditions:
  # each node id is repeated (id + 1) times, so higher ids get more agents
  node_list <- lapply(settings$nodes, function(node) {
    rep(node, node + 1)
  })

  array <- unlist(node_list)

  # Repeat the array to match the population size
  repeats <- ceiling(pop_below_eula / length(array))
  array <- rep(array, times = repeats)[1:pop_below_eula]

  # Shuffle the array to randomize the order
  array <- sample(array)

  # Convert the array to integers
  array <- as.integer(array)

  # Print the first few elements as an example
  # print(head(array, 20))

  return(array)
}

individual_data <- data.frame(
  node = get_node_ids(), # sample(0:100, pop_below_eula, replace = TRUE), # rep(node_assignments, each = pop_below_eula),
  age = runif(pop_below_eula, min = 0, max = settings$eula_age) + runif(pop_below_eula, min = 0, max = 365) / 365, # runif(pop_below_eula, min = 0, max = 100),
  infected = rep(FALSE, pop_below_eula),
  infection_timer = rep(0, pop_below_eula),
  incubation_timer = rep(0, pop_below_eula),
  immunity = rep(FALSE, pop_below_eula),
  immunity_timer = rep(0, pop_below_eula),
  expected_lifespan = expected_lifespans
)
14 changes: 14 additions & 0 deletions jb/R_scripts/demographics_settings.R
@@ -0,0 +1,14 @@
# Population and Nodes if building population from parameters
pop <- as.integer(2.4e6) + 1
num_nodes <- 2
nodes <- seq(0, num_nodes - 1)

# Epidemiologically Uninteresting Light Agents (EULA) Age
eula_age <- 5

# Filenames if loading population from file
pop_file <- "modeled_pop.csv.gz"
eula_file <- "eula_binned.csv"
eula_pop_fits <- "fits.npy"
cbr_file <- "cbrs.csv"

File renamed without changes.
File renamed without changes.
File renamed without changes.
29 changes: 29 additions & 0 deletions jb/client/laser.def
@@ -0,0 +1,29 @@
Bootstrap: docker
From: rockylinux:9

%post
dnf -y install python3-pip
dnf -y install gcc-c++
dnf -y install sudo
dnf -y install epel-release
dnf clean all

python3 -m pip install pip --upgrade
python3 -m pip install idmlaser -i https://packages.idmod.org/api/pypi/pypi-production/simple

%runscript


%environment
export INPUT_ROOT=Assets
export HEADLESS=1

%test


%labels
Author jonathan.bloedow@gatesfoundation.org

%help
Container for running LASER prototype on COMPS
