[Feature request]: Add Output Plotting Options Of Panel Figure With Main Scenario Hub Targets #415
Labels
enhancement
Request for improvement or addition of new feature(s).
gempyor
Concerns the Python core.
medium priority
Medium priority.
plotting
Relating to plotting and/or visualizations.
post-processing
Concern the post-processing.
TimothyWillard added the enhancement, gempyor, post-processing, medium priority, and plotting labels on Dec 9, 2024.
MacdonaldJoshuaCaleb changed the title from "[Feature request]: Add Output Plotting Option Of Panel Figure With Main Scenario Hub Targets" to "[Feature request]: Add Utilities for Scenario Hub evaluation and submission" on Dec 17, 2024.
MacdonaldJoshuaCaleb changed the title from "[Feature request]: Add Utilities for Scenario Hub evaluation and submission" to "[Feature request]: Add Plotting Utilities for Scenario Hub targets" on Dec 17, 2024.
TimothyWillard changed the title from "[Feature request]: Add Plotting Utilities for Scenario Hub targets" to "[Feature request]: Add Output Plotting Option Of Panel Figure With Main Scenario Hub Targets" on Dec 17, 2024.
MacdonaldJoshuaCaleb changed the title from "[Feature request]: Add Output Plotting Option Of Panel Figure With Main Scenario Hub Targets" to "[Feature request]: Add Output Plottings Option Of Panel Figure With Main Scenario Hub Targets" on Dec 18, 2024.
MacdonaldJoshuaCaleb changed the title from "[Feature request]: Add Output Plottings Option Of Panel Figure With Main Scenario Hub Targets" to "[Feature request]: Add Output Plotting Options Of Panel Figure With Main Scenario Hub Targets" on Dec 18, 2024.
Here's some code for making time-resolved confidence interval plots, per @shauntruelove's request on Slack. Note that formatted is the output as an SMH-formatted data frame (see #430 for details).

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from datetime import timedelta

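def get_week_number(date_obj, start_date):
    # Whole weeks elapsed between start_date and date_obj
    return (date_obj - start_date).days // 7

def get_saturday_date(week_number, start_date):
    # int() guards against NumPy integer week numbers, which timedelta rejects
    return start_date + timedelta(days=(int(week_number) * 7) + 5)
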
# Find the global maximum value across all scenarios
def global_max(loc, formatted):
    maxes = []
    for scenario in formatted['scenario_id'].unique():
        data = formatted[(formatted['age_group'] == '0-130') & (formatted['location'] == loc) & (formatted['scenario_id'] == scenario)]
        pivoted_data = data.pivot(index='run_grouping', columns='horizon', values='value')
        maxes.append(pivoted_data.quantile(0.975).max())
    return np.max(maxes)

# Peak weekly hospitalization incidence per season (2022-23 and 2023-24) for one location
def get_max_season(loc):
    data_hosp = pd.read_csv(r'~/Documents/weekly_flu_incid_complete_fixed.csv')
    data_hosp = data_hosp[(data_hosp['season'] == '2022-23') | (data_hosp['season'] == '2023-24')]
    states = ['Alabama','Alaska','Arizona','Arkansas','California','Colorado','Connecticut','Delaware','District of Columbia',
              'Florida','Georgia','Hawaii','Idaho','Illinois','Indiana','Iowa','Kansas','Kentucky','Louisiana','Maine','Maryland',
              'Massachusetts','Michigan','Minnesota','Mississippi','Missouri','Montana','Nebraska','Nevada','New Hampshire',
              'New Jersey','New Mexico','New York','North Carolina','North Dakota','Ohio','Oklahoma','Oregon','Pennsylvania',
              'Rhode Island','South Carolina','South Dakota','Tennessee','Texas','Utah','Vermont','Virginia','Washington',
              'West Virginia','Wisconsin','Wyoming','US']
    fips = ['01', '02', '04', '05', '06', '08', '09',
            '10', '11', '12', '13', '15', '16', '17',
            '18', '19', '20', '21', '22', '23', '24',
            '25', '26', '27', '28', '29', '30', '31',
            '32', '33', '34', '35', '36', '37', '38',
            '39', '40', '41', '42', '44', '45', '46',
            '47', '48', '49', '50', '51', '53', '54',
            '55', '56', 'US']
    # Map the FIPS code to the state name used in the CSV
    state = states[np.where(np.array(fips)==loc)[0][0]]
    data_hosp = data_hosp[data_hosp['state'] == state]
    maxes = []
    # Seasons are returned in the order they appear in the CSV
    for season in data_hosp['season'].unique():
        hosp_season = data_hosp[data_hosp['season'] == season]
        maxes.append(hosp_season['incidH'].max())
    return np.array(maxes)

# Plausible range for cumulative hospitalizations, based on the two past seasons
def get_cuml_season(loc):
    data_hosp = pd.read_csv(r'~/Documents/weekly_flu_incid_complete_fixed.csv')
    data_hosp = data_hosp[(data_hosp['season'] == '2022-23') | (data_hosp['season'] == '2023-24')]
    states = ['Alabama','Alaska','Arizona','Arkansas','California','Colorado','Connecticut','Delaware','District of Columbia',
              'Florida','Georgia','Hawaii','Idaho','Illinois','Indiana','Iowa','Kansas','Kentucky','Louisiana','Maine','Maryland',
              'Massachusetts','Michigan','Minnesota','Mississippi','Missouri','Montana','Nebraska','Nevada','New Hampshire',
              'New Jersey','New Mexico','New York','North Carolina','North Dakota','Ohio','Oklahoma','Oregon','Pennsylvania',
              'Rhode Island','South Carolina','South Dakota','Tennessee','Texas','Utah','Vermont','Virginia','Washington',
              'West Virginia','Wisconsin','Wyoming','US']
    fips = ['01', '02', '04', '05', '06', '08', '09',
            '10', '11', '12', '13', '15', '16', '17',
            '18', '19', '20', '21', '22', '23', '24',
            '25', '26', '27', '28', '29', '30', '31',
            '32', '33', '34', '35', '36', '37', '38',
            '39', '40', '41', '42', '44', '45', '46',
            '47', '48', '49', '50', '51', '53', '54',
            '55', '56', 'US']
    state = states[np.where(np.array(fips)==loc)[0][0]]
    data_hosp = data_hosp[data_hosp['state'] == state]
    cuml = []
    for season in data_hosp['season'].unique():
        hosp_season = data_hosp[data_hosp['season'] == season]
        # Restrict each season to week 30 of its first year through week 23 of its second year
        if season == '2023-24':
            lb = hosp_season[hosp_season['yr.wk'] == 2023.30].index[0]
            ub = hosp_season[hosp_season['yr.wk'] == 2024.23].index[0]
        else:
            lb = hosp_season[hosp_season['yr.wk'] == 2022.30].index[0]
            ub = hosp_season[hosp_season['yr.wk'] == 2023.23].index[0]
        hosp_season = hosp_season.loc[lb:ub]
        cuml.append(hosp_season['incidH'].sum())
    cuml = np.array(cuml)
    # Pad the observed range by 25% on each side
    return [np.floor(0.75 * np.min(cuml)), np.ceil(1.25 * np.max(cuml))]

def scenario_plot(loc, formatted, display=True):
    states = ['Alabama', 'Alaska', 'Arizona', 'Arkansas', 'California', 'Colorado', 'Connecticut', 'Delaware', 'District of Columbia',
              'Florida', 'Georgia', 'Hawaii', 'Idaho', 'Illinois', 'Indiana', 'Iowa', 'Kansas', 'Kentucky', 'Louisiana', 'Maine', 'Maryland',
              'Massachusetts', 'Michigan', 'Minnesota', 'Mississippi', 'Missouri', 'Montana', 'Nebraska', 'Nevada', 'New Hampshire',
              'New Jersey', 'New Mexico', 'New York', 'North Carolina', 'North Dakota', 'Ohio', 'Oklahoma', 'Oregon', 'Pennsylvania',
              'Rhode Island', 'South Carolina', 'South Dakota', 'Tennessee', 'Texas', 'Utah', 'Vermont', 'Virginia', 'Washington',
              'West Virginia', 'Wisconsin', 'Wyoming', 'US']
    fips = ['01', '02', '04', '05', '06', '08', '09', '10', '11', '12', '13', '15', '16', '17', '18', '19', '20', '21', '22', '23', '24',
            '25', '26', '27', '28', '29', '30', '31', '32', '33', '34', '35', '36', '37', '38', '39', '40', '41', '42', '44', '45', '46',
            '47', '48', '49', '50', '51', '53', '54', '55', '56', 'US']
    state = states[fips.index(loc)]
    # 3 x 2 grid: one panel per scenario, so at most six scenarios fit
    fig, axs = plt.subplots(3, 2, figsize=(10, 15))
    start_date = pd.to_datetime(formatted['origin_date'].unique()[0])
    past_maxes = get_max_season(loc)
    global_max_val = global_max(loc, formatted)
    scenarios = formatted['scenario_id'].unique()
    for i, scenario in enumerate(scenarios):
        row, col = divmod(i, 2)
        data = formatted[(formatted['age_group'] == '0-130') & (formatted['location'] == loc) & (formatted['scenario_id'] == scenario)]
        pivoted_data = data.pivot(index='run_grouping', columns='horizon', values='value')
        quantiles = pivoted_data.quantile([0.025, 0.25, 0.75, 0.975])
        # Shade the 95% interval, then overlay the 50% interval
        axs[row, col].fill_between(pivoted_data.columns, quantiles.loc[0.025], quantiles.loc[0.975], alpha=0.2, color='black')
        axs[row, col].fill_between(pivoted_data.columns, quantiles.loc[0.25], quantiles.loc[0.75], alpha=0.2, color='black')
        axs[row, col].set_ylim(0, global_max_val)
        # Assumes the CSV lists the 2022-23 season before the 2023-24 season
        max_2022_23, max_2023_24 = past_maxes
        # Put the label of the larger maximum above its line and the smaller one below
        if max_2022_23 > max_2023_24:
            axs[row, col].axhline(max_2022_23, color='red', linestyle='--', label=f'2022-23 Max: {int(np.round(max_2022_23))}')
            axs[row, col].text(0, max_2022_23, f'2022-23 Max: {int(np.round(max_2022_23))}', color='red', ha='left', va='bottom')
            axs[row, col].axhline(max_2023_24, color='red', linestyle='--')
            axs[row, col].text(0, max_2023_24, f'2023-24 Max: {int(np.round(max_2023_24))}', color='red', ha='left', va='top')
        else:
            axs[row, col].axhline(max_2023_24, color='red', linestyle='--', label=f'2023-24 Max: {int(np.round(max_2023_24))}')
            axs[row, col].text(0, max_2023_24, f'2023-24 Max: {int(np.round(max_2023_24))}', color='red', ha='left', va='bottom')
            axs[row, col].axhline(max_2022_23, color='red', linestyle='--')
            axs[row, col].text(0, max_2022_23, f'2022-23 Max: {int(np.round(max_2022_23))}', color='red', ha='left', va='top')
        if col == 0:
            axs[row, col].set_ylabel('Weekly Hosp. Incid.')
        else:
            axs[row, col].set_yticklabels([])
        axs[row, col].set_title(f'Scenario: {scenario}')
        axs[row, col].set_xlim([0, 44])
        axs[row, col].set_xticks(pivoted_data.columns[::4])
        # Only the bottom row gets date labels on the x-axis
        if row == 2:
            axs[row, col].set_xticklabels([get_saturday_date(week, start_date).strftime('%Y-%m-%d') for week in pivoted_data.columns[::4]], rotation=45)
        else:
            axs[row, col].set_xticklabels([''] * len(pivoted_data.columns[::4]))
    fig.suptitle(state, ha='center', va='bottom')
    plt.tight_layout()
    # The ./model_plots directory must already exist
    save_path = './model_plots'
    fig.savefig(fname=f'{save_path}/scenariohub_{state}.pdf', bbox_inches='tight')
    if display:
        plt.show()

Usage:
scenario_plot('01', formatted, display=True)
[Figure: sample output from this newest block of code]
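The pivot-and-quantile step at the heart of scenario_plot can be tried in isolation. The following is a minimal sketch, not part of the original comment: quantile_bands is a hypothetical helper name, and the toy frame is synthetic data that only borrows the column names used above (scenario_id, location, age_group, run_grouping, horizon, value). A real SMH-formatted frame (see #430) has more columns.

```python
import numpy as np
import pandas as pd


def quantile_bands(formatted, loc, scenario, age_group="0-130",
                   probs=(0.025, 0.25, 0.75, 0.975)):
    """Hypothetical helper: quantiles (rows) by horizon (columns)."""
    mask = (
        (formatted["age_group"] == age_group)
        & (formatted["location"] == loc)
        & (formatted["scenario_id"] == scenario)
    )
    # One row per run, one column per horizon, as in scenario_plot
    wide = formatted[mask].pivot(index="run_grouping", columns="horizon", values="value")
    return wide.quantile(list(probs))


# Synthetic stand-in for an SMH-formatted frame: 50 runs, 4 weekly horizons
rng = np.random.default_rng(0)
toy = pd.DataFrame(
    [
        {
            "scenario_id": "A",
            "location": "01",
            "age_group": "0-130",
            "run_grouping": run,
            "horizon": horizon,
            "value": float(rng.poisson(100 + 10 * horizon)),
        }
        for run in range(1, 51)
        for horizon in range(1, 5)
    ]
)

bands = quantile_bands(toy, loc="01", scenario="A")
print(bands.round(1))
```

The rows of bands are indexed by probability, so bands.loc[0.025] and bands.loc[0.975] are the lower and upper edges that fill_between shades for the 95% interval.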
Label
enhancement, gempyor, plotting, post-processing
Priority Label
medium priority
Is your feature request related to a problem? Please describe.
This issue was originally reported by @MacdonaldJoshuaCaleb in GH-413.
When assessing scenario plots and/or model fit to empirical data, there are a number of common targets across pathogens that would be useful to plot together with sample time trajectories. This can be done fairly easily with seaborn and subfigures in matplotlib. Here's an implementation of the basic idea for a given list of results, like the list of data frames returned by the gempyor package. This should probably be generalized to read the .parquet files from the model_output folder when using other gempyor functions that populate the model folder, if the inference object is set to save.
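One way the suggested generalization could look is sketched below. This is only an illustration under stated assumptions: load_results is a hypothetical name, the recursive "**/*.parquet" pattern is a guess (the actual layout of the model_output folder is not described here), and pd.read_parquet needs a parquet engine such as pyarrow to be installed. The result is a list of data frames, the same shape of input the plotting code already accepts.

```python
from pathlib import Path

import pandas as pd


def load_results(model_output, pattern="**/*.parquet", reader=pd.read_parquet):
    """Hypothetical loader: read every parquet file under model_output.

    `pattern` is an assumed glob; adjust it to the real folder layout.
    `reader` is injectable so another file reader can be swapped in.
    """
    # Sort for a reproducible order across runs and platforms
    paths = sorted(Path(model_output).glob(pattern))
    return [reader(path) for path in paths]
```

A call such as load_results("model_output") would then return one data frame per file found, in sorted path order.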
Is your feature request related to a new application, scenario round, pathogen? Please describe.
SMH submissions
Describe the solution you'd like
Incorporate something like the above function into the default post-processing plots, with automated post-processing like what currently exists for R-inference runs and will hopefully (soon) exist for emcee runs.
Code without scenario facets:
Code with scenario facets:
These outputs look like
plot_H1N1_High_Vax_Alabama.pdf and
plot_H1N1_scenario_comp_Alabama.pdf, respectively. These two plotting methods share a lot of code but have different interfaces; it probably makes sense to keep the two interfaces and extract the core logic into one function.
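A rough sketch of that refactor follows. The two original plotting functions are not reproduced in this issue, so everything here is an assumption for illustration: the names draw_bands, plot_single_scenario, and plot_scenario_facets are hypothetical, and the shared drawing logic is borrowed from the quantile shading in the scenario_plot comment rather than from the seaborn-based code. The point is only the shape: one core function that draws on a given matplotlib Axes, and two thin wrappers that keep their separate interfaces.

```python
import pandas as pd


def draw_bands(ax, wide, probs=(0.025, 0.25, 0.75, 0.975), color="black"):
    """Core logic (assumed): shade outer and inner quantile bands on `ax`.

    `wide` has one row per run and one column per horizon.
    """
    q = wide.quantile(list(probs))
    lo_outer, lo_inner, hi_inner, hi_outer = (q.loc[p] for p in probs)
    ax.fill_between(wide.columns, lo_outer, hi_outer, alpha=0.2, color=color)
    ax.fill_between(wide.columns, lo_inner, hi_inner, alpha=0.2, color=color)
    return q


def _to_wide(formatted, loc, scenario, age_group="0-130"):
    """Filter to one location and scenario, then pivot runs by horizon."""
    mask = (
        (formatted["age_group"] == age_group)
        & (formatted["location"] == loc)
        & (formatted["scenario_id"] == scenario)
    )
    return formatted[mask].pivot(index="run_grouping", columns="horizon", values="value")


def plot_single_scenario(ax, formatted, loc, scenario):
    """Interface 1 (no facets): one scenario drawn on one Axes."""
    q = draw_bands(ax, _to_wide(formatted, loc, scenario))
    ax.set_title(f"Scenario: {scenario}")
    return q


def plot_scenario_facets(axes, formatted, loc):
    """Interface 2 (facets): one Axes per scenario, reusing interface 1."""
    scenarios = formatted["scenario_id"].unique()
    return {
        scenario: plot_single_scenario(ax, formatted, loc, scenario)
        for ax, scenario in zip(axes, scenarios)
    }
```

With matplotlib, the faceted version could be driven by something like fig, axs = plt.subplots(3, 2) followed by plot_scenario_facets(axs.ravel(), formatted, '01'), while the single-scenario version takes the Axes from plt.subplots() directly.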