-
Notifications
You must be signed in to change notification settings - Fork 680
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
extract_input_target_forcings add option for left-justification of train/eval #56
base: main
Are you sure you want to change the base?
extract_input_target_forcings add option for left-justification of train/eval #56
Conversation
Provides optionality for a left-justified evaluation and training with kwarg justify = 'left' or 'right'
Tried to accomplish the CLA...may be doing it wrong. If any clarification is available that'd be great. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would need to be rebased.
|
||
|
||
def extract_input_target_times( | ||
dataset: xarray.Dataset, | ||
input_duration: TimedeltaLike, | ||
target_lead_times: TargetLeadTimes, | ||
justify: str |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is probably better as an enum instead of string.
if justify == 'left': | ||
# Inputs correspond to the first time elements within the input duration | ||
# Targets follow immediatly after per the target lead times | ||
target_start_time = int(input_duration.total_seconds()/3600/6) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This shouldn't be 6h specific.
inputs = dataset.isel(time=slice(int(target_start_time))) | ||
inputs['time'] = inputs['time'] - input_duration + time[1] | ||
|
||
targets = dataset.isel(time=slice(target_start_time,target_end_time)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please take a look at the google python style guide. Some general comments:
- There should be spaces between function arguments, no trailing spaces, etc.
- Comments are in english and should use punctuation (eg missing periods above).
- Remove debugging statements (commented out code, print statements below).
@@ -256,6 +216,16 @@ def extract_input_target_times( | |||
(inclusive) lead times, or a sequence of lead times. Lead times should be | |||
Timedeltas (or something convertible to). They are given relative to the | |||
final input timestep, and should be positive. | |||
justify: Defines whether inputs and targets are extracted from the beginning |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any chance you can add a test that this does what you'd expect?
I'll work through these in the next week or two when able. And yes, I can show that it's working as expected. |
Provides optionality for a left-justified evaluation and training with kwarg justify = 'left' or 'right'. Changes are from lines 281 - 322, line 361, and line 381.
Per issue discussion with Timo Ewalds #32
justify = 'left' is now the default and I believe that makes sense based on normal user expectation of left to right progression of time when doing a forecast and selecting data.
To explain what is different now, I'll use the following example.
We load dataset_source-era5_date-2022-01-01_res-1.0_levels-13_steps-12.nc in the Jupyter Notebook.
There are 14 time periods along the data dimension for this netcdf: 2 initial states + 12 steps
Let's visualize them as: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14
We set Eval Steps and Train Steps to 4 on the accompanying slider and extract the data with extract_input_target_forcings.
With the previous extract_input_target_forcings function, the inputs would've been 9, 10 and the targets would've been 11, 12, 13, 14 for the designated Eval and Train Steps. This is a bit misleading because if I want to do a 24-hour forecast starting at 0Z 01-01-2022, I expect my forecast end time to be 0Z 01-02-2022. In the right justified method, the forecast always ends on the last time element within the example_data.
With the new left-justified default, the inputs are always the 1 and 2, with the targets following immediately after. So in the above example, that would be 1, 2 for inputs and 3, 4, 5, 6 for targets.