Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

extract_input_target_forcings add option for left-justification of train/eval #56

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

tvonich
Copy link

@tvonich tvonich commented Jan 18, 2024

Provides optionality for a left-justified evaluation and training with kwarg justify = 'left' or 'right'. Changes are from lines 281 - 322, line 361, and line 381.

Per issue discussion with Timo Ewalds #32

justify = 'left' is now the default and I believe that makes sense based on normal user expectation of left to right progression of time when doing a forecast and selecting data.

To explain what is different now, I'll use the following example.

We load dataset_source-era5_date-2022-01-01_res-1.0_levels-13_steps-12.nc in the Jupyter Notebook.

There are 14 time periods along the data dimension for this netcdf: 2 initial states + 12 steps

Let's visualize them as: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14

We set Eval Steps and Train Steps to 4 on the accompanying slider and extract the data with extract_input_target_forcings.

With the previous extract_input_target_forcings function, the inputs would've been 9, 10 and the targets would've been 11, 12, 13, 14 for the designated Eval and Train Steps. This is a bit misleading because if I want to do a 24-hour forecast starting at 0Z 01-01-2022, I expect my forecast end time to be 0Z 01-02-2022. In the right justified method, the forecast always ends on the last time element within the example_data.

With the new left-justified default, the inputs are always the 1 and 2, with the targets following immediately after. So in the above example, that would be 1, 2 for inputs and 3, 4, 5, 6 for targets.

Provides optionality for a left-justified evaluation and training with kwarg justify = 'left' or 'right'
@tvonich
Copy link
Author

tvonich commented Feb 1, 2024

Tried to accomplish the CLA...may be doing it wrong. If any clarification is available that'd be great.

Copy link
Contributor

@tewalds tewalds left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would need to be rebased.



def extract_input_target_times(
dataset: xarray.Dataset,
input_duration: TimedeltaLike,
target_lead_times: TargetLeadTimes,
justify: str
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is probably better as an enum instead of string.

if justify == 'left':
# Inputs correspond to the first time elements within the input duration
# Targets follow immediatly after per the target lead times
target_start_time = int(input_duration.total_seconds()/3600/6)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shouldn't be 6h specific.

inputs = dataset.isel(time=slice(int(target_start_time)))
inputs['time'] = inputs['time'] - input_duration + time[1]

targets = dataset.isel(time=slice(target_start_time,target_end_time))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please take a look at the google python style guide. Some general comments:

  • There should be spaces between function arguments, no trailing spaces, etc.
  • Comments are in english and should use punctuation (eg missing periods above).
  • Remove debugging statements (commented out code, print statements below).

@@ -256,6 +216,16 @@ def extract_input_target_times(
(inclusive) lead times, or a sequence of lead times. Lead times should be
Timedeltas (or something convertible to). They are given relative to the
final input timestep, and should be positive.
justify: Defines whether inputs and targets are extracted from the beginning
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any chance you can add a test that this does what you'd expect?

@tvonich
Copy link
Author

tvonich commented Feb 23, 2024

I'll work through these in the next week or two when able. And yes, I can show that it's working as expected.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants