How to deal with obs and simh if they have different time lengths? #65
-
Beta Was this translation helpful? Give feedback.
Replies: 1 comment 2 replies
-
Hey @Pan-Yuxian, thank you! I'm not sure about your data, about have you tried to use less quantiles? If you have 30 years of I assume "daily" data, i.e. 10.950 days in total, having 2000 quantiles, means 2000 bins in which these 10 K values get assigned. That can be a problem as QM uses interpolation (see https://python-cmethods.readthedocs.io/en/latest/src/methods.html#quantile-mapping). I tried to reproduce it on some data of mine, but was not able to plot even something that comes into the direction of yours. Could you provide your obs and simh data sets to contact@b-schwertfeger.de? These are my tries, for example using the data sets provided in this repository: from cmethods import adjust
from cmethods.distribution import quantile_mapping
import xarray as xr
import matplotlib.pyplot as plt
obs = xr.open_dataset("examples/input_data/observations.nc")["tas"]
simp = xr.open_dataset("examples/input_data/control.nc")["tas"]
simh = simp.copy(deep=True)[3650:]
#bc = adjust(
# method="quantile_mapping",
# obs=obs,
# simh=simh,
# simp=simh,
# n_quantiles=200,
#)
kwargs = {"n_quantiles":2000, "kind":"+"}
qm_adjusted = xr.apply_ufunc(quantile_mapping,
obs,
simh.rename({"time": "t1"}),
simp.rename({"time": "t2"}),
dask = "parallelized",
vectorize = True,
input_core_dims = [["time"], ["t1"], ["t2"]],
output_core_dims = [["t2"]],
kwargs = dict(kwargs)
)
qm_adjusted = qm_adjusted.rename({"t2": "time"}).transpose(*obs.dims)
plt.figure(figsize=(10,5),dpi=216)
obs.groupby("time.dayofyear").mean(...).plot(label="$T_{obs,h}$",color="black")
simh.groupby("time.dayofyear").mean(...).plot(label="$T_{sim,h}$",color="blue")
simp.groupby("time.dayofyear").mean(...).plot(label="$T_{sim,p}$",color="red")
bc["tas"].groupby("time.dayofyear").mean(...).plot(label="$T^{QM}_{sim,p}$",color="green")
plt.title("Historical modeled and obseved temperatures; and corrected temperatures")
plt.xlim(0,365)
plt.gca().grid(alpha=.3)
plt.legend(); Also thanks a lot for bringing my attention to the restriction to equally sized input data for the control period. I will adjust this part to avoid implementing workarounds like yours. |
Beta Was this translation helpful? Give feedback.
AH - yes, I have seen my fault. I have found the reason for this. It was a bit tricky. The problem is that the cumulative distribution functions for obs and simh will have the length of n_quantiles +1 BUT because of the different lengths of these data sets, they dont count up to the same number. Thus when having less values for simh thant for obs, we will always miss high values.