Functional performance #99
-
I did the same analysis for discharge. The x-axis in these plots is the difference in transfer entropy (modeled minus observed) from discharge (seg_upstream_inflow) to do_min, do_mean, and do_max. Again, x = 0 means the model is matching the patterns in the observed data. The interesting thing about discharge is that it is functionally an input to the metab and multitask models (through the Zt term), but not to the baseline model.
-
Cool, @galengorski. Thanks for putting this together. I like the approach; it's nice to have something to chew on. Thinking about the multitask model: for the train/val sites, the multitask model slightly decreased predictive performance, but it also decreased functional performance. This is very interesting! I'm very interested to see the functional performance at one (or all) of the validation sites, since the multitask model increased predictive performance there. How hard would it be for you to produce similar plots for a val site? Also, I just wanted to note that we shouldn't put much stock into the analysis of the …
-
Using functional performance to compare the baseline model to the metab_dense model. The baseline model predicts do_min, do_mean, and do_max directly, while the metab_dense model predicts GPP, ER, K, temperature, and depth, then uses a dense layer with tuned weights to convert those predictions into do_min, do_mean, and do_max.

The plot shows the transfer entropy from solar radiation to do_min, do_mean, and do_max for the baseline (blue) and metab_dense (red) models at different time lags, averaged across all sites. Transfer entropy (TE) is the amount of uncertainty in the DO variable that is reduced by knowledge of solar radiation at that time lag, independent of the DO variable's own history. The TE is normalized by the entropy of the DO variable, so a TE value of 0.05 at a time lag of 1 day means that knowing yesterday's solar radiation reduces uncertainty in today's DO by 5%.

Focusing first on the black line, at a time lag of 1 day do_max has the highest TE compared to do_mean and do_min. This makes sense, as do_max is more strongly influenced by solar radiation (through GPP) than the other variables. These plots show that both models are underutilizing solar radiation across all time lags; however, the metab_dense model does a better job of representing the relationship between solar radiation and DO. One interpretation is that the metab_dense model better represents the solar radiation to do_max relationship because it is explicitly trained on GPP, which encodes the linkage between solar radiation and do_max.
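For anyone wanting to reproduce this kind of calculation, a normalized transfer entropy like the one described above can be sketched with a simple histogram-based estimator. This is a minimal sketch, not the exact estimator used in the analysis; the function name, bin count, and one-step conditioning history are all assumptions:

```python
import numpy as np

def _entropy(counts):
    """Shannon entropy in bits from a (possibly multi-dim) histogram."""
    p = counts[counts > 0] / counts.sum()
    return -np.sum(p * np.log2(p))

def transfer_entropy(source, target, lag=1, bins=8):
    """Normalized transfer entropy TE(source -> target) at a given lag.

    Estimates TE = I(Y_t ; X_{t-lag} | Y_{t-1}) with fixed-width
    histogram bins, normalized by H(Y_t), so the result reads as the
    fraction of uncertainty in Y removed by the lagged source beyond
    what Y's own history explains. Simplified Ruddell-style sketch;
    the real analysis may differ in binning, conditioning history,
    and significance testing. Requires lag >= 1.
    """
    y_t = target[lag:]            # present value of the target
    y_hist = target[lag - 1:-1]   # 1-step history of the target
    x_lag = source[:-lag]         # lagged source

    # Joint histogram over (x_lag, y_hist, y_t) and its marginals
    h_xyz, _ = np.histogramdd(np.column_stack([x_lag, y_hist, y_t]),
                              bins=bins)
    h_xy = h_xyz.sum(axis=2)      # (x_lag, y_hist)
    h_yz = h_xyz.sum(axis=0)      # (y_hist, y_t)
    h_y = h_xyz.sum(axis=(0, 2))  # y_hist alone
    h_z = h_xyz.sum(axis=(0, 1))  # y_t alone

    # I(Y_t ; X | Y_hist) = H(Y_t,Y_hist) + H(X,Y_hist)
    #                       - H(Y_hist) - H(X,Y_hist,Y_t)
    te = _entropy(h_yz) + _entropy(h_xy) - _entropy(h_y) - _entropy(h_xyz)
    h_target = _entropy(h_z)      # H(Y_t), the normalizer
    return te / h_target if h_target > 0 else 0.0
```

With daily series, `transfer_entropy(solar, do_max, lag=1)` would give the "knowing yesterday's solar radiation reduces uncertainty in today's DO by X%" number from the discussion above.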
-
Now focusing on a lag of one day, where the TE for do_max peaks, we can calculate the functional performance as the difference between the modeled and the observed transfer entropy for each model. We calculate functional performance for each site individually to see where it improves the most.

Here functional performance is on the y-axis; 0 is optimal, meaning the modeled solar radiation to DO relationship exactly matches the observed relationship. On the x-axis is the NHD segment canopy coverage within a 100 meter buffer. I wouldn't necessarily expect a strong relationship between canopy coverage and the solar radiation to DO relationship, since solar radiation is measured above the canopy, but you can see that the metab_dense model improves functional performance across almost all sites for all DO variables. This suggests that by using solar radiation to predict GPP and then DO (red dots), instead of predicting DO directly (blue dots), the solar radiation to DO relationship is more accurately represented across all sites and DO variables.
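The per-site calculation above amounts to a simple difference and a check of which model's difference is closer to zero. A minimal sketch, with site names and TE values that are purely illustrative placeholders (not numbers from the study):

```python
# Hypothetical per-site normalized TE values (solar radiation -> do_max)
# at the 1-day lag. All numbers below are made up for illustration.
sites = {
    "site_A": {"observed": 0.12, "baseline": 0.07, "metab_dense": 0.11},
    "site_B": {"observed": 0.09, "baseline": 0.05, "metab_dense": 0.08},
    "site_C": {"observed": 0.15, "baseline": 0.10, "metab_dense": 0.14},
}

def functional_performance(modeled_te, observed_te):
    """Modeled minus observed TE. 0 is optimal; a negative value means
    the model under-uses the driver ("over-random"), a positive value
    means it leans on the driver too heavily ("over-deterministic")."""
    return modeled_te - observed_te

# A model improves functional performance at a site when |FP| shrinks.
for name, te in sites.items():
    fp_base = functional_performance(te["baseline"], te["observed"])
    fp_dense = functional_performance(te["metab_dense"], te["observed"])
    print(f"{name}: baseline FP={fp_base:+.2f}, "
          f"metab_dense FP={fp_dense:+.2f}, "
          f"improved={abs(fp_dense) < abs(fp_base)}")
```

Plotting `fp_base` and `fp_dense` for each site against canopy coverage reproduces the structure of the figure described above.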
-
To try to understand why different versions of the model were performing differently, I analyzed the models' functional performance. The analysis is based on this paper by Ruddell et al. I focused on a single site, 01481000, Chadds Ford on Brandywine Creek, which is a training and validation site. I used model results from 3 models:
0_baseline_lstm -- predicting DO
1a_lstm_metab_just_metab -- predicting metabolism and physical parameters and using deterministic equations to predict DO
2_metab_multitask -- predicting DO, metabolism, and physical parameters

The plot below shows the RMSE of the validation and training data together on the y-axis. For the x-axis, I calculated the transfer entropy between air temperature (seg_tave_air) and DO (min, mean, max), then took the difference between the modeled and the observed transfer entropy. An x value of 0 indicates that the model is extracting the same amount of information from air temperature as is present in the observations; a positive value indicates an "over-deterministic" model, and a negative value indicates "over-random" behavior. The models are color coded.