predict(tidem()) uses all constituents, unlike T_TIDE #1654

dankelley · 2020-02-08T12:30:58Z

Preamble this stems from #1653 but I wanted to split it, so the title would be informative for users. (Question for @richardsc and @clayton33 at the end. I'd like to decide this within a week, so it's not a terrible rush and you two might find it helpful to talk in person next week sometime.)

I can see some merit in what T_TIDE is doing but it does seem to be an odd approach for a default, since it is not analogous to how regression tends to be done ... but maybe there is a history for that, in the Foreman methodology and in the methodologies that he was building on. (This is all very old stuff; Foreman was just the first that I know of to document his work, and to share his code widely.)

I don't really want to add this 'ignore ill-defined constituents' in oce, because it is just a bit weird, e.g. is the SNR the right measure? what is the critical value? Also, if the user wants, they can just do the full tidem, look at the summary, and then decide for themselves, based on whatever they think about p values, what to do next. That's the R way, and I think the R designers (really, the S designers, and the stats people going back decades before S) knew what they were doing. Plus, R users are familiar with the approach of looking at summary output and making decisions on regression models.

I would love to get your opinion on this side issue of ignoring constituents in tidem predictions, Clark. Maybe it should be a different issue.

As @richardsc pointed out in #1653 and as came up in our f2f a few days ago: oce::predict,tidem-method (i.e. a prediction based on tidem() output) uses all the constituents, not just the ones that are considered significant. By contrast, the matlab program, T_TIDE, uses only those constituents that are considered trustworthy, based on a signal-to-noise (SNR) ratio.

The oce::predict,tidem-method approach is consistent with how lm() and similar functions work in R.

I can see some merit in what T_TIDE is doing. However, it seem to be an odd approach for a default, since it is not analogous to how regression tends to be done in general analysis. Maybe there is a history for this T_TIDE approach (I've not checked [1] yet). Maybe it goes back to the Foreman methodology, and the methodologies that he was building on.

I don't propose to add this 'ignore ill-defined constituents' in oce, because it is just a bit weird. Do we know that the SNR the right measure? How do we decide on the critical SNR value? And is outright removal sensible (as opposed to downweighting as is done for robust statistical methods)?

Also, if the oce user wants, they can just do the full tidem, look at the summary, and then decide for themselves, based on whatever they think about p values, what to do next. That's the familiar R way, and I think the R designers (really, the S designers, and the stats people going back decades before S) knew what they were doing.

My proposal is to keep the behaviour the same in oce, but to document the difference to T_TIDE. Is this OK, @richardsc and @clayton33? Please respond with a comment, or a thumbs-up or thumbs-down. And thanks for looking at such a long comment!

References

Pawlowicz, Rich, Bob Beardsley, and Steve Lentz. “Classical Tidal Harmonic Analysis Including Error Estimates in MATLAB Using T_TIDE.” Computers & Geosciences 28, no. 8 (2002): 929–37.

The text was updated successfully, but these errors were encountered:

stale · 2020-03-09T13:56:27Z

This issue has been automatically marked as stale because no comments have been made in a month. Unless comments are made in the next week (or the 'pinned' label is applied), this issue will be closed automatically in a week.

stale · 2021-06-11T02:02:57Z

This issue has been automatically marked as stale because no comments have been made in two weeks. Unless comments are made in the next two weeks (or the 'pinned' label is applied), this issue will be closed automatically in a week.

stale · 2021-07-11T20:55:38Z

This issue has been closed automatically, because no comments have been made in the two weeks since inactivity caused the 'stale' bot to mark it as stale.

dankelley added development tidem discussion labels Feb 8, 2020

dankelley self-assigned this Feb 8, 2020

stale bot added the stale label Mar 9, 2020

dankelley added the pinned Prevent automatic stale designation label Mar 9, 2020

stale bot removed the stale label Mar 9, 2020

dankelley removed the pinned Prevent automatic stale designation label Apr 5, 2021

stale bot added the stale label Jun 11, 2021

stale bot closed this as completed Jul 11, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

predict(tidem()) uses all constituents, unlike T_TIDE #1654

predict(tidem()) uses all constituents, unlike T_TIDE #1654

dankelley commented Feb 8, 2020

stale bot commented Mar 9, 2020

stale bot commented Jun 11, 2021

stale bot commented Jul 11, 2021

predict(tidem()) uses all constituents, unlike T_TIDE #1654

predict(tidem()) uses all constituents, unlike T_TIDE #1654

Comments

dankelley commented Feb 8, 2020

stale bot commented Mar 9, 2020

stale bot commented Jun 11, 2021

stale bot commented Jul 11, 2021