Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

predict(tidem()) uses all constituents, unlike T_TIDE #1654

Closed
dankelley opened this issue Feb 8, 2020 · 3 comments
Closed

predict(tidem()) uses all constituents, unlike T_TIDE #1654

dankelley opened this issue Feb 8, 2020 · 3 comments

Comments

@dankelley
Copy link
Owner

Preamble this stems from #1653 but I wanted to split it, so the title would be informative for users. (Question for @richardsc and @clayton33 at the end. I'd like to decide this within a week, so it's not a terrible rush and you two might find it helpful to talk in person next week sometime.)

I can see some merit in what T_TIDE is doing but it does seem to be an odd approach for a default, since it is not analogous to how regression tends to be done ... but maybe there is a history for that, in the Foreman methodology and in the methodologies that he was building on. (This is all very old stuff; Foreman was just the first that I know of to document his work, and to share his code widely.)

I don't really want to add this 'ignore ill-defined constituents' in oce, because it is just a bit weird, e.g. is the SNR the right measure? what is the critical value? Also, if the user wants, they can just do the full tidem, look at the summary, and then decide for themselves, based on whatever they think about p values, what to do next. That's the R way, and I think the R designers (really, the S designers, and the stats people going back decades before S) knew what they were doing. Plus, R users are familiar with the approach of looking at summary output and making decisions on regression models.

I would love to get your opinion on this side issue of ignoring constituents in tidem predictions, Clark. Maybe it should be a different issue.

As @richardsc pointed out in #1653 and as came up in our f2f a few days ago: oce::predict,tidem-method (i.e. a prediction based on tidem() output) uses all the constituents, not just the ones that are considered significant. By contrast, the matlab program, T_TIDE, uses only those constituents that are considered trustworthy, based on a signal-to-noise (SNR) ratio.

The oce::predict,tidem-method approach is consistent with how lm() and similar functions work in R.

I can see some merit in what T_TIDE is doing. However, it seem to be an odd approach for a default, since it is not analogous to how regression tends to be done in general analysis. Maybe there is a history for this T_TIDE approach (I've not checked [1] yet). Maybe it goes back to the Foreman methodology, and the methodologies that he was building on.

I don't propose to add this 'ignore ill-defined constituents' in oce, because it is just a bit weird. Do we know that the SNR the right measure? How do we decide on the critical SNR value? And is outright removal sensible (as opposed to downweighting as is done for robust statistical methods)?

Also, if the oce user wants, they can just do the full tidem, look at the summary, and then decide for themselves, based on whatever they think about p values, what to do next. That's the familiar R way, and I think the R designers (really, the S designers, and the stats people going back decades before S) knew what they were doing.

My proposal is to keep the behaviour the same in oce, but to document the difference to T_TIDE. Is this OK, @richardsc and @clayton33? Please respond with a comment, or a thumbs-up or thumbs-down. And thanks for looking at such a long comment!

References

  1. Pawlowicz, Rich, Bob Beardsley, and Steve Lentz. “Classical Tidal Harmonic Analysis Including Error Estimates in MATLAB Using T_TIDE.” Computers & Geosciences 28, no. 8 (2002): 929–37.
@stale
Copy link

stale bot commented Mar 9, 2020

This issue has been automatically marked as stale because no comments have been made in a month. Unless comments are made in the next week (or the 'pinned' label is applied), this issue will be closed automatically in a week.

@stale stale bot added the stale label Mar 9, 2020
@dankelley dankelley added the pinned Prevent automatic stale designation label Mar 9, 2020
@stale stale bot removed the stale label Mar 9, 2020
@dankelley dankelley removed the pinned Prevent automatic stale designation label Apr 5, 2021
@stale
Copy link

stale bot commented Jun 11, 2021

This issue has been automatically marked as stale because no comments have been made in two weeks. Unless comments are made in the next two weeks (or the 'pinned' label is applied), this issue will be closed automatically in a week.

@stale stale bot added the stale label Jun 11, 2021
@stale
Copy link

stale bot commented Jul 11, 2021

This issue has been closed automatically, because no comments have been made in the two weeks since inactivity caused the 'stale' bot to mark it as stale.

@stale stale bot closed this as completed Jul 11, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant