-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Unify IOC/COOPS API #125
Unify IOC/COOPS API #125
Conversation
@SorooshMani-NOAA I pushed on the @brey & @tomsail if you find the time to checkout the |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #125 +/- ##
==========================================
- Coverage 90.18% 89.89% -0.29%
==========================================
Files 13 16 +3
Lines 1049 1366 +317
Branches 155 190 +35
==========================================
+ Hits 946 1228 +282
- Misses 56 78 +22
- Partials 47 60 +13 ☔ View full report in Codecov by Sentry. |
COOPS web API uses redirects, so we need to make sure we always follow redirects for COOPS: |
@pmav99 do you have any suggestions for debugging issues in the multithread/process calls by breakpoint? |
From the top of my head, no, not really. Do the functions work when you run them directly? |
I wanted to attach the debugger and step through the code when testing different inputs. For now I got away with print when I see issues, but I wanted to know if there's any better way of doing it. |
The problem with using But, TBH, this question feels a bit strange. Just write and test the functions that retrieve and parse the data without any concurrency. As soon as the That being said, what can help with debugging multiprocessing/multithreading code is using the logging module, which if configured correctly, can print information about the current thread/process. For example: import logging
import time
import multifutures as mf
logging.basicConfig(
level=10,
style="{",
format="{levelname:8s}; {process:6d}; {threadName:8s}; {asctime:s}; {name:<15s} {lineno:4d}; {message:s}",
force=True,
)
logger = logging.getLogger(__name__)
def foo(a):
time.sleep(0.3)
logger.info("The value of a=%d", a)
logger.info("Start Multiprocessing")
results = mf.multiprocess(foo, [{"a": i} for i in range(5)])
logger.info("Start Multithreading")
results = mf.multithread(foo, [{"a": i} for i in range(5)])
logger.info("Done") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good.
If there are going to be so many functions that are shared between COOPS and IOC we should extract them to a separate module and import them from there.
I agree, I tried to stay as close to the updated IOC code as possible, that's why I basically copy pasted everything and then started modifying. Some of these can be easily put in utils to be used by all the datasource APIs, but also I don't want to start abstracting too soon. |
WRT to the names mapping, I haven't tried this but I was wondering whether it is actually simpler to do something like this:
Yes, the code will be longer and repetitive, but we have no code generation, no |
@pmav99 it's a good idea, I'll do that! |
@pmav99 I'm trying to decide what to do with defaults for datetimes. The issue is for some products:
To resolve the first issue I can just create coops specific But specifying I'm not sure what's best. Should I just leave it to the user and say the default might not work for some products? Or should we add more dictionaries for info about what is the default start date range (relative to now) for each product? I'm not even sure if these are true for all stations! Should I first consult with COOPS? If it's the latter, I think it makes sense to for now go with "default might not work" and later amend. Actually another problem is whether I have to raise error if there's no data or just warn the user? Coops gives me an error message when there's no data. |
searvey/coops.py
Outdated
if "error" in result.result: | ||
msg = json.loads(result.result)["error"]["message"] | ||
logger.error("Encountered an error response!") | ||
raise ValueError(msg) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pmav99 should I raise
or ignore? What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think ignore is the correct approach. When dealing with historical data it's not that uncommon to have big gaps. E.g. a station might have been offline for a few months (broken instrument or whatever). This does not mean that the final dataframe will be empty. other requests/urls might return data. Even if the final dataframe is empty, it is not difficult for the user deal with it.
I guess we could add a raise_if_empty
argument to fetch_*()
but I don't think it offers that much value.
@pmav99 except for the two questions above, I'm done with COOPS API. I know that the metadata API needs some adjustment, but I'm assuming we want to keep that separate from the data API update, right? |
WRT the default dates. I have to admit that I haven't really looked into this, but couldn't we have a mapping with the default durations? If the data are not too big we could even return the maximum interval by default. Am I missing something? |
I agree that we might be able to cover most of cases with dictionaries, but I don't want to add too many dicts, like one for how long should I go back before I start having data (e.g. in case of mean, hi/lo, etc.), or how long can I go before there's no data (old data removed (e.g. one minute heights), and so on. I just don't want to go around adding a dict for each of these limitations, especially for these ones that don't have anything in documentation and I just found by trial and error! For now I think I'll just go with let the user figure out ... let's address this later. This PR is already a lot of changes and dictionaries! |
1578e98
to
11e15dc
Compare
@pmav99 I blobbed all of the changes for COOPS:
Note that in this case the tests will fail after first and second commit, since all the coops related tests are updated in the third commit. Since refactoring requires rewriting the tests I cannot combine the 3rd and 1st commits either, it still fails. Unless you prefer I split the 3rd commit and add each part to the 1st and 2nd commit Please let me know what to do |
@pmav99 please let me know if I need to change anything else |
9357962
to
18b6a92
Compare
@SorooshMani-NOAA I disabled the progress bar by default on the new api. let me know what you think about this |
@pmav99 looks good to me. Please go ahead and merge if you're happy with everything as well, thank you! |
- COOPS ignore errors vs raise - Add more documentation for COOPS - Get all types of stations
IOC is always on UTC.
This argument controls whether the tqdm based progress bar will be shown or not.
Addresses #99