Running de-client --stop-channel <channel> results in KeyError #275
Comments
Adding some steps to reproduce at @knoepfel's request. Running on fermicloud117, this error occurs on every channel we tested (and, we believe, on every channel) if you start the decisionengine and then stop any channel:
Then, for the above example, look in /var/log/decisionengine/Nersc.log. We can see that this error occurs when stopping a channel, and not before, by tailing the log from one terminal and running the above from another (
Handle KeyError exception. Issue: HEPCloud/decisionengine#275. RB: https://fermicloud140.fnal.gov/reviews/r/363/
As of the current 1.6 release, it still throws a KeyError on StopChannel. Whatever was supposed to handle this exception didn't work right.
2021-03-19T10:56:08-0500 - root - TaskManager - 25328 - Thread-3 - ERROR - error in decision cycle(publishers)
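For anyone scanning these logs by script rather than by eye, here is a minimal sketch that splits a line of the shape quoted above into fields and keeps only the ERROR records. The field names (`timestamp`, `logger`, `module`, `pid`, `thread`, `level`, `message`) are my guesses based on that single sample line, not names taken from the decisionengine logging configuration, and `parse_log_line` and `error_records` are hypothetical helpers.

```python
from typing import Iterable, Iterator, NamedTuple, Optional


class LogRecord(NamedTuple):
    # Field names are assumptions inferred from one sample line.
    timestamp: str
    logger: str
    module: str
    pid: int
    thread: str
    level: str
    message: str


def parse_log_line(line: str) -> Optional[LogRecord]:
    """Parse one ' - '-separated log line; return None if it does not fit."""
    # Cap the split at 6 so a message containing ' - ' stays in one piece.
    parts = line.rstrip("\n").split(" - ", 6)
    if len(parts) != 7:
        return None  # e.g. traceback continuation lines
    timestamp, logger, module, pid, thread, level, message = parts
    if not pid.isdigit():
        return None
    return LogRecord(timestamp, logger, module, int(pid), thread, level, message)


def error_records(lines: Iterable[str]) -> Iterator[LogRecord]:
    """Yield only the records whose level is ERROR."""
    for line in lines:
        record = parse_log_line(line)
        if record is not None and record.level == "ERROR":
            yield record
```

Lines that do not match the seven-field shape, such as traceback continuation lines, are skipped rather than raising, so the exception text itself would have to be picked up separately.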
Now there are no KeyErrors at shutdown of the decision engine with 1.6.1. Closing.
While troubleshooting decisionengine_modules issue 200, we found that running
de-client --stop-channel <channel_name>
produced a KeyError in TaskManager.decision_cycle that is visible in the logs. This is true for multiple channels (we tested Nersc, gce, resource_request, and a dummy no-op channel called "test_channel"). Here are two examples of the errors we found, for Nersc and gce respectively:
The de-client command returned "OK", so unless one looked at the logs, there would be no way of knowing that the error had occurred.
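Since the command's own output hides the problem, one way to catch it is to note the size of the channel log, run the command, and then inspect only what was appended. The following is a rough sketch under stated assumptions: `new_errors_after` is a hypothetical helper, the marker strings are guesses at what an error line contains, and the `settle_seconds` pause is there because the error may be logged from a separate thread slightly after the command returns. Log rotation is not handled.

```python
import subprocess
import time
from pathlib import Path
from typing import List, Sequence, Tuple


def new_errors_after(
    command: Sequence[str],
    log_path: str,
    markers: Sequence[str] = ("KeyError", " - ERROR - "),
    settle_seconds: float = 0.0,
) -> Tuple[str, List[str]]:
    """Run `command`, then return its stdout and any new suspicious log lines."""
    log = Path(log_path)
    # Remember where the log ended before the command ran.
    offset = log.stat().st_size if log.exists() else 0

    result = subprocess.run(list(command), capture_output=True, text=True)

    # Give a background thread time to write its message (assumption).
    if settle_seconds > 0:
        time.sleep(settle_seconds)

    if not log.exists():
        return result.stdout.strip(), []

    # Read in binary so the byte offset from stat() is valid for seek().
    with log.open("rb") as fh:
        fh.seek(offset)
        appended = fh.read().decode("utf-8", errors="replace").splitlines()

    hits = [line for line in appended if any(m in line for m in markers)]
    return result.stdout.strip(), hits
```

For the Nersc case described in this issue, the call would look like `new_errors_after(["de-client", "--stop-channel", "Nersc"], "/var/log/decisionengine/Nersc.log", settle_seconds=5)`; a non-empty second element alongside an "OK" in the first is exactly the situation reported here.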