Errno 90 Stuck loop #27

Open
SpiralBlu opened this issue May 31, 2024 · 3 comments

Comments

SpiralBlu commented May 31, 2024

For the last few weeks we have been getting "[Errno 90] Message too long" errors that leave the UDPStreamer in udp.py stuck in a loop.

I noticed that the except block just waits and retries, so it keeps trying to send the same log in an endless loop. I replaced the retry with a break (keeping the log line so I can see when it happens). With that change we keep sending logs (probably from a new source port) and no longer have to kill processes or reboot; previously ./encore.sh stop did not work and just hung. This appears to fix the endless loop, which had stopped log delivery until someone intervened manually.

We have a case open with Cisco to find out what is sending these logs that exceed the 1500-byte MTU (maybe the streamer could write a too-long log to a separate file for review?). Our network MTU is 1500 and that is not going to change anytime soon. We believe these may be truncated logs ending in ... that break our JSON parsers, and we would rather the eNcore streamer drop them.

RHEL is the platform we're using but we were seeing the same issues on Debian before having to migrate. That VM has been wiped since.

Ex:

except OSError as ex:
    self.logger.error("Error [{0}] writing to endpoint {1}:{2} -- Restarting UDPStream...".format(ex, self.host, self.port))
    break
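
A minimal sketch of the same idea outside the eNcore code base: treat EMSGSIZE as a permanent failure and drop the record, while still retrying other socket errors a bounded number of times. The function name `send_or_drop`, the `send` callable, and `max_retries` are hypothetical and not part of udp.py.

```python
import errno
import logging

logger = logging.getLogger("udp_sketch")


def send_or_drop(send, data, max_retries=3):
    """Try to send one record; return True if sent, False if dropped.

    `send` is any callable that transmits bytes and raises OSError on
    failure (for example a bound socket.send). Hypothetical helper.
    """
    for attempt in range(1, max_retries + 1):
        try:
            send(data)
            return True
        except OSError as ex:
            if ex.errno == errno.EMSGSIZE:
                # Retrying the same oversized datagram can never succeed,
                # so log it and give up on this record immediately.
                logger.error("Dropping %d-byte record: %s", len(data), ex)
                return False
            # Other errors may be transient, so retry a bounded number of times.
            logger.warning("Send failed (attempt %d/%d): %s",
                           attempt, max_retries, ex)
    return False
```

Checking `ex.errno` against `errno.EMSGSIZE` rather than the literal 90 keeps the check portable, since the numeric value differs between operating systems.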

SpiralBlu commented May 31, 2024

I took up my own suggestion and added another line before the break to record the log that causes the issue:
self.logger.error("Log: {0}".format(data))

The offending log contains captured packet data, so its size is naturally that of a packet up to the full MTU plus the log data wrapped around it. We will provide it to Cisco with our case.
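
The earlier suggestion of diverting too-long logs to a separate file could be sketched as a size check before sending. This is only an illustration: the function name, the review file path, and the 1472-byte limit (a 1500-byte MTU minus a 20-byte IPv4 header and an 8-byte UDP header) are assumptions, and the size at which the kernel actually rejects a datagram depends on its fragmentation settings.

```python
# Assumed limit: 1500-byte MTU minus IPv4 header (20) and UDP header (8).
MAX_UDP_PAYLOAD = 1500 - 20 - 8


def fits_or_quarantine(record: bytes, review_path: str,
                       limit: int = MAX_UDP_PAYLOAD) -> bool:
    """Return True if the record fits within `limit` bytes.

    Otherwise append it to `review_path` for later inspection and
    return False so the caller can skip sending it. Hypothetical helper.
    """
    if len(record) <= limit:
        return True
    # Append in binary mode so the record is kept byte-for-byte.
    with open(review_path, "ab") as fh:
        fh.write(record + b"\n")
    return False
```

A caller would send the record only when this returns True, which keeps oversized records out of the send loop entirely while preserving them for the vendor case.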


SpiralBlu commented Jun 5, 2024

Determined that the oversized logs were "Intrusion Event Packet Data" records:
[image]
No errors in the encore logs since the change.


SpiralBlu commented Jun 7, 2024

The more I look at udp.py, the more I realize (after RHEL showed "established" UDP connections) that it is built like a TCP client around connect() rather than as a plain UDP datagram sender. This causes trouble on RHEL, because a connected socket is left open if the program stops abruptly.
UDP has no connection establishment or handshake, and no acknowledgements or retransmission (its only integrity check is a simple checksum). I think most of the errors I have encountered would be fixed by changing udp.py to use sendto() instead of connect().
I started to rebuild it, but the ties in the class construction still have me fussing with it.
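
For reference, a rough sketch of what an unconnected sender built on sendto() might look like. The class name `DatagramSender` and its methods are hypothetical; this is not the eNcore UDPStreamer and ignores how that class is wired into the rest of the client.

```python
import socket


class DatagramSender:
    """Hypothetical unconnected UDP sender that uses sendto()."""

    def __init__(self, host: str, port: int):
        self.address = (host, port)
        # No connect() call: the destination is passed with every datagram,
        # so the socket never appears as a connected UDP socket.
        self.sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

    def send(self, data: bytes) -> int:
        """Send one datagram and return the number of bytes handed to the kernel."""
        return self.sock.sendto(data, self.address)

    def close(self) -> None:
        self.sock.close()
```

Note that sendto() can still raise OSError with EMSGSIZE for a datagram that is too large, so the oversized-record handling would still be needed with this design.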
