Optimize Memory Usage and Performance #23

Closed
sdewitt-newrelic opened this issue Apr 18, 2024 · 0 comments · Fixed by #24
Labels
enhancement New feature or request

Comments

@sdewitt-newrelic
Contributor

Summary

Description from customer

After getting the monitoring solution to work in our infrastructure with a few event log file types, we added the full list of event log files we’re interested in. After doing so, the event log file monitoring process started crashing with OOMKilled (meaning that it exceeded its allocated memory on Kubernetes). At that point, the service was running with a maximum memory allocation of 2 GB of RAM, which was consumed roughly one minute after application startup.

After increasing the RAM allocation to 3 GB, I no longer saw OOMKilled errors; however, the pod would still crash with no exception message (corresponding Kubernetes event: Back-off restarting failed container).

Desired Behavior

Description from customer

The CSV parsing needs to have more memory awareness baked into it. For example, once the log object reaches X% of available memory, we should call the post_logs function to POST the logs to New Relic instead of waiting for the whole CSV file to be parsed.
Additionally, the log array needs to be cleared between iterations to release memory.
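A minimal sketch of that idea, assuming the exporter is Python and using a hypothetical MEMORY_THRESHOLD_PERCENT constant plus psutil to approximate the "X% of available memory" check; the real post_logs signature and parsing code may differ:

```python
# Hypothetical sketch: flush parsed rows to New Relic once memory use crosses
# a threshold, instead of holding the whole CSV in memory.
# `post_logs` stands in for the exporter's real posting function; its actual
# name and signature may differ.
import csv
import psutil

MEMORY_THRESHOLD_PERCENT = 80  # the "X%" from the description above (assumed value)

def process_event_log_csv(csv_lines, post_logs):
    """Parse CSV rows incrementally and flush them when memory gets tight."""
    logs = []
    for row in csv.DictReader(csv_lines):
        logs.append(row)
        if psutil.virtual_memory().percent >= MEMORY_THRESHOLD_PERCENT:
            post_logs(logs)   # POST the partial batch instead of waiting for EOF
            logs.clear()      # release the log array between iterations
    if logs:
        post_logs(logs)       # flush whatever is left at the end
```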

Possible solution:
The current solution doesn’t take into account how huge Salesforce event log files can be. As highlighted by Scott as well, it seems like we’re reading entire CSVs into memory, which potentially contributes to the OOM issues.
It also seems like download_response is downloading the whole event log file CSV directly into memory before we even call parse_csv. The parse_csv output then allocates additional memory for the very same event log file when it is stored in csv_rows. If I understand that correctly, it means we’re double-allocating memory for the same event log file. Lastly, we seem to execute download_response and parse_csv for all event log files supplied by the user prior to calling NewRelic.post_logs. I believe streaming the downloaded content and processing it in an end-to-end pipeline, as proposed by Scott, would help ease that.
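A hedged sketch of what that streaming pipeline could look like, using requests with stream=True and iter_lines so the CSV is never fully buffered; the session, file_url, post_logs, and BATCH_SIZE names are illustrative and not the exporter's actual API:

```python
# Hypothetical sketch of the streaming pipeline suggested above: rows flow from
# the HTTP response straight into batched post_logs calls, so the full CSV is
# never held in memory and there is no second copy in csv_rows.
import csv

BATCH_SIZE = 1000  # assumed batch size

def stream_event_log_file(session, file_url, post_logs):
    with session.get(file_url, stream=True) as response:
        response.raise_for_status()
        lines = response.iter_lines(decode_unicode=True)  # stream, don't buffer
        batch = []
        for row in csv.DictReader(lines):
            batch.append(row)
            if len(batch) >= BATCH_SIZE:
                post_logs(batch)  # POST each batch as soon as it is ready
                batch = []
        if batch:
            post_logs(batch)
```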

@sdewitt-newrelic sdewitt-newrelic added the enhancement New feature or request label Apr 18, 2024
@sdewitt-newrelic sdewitt-newrelic linked a pull request Apr 18, 2024 that will close this issue