If Kafka were to fail and events couldn't be produced, we currently end up with log messages, but we do not yet have a way to easily grab these messages from the logs and send them to Kafka.
A/C:
a procedure for resending failed events to the event bus
documentation (probably a runbook) on how to do this.
Implementation Details:
The idea for implementation is to create a Splunk or New Relic search that will isolate the failed events and allow us to download/export the details of these events.
Then we could run a script to resend those events to the event bus.
Have a strategy for dealing with API keys/permissions to send data to the event bus.
For testing: can we replicate a breakage in stage or another environment, and then test resending the events?
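As a rough illustration of the "export, then run a script" idea above, here is a minimal sketch. It assumes the search results are exported as JSON lines and that each record carries `status`, `topic`, `key`, `payload`, and `timestamp` fields. All of these names are hypothetical and would need to match whatever the real log messages contain. The `send` callable is a stand-in for the actual event bus producer, so credentials and permissions stay outside the script.

```python
import json
from typing import Callable, Dict, Iterable, Iterator, List


def parse_failed_events(lines: Iterable[str]) -> Iterator[Dict]:
    """Yield one dict per exported log line that records a failed produce.

    Assumption: the export is JSON lines, and the field names below are
    hypothetical placeholders for whatever the real log format uses.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        if record.get("status") == "produce_failed":
            yield {
                "topic": record["topic"],
                "key": record["key"],
                "payload": record["payload"],
                "timestamp": record["timestamp"],
            }


def resend(events: Iterable[Dict], send: Callable[[str, str, dict], None]) -> int:
    """Resend events oldest first and return how many were sent.

    Assumption: timestamps are ISO 8601 strings in the same timezone,
    so sorting them as strings gives chronological order.
    """
    ordered: List[Dict] = sorted(events, key=lambda e: e["timestamp"])
    for event in ordered:
        send(event["topic"], event["key"], event["payload"])
    return len(ordered)


if __name__ == "__main__":
    exported = [
        '{"status": "produce_failed", "topic": "t", "key": "k1", '
        '"payload": {"n": 1}, "timestamp": "2023-01-01T00:00:00Z"}',
    ]
    # Dry run: print instead of producing to the event bus.
    resend(parse_failed_events(exported), lambda t, k, p: print(t, k, p))
```

Passing `print` (or a logging function) as `send` gives a dry run, which may be useful when testing against stage before resending anything for real.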
We've discussed the fact that order matters, so this can't be as simple as re-sending failed events, because there might be successful events that need to be sent afterwards.
One idea is to cobble together scripts, how-to, and a management command that can take a batch of events to resend.
For lifecycle events (e.g. create, update, delete), a failed create event may require resending the create event and any following update or delete events.
Is this data we can recover from the data on the topic itself? Or would we just look up the data in the DB and send the final update event? The answer may change based on what is simplest and good enough for each event type.
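To make the ordering concern concrete, here is a small sketch of the "resend the failed event and everything after it" option for lifecycle events. It assumes we can assemble, per entity, a list of events with an ordering field and a flag for whether each one was delivered. The field names `entity_id`, `sequence`, and `delivered` are hypothetical, and where this history would come from (logs, the topic, or the DB) is exactly the open question above.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def events_to_replay(events: Iterable[Dict]) -> List[Dict]:
    """Return, per entity, the first undelivered event and all later ones.

    Assumption: each event dict has hypothetical keys `entity_id`,
    `sequence` (sortable, increasing over time), and `delivered` (bool).
    Entities with no failed events are skipped entirely.
    """
    by_entity = defaultdict(list)
    for event in events:
        by_entity[event["entity_id"]].append(event)

    replay: List[Dict] = []
    for history in by_entity.values():
        history = sorted(history, key=lambda e: e["sequence"])
        # Index of the first event that never reached the event bus.
        first_failed = next(
            (i for i, e in enumerate(history) if not e["delivered"]), None
        )
        if first_failed is not None:
            # Replay from the failure onward so consumers see events in order.
            replay.extend(history[first_failed:])
    return replay
```

Note that this resends later events that were already delivered, so it only works if consumers tolerate duplicates. Sending just a final update built from the DB would avoid that, at the cost of losing the intermediate events.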
Note: This ticket is meant to be a quick fix (if there is one) to hold over before investing in the outbox pattern, which is ticketed separately. See: openedx/openedx-events#251.