fix(relay/client): error handling for RPCs and replacing errgroup with sync.WaitGroup #55

Merged
merged 1 commit into from
May 22, 2024

Contributor

@Prateeknandle Prateeknandle commented May 21, 2024

  1. Refactors the connectToKubeArmor function to manage its goroutines with sync.WaitGroup instead of errgroup.Group, and adds a mechanism that stops all goroutines when any one of them encounters an error.

  2. Introduces a shared stop channel that signals every goroutine to stop when any one of them encounters an error.

  3. Adds a buffered errCh channel that propagates errors from the goroutines back to connectToKubeArmor.

  4. Implements error propagation in each watcher function: a goroutine that encounters an error sends it to errCh and returns.

  5. Updates connectToKubeArmor to listen on errCh and, when an error arrives, close the stop channel so that all goroutines stop.

  6. Ensures that connectToKubeArmor waits for all goroutines to finish before destroying the client when no error occurs.

Tested manually by scheduling an error. Once the error hit, the other goroutines ended and a new connection was made to the RPCs. Logs and alerts were also checked after the error and work as expected.

Logs showing that reconnection happens:

{"level":"info","ts":"2024-05-21 19:14:42.122527","caller":"log/logger.go:54","msg":"Checked the liveness of KubeArmor's gRPC service (192.168.29.12:32767)"}
{"level":"info","ts":"2024-05-21 19:14:42.122699","caller":"log/logger.go:49","msg":"Started to watch messages from 192.168.29.12:32767"}
{"level":"info","ts":"2024-05-21 19:14:42.122813","caller":"log/logger.go:49","msg":"Started to watch alerts from 192.168.29.12:32767"}
{"level":"info","ts":"2024-05-21 19:14:42.122896","caller":"log/logger.go:49","msg":"Started to watch logs from 192.168.29.12:32767"}

{"level":"warn","ts":"2024-05-21 19:17:12.151194","caller":"log/logger.go:79","msg":"failed to receive a log (192.168.29.12:32767) rpc error: code = Unknown desc = Just dying after 150 seconds"}
// error in watchlogs
{"level":"info","ts":"2024-05-21 19:17:12.151546","caller":"log/logger.go:54","msg":"Destroyed the client (192.168.29.12:32767)"}
{"level":"info","ts":"2024-05-21 19:17:12.527710","caller":"log/logger.go:54","msg":"Checked the liveness of KubeArmor's gRPC service (192.168.29.12:32767)"}
{"level":"info","ts":"2024-05-21 19:17:12.527820","caller":"log/logger.go:49","msg":"Started to watch messages from 192.168.29.12:32767"}
{"level":"info","ts":"2024-05-21 19:17:12.527858","caller":"log/logger.go:49","msg":"Started to watch alerts from 192.168.29.12:32767"}
{"level":"info","ts":"2024-05-21 19:17:12.527886","caller":"log/logger.go:49","msg":"Started to watch logs from 192.168.29.12:32767"}

Signed-off-by: Prateek Nandle <prateeknandle@gmail.com>
@DelusionalOptimist DelusionalOptimist merged commit 65910d6 into kubearmor:main May 22, 2024
6 checks passed