Skip to content

Commit

Permalink
[cmd/opampsupervisor] Fix memory leak and enable goleak (#30496)
Browse files Browse the repository at this point in the history
**Description:** <Describe what has changed.>
<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->

`NewSupervisor` starts two goroutines, one for the health check ticker
and another running the agent process. Each was causing a leak. To
resolve the health check ticker goroutine leak a `Stop` call has been
added to it during shutdown. To resolve the agent process leak a wait
group has been added (to make sure shutdown blocks until the agent
process is completely shutdown), and a `break` statement has been
changed to `return`. The `break` statement was only breaking out of the
`select` statement, but since it's nested within a `for` loop, the loop
just kept going instead of completing successfully.


This also enables `goleak` checks for `cmd/opampsupervisor` and
`cmd/opampsupervisor/supervisor` packages

**Link to tracking Issue:** <Issue number if applicable>
#30438

**Testing:** <Describe what testing was performed and which tests were
added.>
Existing tests and new `goleak` checks are passing.
  • Loading branch information
crobert-1 authored Jan 29, 2024
1 parent 7696f21 commit 5a3a3ea
Show file tree
Hide file tree
Showing 3 changed files with 56 additions and 2 deletions.
27 changes: 27 additions & 0 deletions .chloggen/goleak_opampsupervisor.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,27 @@
# Use this changelog template to create an entry for release notes.

# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix'
change_type: bug_fix

# The name of the component, or a single word describing the area of concern, (e.g. filelogreceiver)
component: cmd/opampsupervisor

# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`).
note: Fix memory leak on shutdown

# Mandatory: One or more tracking issues related to the change. You can use the PR number here if no issue exists.
issues: [30438]

# (Optional) One or more lines of additional information to render under the primary note.
# These lines will be padded with 2 spaces and then inserted directly into the document.
# Use pipe (|) for multiline entries.
subtext:

# If your change doesn't affect end users or the exported elements of any package,
# you should instead start your pull request title with [chore] or use the "Skip Changelog" label.
# Optional: The change log or logs in which this entry should be included.
# e.g. '[user]' or '[user, api]'
# Include 'user' if the change is relevant to end users.
# Include 'api' if there is a change to a library API.
# Default: '[user]'
change_logs: []
14 changes: 14 additions & 0 deletions cmd/opampsupervisor/package_test.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,14 @@
// Copyright The OpenTelemetry Authors
// SPDX-License-Identifier: Apache-2.0

package main

import (
"testing"

"go.uber.org/goleak"
)

func TestMain(m *testing.M) {
goleak.VerifyTestMain(m)
}
17 changes: 15 additions & 2 deletions cmd/opampsupervisor/supervisor/supervisor.go
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,7 @@ import (
"net/http"
"os"
"sort"
"sync"
"sync/atomic"
"text/template"
"time"
Expand Down Expand Up @@ -100,6 +101,7 @@ type Supervisor struct {
opampClient client.OpAMPClient

shuttingDown bool
supervisorWG sync.WaitGroup

agentHasStarted bool
agentStartHealthCheckAttempts int
Expand Down Expand Up @@ -160,7 +162,12 @@ func NewSupervisor(logger *zap.Logger, configFile string) (*Supervisor, error) {
}

s.startHealthCheckTicker()
go s.runAgentProcess()

s.supervisorWG.Add(1)
go func() {
defer s.supervisorWG.Done()
s.runAgentProcess()
}()

return s, nil
}
Expand Down Expand Up @@ -698,7 +705,7 @@ func (s *Supervisor) runAgentProcess() {

case <-s.commander.Done():
if s.shuttingDown {
break
return
}

s.logger.Debug("Agent process exited unexpectedly. Will restart in a bit...", zap.Int("pid", s.commander.Pid()), zap.Int("exit_code", s.commander.ExitCode()))
Expand Down Expand Up @@ -782,6 +789,12 @@ func (s *Supervisor) Shutdown() {
s.logger.Error("Could not stop the OpAMP client", zap.Error(err))
}
}

if s.healthCheckTicker != nil {
s.healthCheckTicker.Stop()
}

s.supervisorWG.Wait()
}

func (s *Supervisor) onMessage(ctx context.Context, msg *types.MessageData) {
Expand Down

0 comments on commit 5a3a3ea

Please sign in to comment.