Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

move proxy metrics to a separate server #352

Merged
merged 18 commits into from
Oct 17, 2023
Merged
Show file tree
Hide file tree
Changes from 5 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 4 additions & 1 deletion cmd/main.go
Original file line number Diff line number Diff line change
Expand Up @@ -112,6 +112,9 @@ func main() {
informerShutdown <- struct{}{}
})

// Start metrics server
metricsSrv := metrics.StartMetricsServer()

srv := server.New(app)

err = srv.SetupRoutes()
Expand Down Expand Up @@ -141,7 +144,7 @@ func main() {
}
}()

gracefulShutdown(configuration.GracefulTimeout, srv.HTTPServer(), proxySrv)
gracefulShutdown(configuration.GracefulTimeout, srv.HTTPServer(), proxySrv, metricsSrv)
}

func gracefulShutdown(timeout time.Duration, hs ...*http.Server) {
Expand Down
19 changes: 19 additions & 0 deletions deploy/registration-service.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -98,6 +98,7 @@ objects:
ports:
- containerPort: 8080
- containerPort: 8081
- containerPort: 8082
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file is used only in dev environments. The one which is used in production (and e2e tests) are defined here: https://github.com/codeready-toolchain/host-operator/blob/master/deploy/registration-service/registration-service.yaml
Please create a PR to update that file too.

But what I'm wonder is why the e2e-test still pass? Do we have any e2e tests for this metric service? Without updating https://github.com/codeready-toolchain/host-operator/blob/master/deploy/registration-service/registration-service.yaml the e2e-tests for this PR should fail...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could we also add names to the container ports (see for example https://kubernetes.io/docs/concepts/services-networking/service/#field-spec-ports) to make them more explicit, and reuse these names in the services as well?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file is used only in dev environments.

It's actually not used even in dev environment. This file is a legacy thing from the time when we were able to deploy reg-service on its own - we don't do that anymore (or at least I'm not aware of anyone who would be doing that)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but anyway, Alexey is right that you need to create PR in host-repo with this change. Let's do it in two steps:

  1. create PR in host-operator repo changing only the values in the template (I don't think that we verify the actual content of the deployment in e2e tests)
  2. when the host-operator PR is merged, create a PR in e2e tests that will be paired with this reg-service one that will verify that the Service is created.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the late reply,
@MatousJobanek @alexeykazakov Thanks for your comments and catching the missing host-repo changes. I did have the changes in host-repo, that's how i was able to test it locally indeed, but had missed creating the PR on host-repo. I've updated the host-operator and e2e PR in the description.

command:
- registration-service
imagePullPolicy: IfNotPresent
Expand Down Expand Up @@ -214,6 +215,24 @@ objects:
tls:
termination: edge
wildcardPolicy: None
- kind: Service
apiVersion: v1
metadata:
name: proxy-metrics-service
namespace: ${NAMESPACE}
labels:
provider: codeready-toolchain
run: registration-service
spec:
ports:
- name: "8082"
protocol: TCP
port: 80
targetPort: 8082
selector:
run: registration-service
type: ClusterIP
sessionAffinity: null
parameters:
- name: NAMESPACE
value: 'toolchain-host-operator'
Expand Down
34 changes: 34 additions & 0 deletions pkg/metrics/metrics.go
Original file line number Diff line number Diff line change
@@ -1,8 +1,13 @@
package metrics

import (
"github.com/labstack/echo/v4"
glog "github.com/labstack/gommon/log"
"github.com/prometheus/client_golang/prometheus"
"github.com/prometheus/client_golang/prometheus/promhttp"
"net/http"
logf "sigs.k8s.io/controller-runtime/pkg/log"
"time"
)

var log = logf.Log.WithName("registration_metrics")
Expand All @@ -12,6 +17,7 @@
MetricLabelRejected = "Rejected"
MetricsLabelVerbGet = "Get"
MetricsLabelVerbList = "List"
MetricsPort = "8082"
)

// histogram with labels
Expand Down Expand Up @@ -58,9 +64,37 @@
// RegisterCustomMetrics registers the custom metrics
func RegisterCustomMetrics() {
Reg = prometheus.NewRegistry()
if len(allHistogramVecs) == 0 {
log.Info("No Histograms to register")
}

Check warning on line 69 in pkg/metrics/metrics.go

View check run for this annotation

Codecov / codecov/patch

pkg/metrics/metrics.go#L68-L69

Added lines #L68 - L69 were not covered by tests
// register metrics
for _, v := range allHistogramVecs {
Reg.MustRegister(v)
}
log.Info("custom metrics registered")
}

//nolint:unparam
func prometheusHandler(ctx echo.Context) error {
h := promhttp.HandlerFor(Reg, promhttp.HandlerOpts{DisableCompression: true, Registry: Reg})
h.ServeHTTP(ctx.Response().Writer, ctx.Request())
return nil
}

func StartMetricsServer() *http.Server {
// start server
router := echo.New()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

router looks like a copy-paste zombie from the proxy code :-)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

router.Logger.SetLevel(glog.INFO)
router.GET("/metrics", prometheusHandler)

log.Info("Starting the Registration-Service Metrics server...")
srv := &http.Server{Addr: ":" + MetricsPort, Handler: router, ReadHeaderTimeout: 2 * time.Second}
// listen concurrently to allow for graceful shutdown
go func() {
if err := srv.ListenAndServe(); err != nil {
log.Error(err, err.Error())
}

Check warning on line 96 in pkg/metrics/metrics.go

View check run for this annotation

Codecov / codecov/patch

pkg/metrics/metrics.go#L84-L96

Added lines #L84 - L96 were not covered by tests
}()

return srv

Check warning on line 99 in pkg/metrics/metrics.go

View check run for this annotation

Codecov / codecov/patch

pkg/metrics/metrics.go#L99

Added line #L99 was not covered by tests
}
28 changes: 28 additions & 0 deletions pkg/metrics/metrics_test.go
Original file line number Diff line number Diff line change
@@ -1,10 +1,13 @@
package metrics

import (
"github.com/labstack/echo/v4"
promtestutil "github.com/prometheus/client_golang/prometheus/testutil"
clientmodel "github.com/prometheus/client_model/go"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
"net/http"
"net/http/httptest"
"strings"
"testing"
"time"
Expand Down Expand Up @@ -64,6 +67,31 @@ func TestRegisterCustomMetrics(t *testing.T) {
}
}

func TestMetricsHandler(t *testing.T) {
// Create a request to pass to our handler. We don't have any query parameters, so we'll
// pass 'nil' as the third parameter.
req, err := http.NewRequest(http.MethodGet, "/metrics", nil)
require.NoError(t, err)

// Create handler instance.
RegisterCustomMetrics()

t.Run("valid metrics json", func(t *testing.T) {
// We create a ResponseRecorder (which satisfies http.ResponseWriter) to record the response.
e := echo.New()
rec := httptest.NewRecorder()
ctx := e.NewContext(req, rec)

//when
err := prometheusHandler(ctx)
require.NoError(t, err)
// then
// check the status code is what we expect, and the content-type
require.Equal(t, http.StatusOK, rec.Code)
require.Equal(t, "text/plain; version=0.0.4; charset=utf-8", rec.Header().Get("Content-Type"))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does it make sense to verify that the body also contains the actual registered metric? you know, just check that the name of the metric is there, no need to verify the actual values...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed

})
}

var expectedResponseMetadata = `
# HELP sandbox_test_histogram_vec test histogram description
# TYPE sandbox_test_histogram_vec histogram`
Expand Down
20 changes: 0 additions & 20 deletions pkg/proxy/handlers/metrics.go

This file was deleted.

37 changes: 0 additions & 37 deletions pkg/proxy/handlers/metrics_test.go

This file was deleted.

7 changes: 1 addition & 6 deletions pkg/proxy/proxy.go
Original file line number Diff line number Diff line change
Expand Up @@ -45,16 +45,14 @@ const (
bearerProtocolPrefix = "base64url.bearer.authorization.k8s.io." //nolint:gosec

proxyHealthEndpoint = "/proxyhealth"

pluginsEndpoint = "/plugins/"
pluginsEndpoint = "/plugins/"
)

type Proxy struct {
app application.Application
cl client.Client
tokenParser *auth.TokenParser
spaceLister *handlers.SpaceLister
metrics *handlers.Metrics
}

func NewProxy(app application.Application) (*Proxy, error) {
Expand All @@ -77,13 +75,11 @@ func newProxyWithClusterClient(app application.Application, cln client.Client) (

// init handlers
spaceLister := handlers.NewSpaceLister(app)
metrics := handlers.NewMetrics()
return &Proxy{
app: app,
cl: cln,
tokenParser: tokenParser,
spaceLister: spaceLister,
metrics: metrics,
}, nil
}

Expand Down Expand Up @@ -132,7 +128,6 @@ func (p *Proxy) StartProxy() *http.Server {
wg.GET("/:workspace", p.spaceLister.HandleSpaceGetRequest)
wg.GET("", p.spaceLister.HandleSpaceListRequest)
router.GET(proxyHealthEndpoint, p.health)
router.GET("/metrics", p.metrics.PrometheusHandler)
router.Any("/*", p.handleRequestAndRedirect)

// Insert the CORS preflight middleware
Expand Down
31 changes: 16 additions & 15 deletions pkg/proxy/proxy_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -120,21 +120,6 @@ func (s *TestProxySuite) TestProxy() {
s.assertResponseBody(resp, `{"alive": true}`)
})

s.Run("check metrics not ok", func() {
req, err := http.NewRequest("GET", "http://localhost:8081/metrics", nil)
require.NoError(s.T(), err)
require.NotNil(s.T(), req)

// when
resp, err := http.DefaultClient.Do(req)

// then
require.NoError(s.T(), err)
require.NotNil(s.T(), resp)
defer resp.Body.Close()
assert.Equal(s.T(), 401, resp.StatusCode)
})

s.Run("plain http error", func() {
s.Run("unauthorized if no token present", func() {
req, err := http.NewRequest("GET", "http://localhost:8081/api/mycoolworkspace/pods", nil)
Expand Down Expand Up @@ -219,6 +204,22 @@ func (s *TestProxySuite) TestProxy() {
assert.Equal(s.T(), http.StatusInternalServerError, resp.StatusCode)
s.assertResponseBody(resp, "unable to get target cluster: some-error")
})

s.Run("internal error if accessing incorrect url", func() {
// given
req := s.request()
req.URL.Path = "http://localhost:8081/metrics"
require.NotNil(s.T(), req)

// when
resp, err := http.DefaultClient.Do(req)

// then
require.NoError(s.T(), err)
require.NotNil(s.T(), resp)
defer resp.Body.Close()
assert.Equal(s.T(), http.StatusInternalServerError, resp.StatusCode)
})
})

s.Run("websockets error", func() {
Expand Down