-
Notifications
You must be signed in to change notification settings - Fork 31
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added option to use custom TLS certificates instead of ACME #17
Conversation
bed2fb7
to
a3d4a76
Compare
This looks amazing, I was also wondering if it would be possible to define an endpoint to pull certificate and key files from a local endpoint? At @ubicloud, we provide certificates in a local endpoint and customer is requested to pull these files and make use of them. I can also try to create a PR if you prefer. |
Do you mean like a local CA, or simply a secrets manager that returns the same certificate on every call? Edit: Found, https://www.ubicloud.com/docs/quick-start/using-kamal-with-ubicloud I think with the proposed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for looking at this, @kpumuk! I agree it's something we need. I'd actually been planning to add it myself soon :)
I think the general approach here looks good. Agree that adding tests is a good idea. I've also left a few comments about some specifics. If you want to address those, great. I'd also be happy to take it from here and finish it up if you prefer, as I have some other (unrelated) changes to make around the TLS handling that will probably touch on this too.
internal/cmd/deploy.go
Outdated
tlsCertificatePath string | ||
tlsPrivateKeyPath string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can use the members of the ServiceOptions
directly, so don't need to add these here. Similar to how we're handling the ErrorPagePath
.
(The other TLS variables here are because they don't exactly match the format that's in the ServiceOptions
struct, so they get handled a bit differently).
internal/cmd/deploy.go
Outdated
@@ -33,6 +35,8 @@ func newDeployCommand() *deployCommand { | |||
|
|||
deployCommand.cmd.Flags().BoolVar(&deployCommand.tls, "tls", false, "Configure TLS for this target (requires a non-empty host)") | |||
deployCommand.cmd.Flags().BoolVar(&deployCommand.tlsStaging, "tls-staging", false, "Use Let's Encrypt staging environment for certificate provisioning") | |||
deployCommand.cmd.Flags().StringVar(&deployCommand.tlsCertificatePath, "tls-certificate-path", "", "") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So as above, these can refer to the struct fields like:
&deployCommand.args.ServiceOptions.TLSCertificatePath
internal/cmd/deploy.go
Outdated
if cmd.Flags().Changed("tls-certificate-path") && !cmd.Flags().Changed("tls-private-key-path") { | ||
return fmt.Errorf("tls-private-key-path must be set when specified tls-certificate-path") | ||
} | ||
|
||
if cmd.Flags().Changed("tls-private-key-path") && !cmd.Flags().Changed("tls-certificate-path") { | ||
return fmt.Errorf("tls-certificate-path must be set when specified tls-private-key-path") | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could use MarkFlagsRequiredTogether here instead of checking them manually, as it's a case that Cobra can handle. (Similar to how we're using MarkFlagsOneRequired
here).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I initially tried MarkFlagsRequiredTogether
, but the error was looking pretty weird:
Error: if any flags in the group [tls-certificate-path tls-private-key-path] are set they must all be set; missing [tls-private-key-path]
Question of usability?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated the branch
"tls-private-key-path", m.tlsPrivateKeyFilePath, | ||
) | ||
|
||
cert, err := tls.LoadX509KeyPair(m.tlsCertificateFilePath, m.tlsPrivateKeyFilePath) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will be called for every TLS handshake, which means the certificate files will be loaded multiple times. Better to load the certificate when the StaticCertManager
is established, and reuse it for the lifetime of the deployment.
That would also let us catch a loading error early on, and fail the deployment at that point, which would be safer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would require more changes, as we right now do not expect createCertManager()
to ever return an error.
Another option would be to cache the certificate on first use:
diff --git a/internal/server/cert.go b/internal/server/cert.go
index bb1580d..afaf740 100644
--- a/internal/server/cert.go
+++ b/internal/server/cert.go
@@ -9,9 +9,11 @@ type CertManager interface {
GetCertificate(hello *tls.ClientHelloInfo) (*tls.Certificate, error)
}
+// StaticCertManager is a certificate manager that loads certificates from disk.
type StaticCertManager struct {
tlsCertificateFilePath string
tlsPrivateKeyFilePath string
+ cert *tls.Certificate
}
func NewStaticCertManager(tlsCertificateFilePath, tlsPrivateKeyFilePath string) *StaticCertManager {
@@ -22,6 +24,10 @@ func NewStaticCertManager(tlsCertificateFilePath, tlsPrivateKeyFilePath string)
}
func (m *StaticCertManager) GetCertificate(*tls.ClientHelloInfo) (*tls.Certificate, error) {
+ if m.cert != nil {
+ return m.cert, nil
+ }
+
slog.Info(
"Loading custom TLS certificate",
"tls-certificate-path", m.tlsCertificateFilePath,
@@ -32,6 +38,7 @@ func (m *StaticCertManager) GetCertificate(*tls.ClientHelloInfo) (*tls.Certifica
if err != nil {
return nil, err
}
+ m.cert = &cert
- return &cert, nil
+ return m.cert, nil
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Applied the patch and added a test. Let me know what you think
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking about this, might need a mutex since there might be multiple procs loading certs at the same time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Confirmed, will work on a fix:
manager := NewStaticCertManager(certPath, keyPath)
go func() {
manager.GetCertificate(&tls.ClientHelloInfo{})
}()
cert, err := manager.GetCertificate(&tls.ClientHelloInfo{})
Output of go test ./... -race
:
WARNING: DATA RACE
Write at 0x00c0001e8e00 by goroutine 23:
github.com/basecamp/kamal-proxy/internal/server.(*StaticCertManager).GetCertificate()
/Users/dmytro/work/github/kamal-proxy/internal/server/cert.go:41 +0x254
github.com/basecamp/kamal-proxy/internal/server.TestCertificateLoadingRaceCondition.func1()
/Users/dmytro/work/github/kamal-proxy/internal/server/cert_test.go:49 +0x7c
Previous write at 0x00c0001e8e00 by goroutine 22:
github.com/basecamp/kamal-proxy/internal/server.(*StaticCertManager).GetCertificate()
/Users/dmytro/work/github/kamal-proxy/internal/server/cert.go:41 +0x254
github.com/basecamp/kamal-proxy/internal/server.TestCertificateLoadingRaceCondition()
/Users/dmytro/work/github/kamal-proxy/internal/server/cert_test.go:51 +0x270
testing.tRunner()
/Users/dmytro/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.1.darwin-arm64/src/testing/testing.go:1690 +0x184
testing.(*T).Run.gowrap1()
/Users/dmytro/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.1.darwin-arm64/src/testing/testing.go:1743 +0x40
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Introduced an RWMutex
in f1df35f to address it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another option to avoid the mutex would be to attempt the file load in the NewStaticCertManager
, and then save both the error and the response in the struct. GetCertificate
would just return both, avoiding a delayed load, the need for mutex, etc.
This design might limit the options for (potential) future changes, like reloading the certificate on file change.
@kpumuk yes, I was specifically asking to eliminate the step https://www.ubicloud.com/docs/quick-start/using-kamal-with-ubicloud#step-9 :) It would make life much easier. It makes sense that this is out of scope for this PR. I'll work on it and hopefully it's something useful to the community. |
I don't think the proxy should be responsible for this. Using a Kamal hook seems like a better option to me. Or, if it turned out to be a common enough requirement, then perhaps it could even be a feature on the Kamal side. But I'd rather the proxy not be involved with gathering artefacts like this; they should be supplied to it wherever possible. |
daec437
to
8eb3379
Compare
@kevinmcconnell seems like the documentation change was merged ahead of the functional change :-) Do you want to change the initialization interface in this MR to return the error, or keep the mutex and refactor later? Edit: might have been merged after this thread. |
Benchmark WITHOUT mutex:
Benchmark WITH mutex:
So the only concern is about ergonomics — when the error is shown for inaccessible cert files - during deploy or during the first request. |
… test to confirm it addressed
…TP-01 challenges was added
@kevinmcconnell I have rebased the branch on top of recent changes, which required to add This requires two changes on
What do you think? Any ideas on how to simplify the setup? Edit: Realized that proxy is shared and so might not be the brightest idea if every service mounted a volume .into the container... Edit 2: Another option is to copy certs directly to the kamal-proxy config volume: #!/bin/sh
set -euo pipefail
KAMAL_PROXY_TLS_CERT=$(op read "op://Private/Kamal Demo/cert.pem")
KAMAL_PROXY_TLS_PRIVATE_KEY=$(op read "op://Private/Kamal Demo/key.pem")
for ip in ${KAMAL_HOSTS//,/ }; do
ssh -q -T -o BatchMode=yes ubuntu@"${ip}" bash --noprofile <<-EOF
docker run --rm --volume kamal-proxy-config:/home/kamal-proxy/.config/kamal-proxy ubuntu:noble-20240801 sh -c "
mkdir -p /home/kamal-proxy/.config/kamal-proxy/certs
echo '${KAMAL_PROXY_TLS_CERT}' > /home/kamal-proxy/.config/kamal-proxy/certs/cert.pem
echo '${KAMAL_PROXY_TLS_PRIVATE_KEY}' > /home/kamal-proxy/.config/kamal-proxy/certs/key.pem
"
EOF
done This can be used to improve ergonomics of the command, and in addition to low-level API to pass keys directly, Kamal can do something like: proxy:
ssl: true
ssl_certificate: <%= KAMAL_PROXY_TLS_CERT %>
ssl_private_key: <%= KAMAL_PROXY_TLS_PRIVATE_KEY %> This way the values can come from a secret manager, and Kamal can store them in |
This all looks great, @kpumuk! I have an idea in mind for how to handle the errors earlier in the deploy process as we discussed, but that also has a few knock-on effects I want to consider. So I'll get this PR merged first as-is, and then I'll explore that separately. It doesn't need to hold up this feature.
Yes, agreed. Using the existing mounted volume makes sense, and we can have Kamal scope it to the service name to avoid conflicts. (We'll need to do something similar for the |
Thanks for your work on this, @kpumuk! 🙏 |
I've reverted that documentation change for now. We'd need to make it differently anyway as configuration docs are generated from |
In some cases it is needed to use a custom TLS certificate:
With Traefic we had an option to specify a custom certificate, and so this change brings this feature into kamal-proxy.
PS. I am still looking into adding tests, wanted to gauge opinions whether this is a feature you are interested in.covered by tests now
Related kamal changes