Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Host cluster-local-domain-tls on cluster-local gateway with SNI #1228

Merged
merged 2 commits into from
Feb 6, 2024

Conversation

ReToCode
Copy link
Member

@ReToCode ReToCode commented Dec 19, 2023

Changes

  • Add cluster-local https servers when cluster-local-domain-tls is enabled and a KSVC exists
  • Uses SNI to host multiple certificates for all cluster-local domain Knative Services

/kind enhancement

Fixes knative/serving#14374

Release Note

net-istio now hosts an additional gateway with TLS for each Knative Service when `cluster-local-domain-tls` is enabled. Note: this is an experimental alpha-feature.

Docs

Will be done once the features is complete

@knative-prow knative-prow bot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. kind/enhancement do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. labels Dec 19, 2023
Copy link

knative-prow bot commented Dec 19, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ReToCode

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@knative-prow knative-prow bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Dec 19, 2023
@knative-prow knative-prow bot requested review from KauzClay and nak3 December 19, 2023 13:10
@knative-prow knative-prow bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Dec 19, 2023
Copy link

codecov bot commented Dec 19, 2023

Codecov Report

Attention: 22 lines in your changes are missing coverage. Please review.

Comparison is base (fadb996) 81.84% compared to head (4cbe505) 81.12%.
Report is 18 commits behind head on main.

Files Patch % Lines
pkg/reconciler/ingress/ingress.go 60.86% 10 Missing and 8 partials ⚠️
pkg/reconciler/ingress/resources/gateway.go 84.61% 4 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1228      +/-   ##
==========================================
- Coverage   81.84%   81.12%   -0.72%     
==========================================
  Files          19       19              
  Lines        1680     1706      +26     
==========================================
+ Hits         1375     1384       +9     
- Misses        218      230      +12     
- Partials       87       92       +5     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@ReToCode ReToCode changed the title [wip] Host cluster-local-domain TLS on local listener with SNI [wip] Host cluster-local-domain-tls on cluster-local gateway with SNI Dec 19, 2023
@knative-prow knative-prow bot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Dec 20, 2023
@ReToCode
Copy link
Member Author

Did some testing with Serving @ knative/serving#14610, looks good. Test setup here.

@skonto PTAL. Note: I'm fine with waiting until Serving is merged, then we can add this here and also do the full test-set with with net-istio in Serving.

@ReToCode ReToCode changed the title [wip] Host cluster-local-domain-tls on cluster-local gateway with SNI Host cluster-local-domain-tls on cluster-local gateway with SNI Dec 21, 2023
@knative-prow knative-prow bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 21, 2023
@ReToCode
Copy link
Member Author

/unhold

Serving PR has been merged, @skonto PTAL.

@knative-prow knative-prow bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 26, 2024
@skonto skonto requested review from skonto and removed request for nak3 January 31, 2024 11:56
@@ -55,3 +55,6 @@ spec:
- name: http2
port: 80
targetPort: 8081
- name: https
port: 443
targetPort: 8443
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To confirm my understanding - the two istio gateways we have actually use the same envoy instance but we use different ports for tenancy

For external ingress we open

  • port 80
  • port 443 dynamically when we have TLS certs

For internal traffic we have

  • port 8081 (this service exposes it on port 80)
  • port 8443 (this service exposes it on port 443)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see 443 server being added here for external traffic:

Name: fmt.Sprintf(portNamePrefix(ing.GetNamespace(), ing.GetName())+":%d", i),
Number: 443,
Protocol: "HTTPS",

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But I don't see us adding TLS servers on port 8443 for cluster local traffic

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately istio is quite confusing in that config:

istioctl proxy-config listeners -n istio-system deploy/istio-ingressgateway
ADDRESSES PORT  MATCH                                                                            DESTINATION
0.0.0.0   8080  ALL                                                                              Route: http.8080
0.0.0.0   8081  ALL                                                                              Route: http.8081
0.0.0.0   8443  SNI: helloworld.second,helloworld.second.svc,helloworld.second.svc.cluster.local Route: https.443.second/helloworld:0.helloworld-975966116.second
0.0.0.0   8443  SNI: helloworld.second.172.17.0.100.sslip.io                                     Route: https.443.second/helloworld:0.helloworld-3797421420.second
0.0.0.0   8443  SNI: helloworld.first,helloworld.first.svc,helloworld.first.svc.cluster.local    Route: https.443.first/helloworld:0.helloworld-975966116.first
0.0.0.0   8443  SNI: helloworld.first.172.17.0.100.sslip.io                                      Route: https.443.first/helloworld:0.helloworld-3797421420.first
0.0.0.0   15021 ALL                                                                              Inline Route: /healthz/ready*
0.0.0.0   15090 ALL                                                                              Inline Route: /stats/prometheus*

So basically, everything https will be on 8443 and routed to 443 to the Knative Service.

  • I'm not even sure if we would need to specify the port or not, but that 443 just maps to the containers 8443 port.

Copy link
Member Author

@ReToCode ReToCode Feb 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume we could use a completely different port for the internal TLS traffic, but not sure what we would gain. As you said, it's a shared component without tenancy anyway.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's using tenancy via port isolation so external traffic cannot hit the 'local' gateway ports.

Essentially these ports are public - so we should use a different port for internal tls - otherwise we risk exposing private Knative Services to the internet.

- name: status-port
port: 15021
protocol: TCP
targetPort: 15021
- name: http2
port: 80
protocol: TCP
targetPort: 8080
- name: https
port: 443
protocol: TCP
targetPort: 8443
selector:
app: istio-ingressgateway
istio: ingressgateway
type: LoadBalancer

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this can be configurable after this lands - #1249

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Essentially these ports are public - so we should use a different port for internal tls - otherwise we risk exposing private Knative Services to the internet.

That's a valid point. I'll take a look tomorrow.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated the PR to use a dedicated port for cluster-local-tls (8444). With that I get the following envoy config:

istioctl proxy-config listeners -n istio-system deploy/istio-ingressgateway
ADDRESSES PORT  MATCH                                                                               DESTINATION
0.0.0.0   8080  ALL                                                                                 Route: http.8080
0.0.0.0   8081  ALL                                                                                 Route: http.8081
0.0.0.0   8443  SNI: helloworld.default.172.17.0.100.sslip.io                                       Route: https.443.default/helloworld:0.helloworld-3797421420.default
0.0.0.0   8444  SNI: helloworld.default,helloworld.default.svc,helloworld.default.svc.cluster.local Route: https.8444.default/helloworld:0.helloworld-975966116.default
0.0.0.0   15021 ALL                                                                                 Inline Route: /healthz/ready*
0.0.0.0   15090 ALL                                                                                 Inline Route: /stats/prometheus*

As 8444 is not exposed, it is now only available cluster locally.

@skonto
Copy link
Contributor

skonto commented Feb 2, 2024

As discussed with Reto the landing page of this repo is not ideal. There is no doc explaining what is supported and how. For example here we are moving to a design choice of having one gateway per svc for cluster local stuff but it is not captured somewhere etc. I think we need to spend some time adding content here but also to other net-* repos.
It is kind of hard for newcomers to understand what is happening.

@@ -57,10 +57,9 @@ import (
)

const (
virtualServiceConditionReconciled = "Reconciled"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 this is a relic of some functionality never fixed at the istio side.

}

func (r *Reconciler) reconcileDeletion(ctx context.Context, ing *v1alpha1.Ingress) error {
if !shouldReconcileTLS(ing) {
func (r *Reconciler) cleanupCertificateSecrets(ctx context.Context, ing *v1alpha1.Ingress) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe deleteAllDomainCertificateSecrets.

ingressGateways := []*v1beta1.Gateway{}
if shouldReconcileTLS(ing) {
originSecrets, err := resources.GetSecrets(ing, r.secretLister)
externalIngressGateways := []*v1beta1.Gateway{}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

var externalIngressGateways []*v1beta1.Gateway or we want an empty slice instead?

@skonto
Copy link
Contributor

skonto commented Feb 2, 2024

minor comments, LGTM. We can stamp if @dprotaso has no other comments.

@dprotaso
Copy link
Contributor

dprotaso commented Feb 6, 2024

/lgtm

As discussed with Reto the landing page of this repo is not ideal. There is no doc explaining what is supported and how. For example here we are moving to a design choice of having one gateway per svc for cluster local stuff but it is not captured somewhere etc. I think we need to spend some time adding content here but also to other net-* repos.
It is kind of hard for newcomers to understand what is happening.

Yeah that's a good point - want to create an issue to capture this work?

@knative-prow knative-prow bot added the lgtm Indicates that a PR is ready to be merged. label Feb 6, 2024
@knative-prow knative-prow bot merged commit 83db165 into knative-extensions:main Feb 6, 2024
48 checks passed
ReToCode added a commit to ReToCode/net-istio that referenced this pull request Feb 7, 2024
…ive-extensions#1228)

* Host cluster-local-domain TLS on local listener with SNI

* Use port 8444 for cluster-local TLS traffic

(cherry picked from commit 83db165)
openshift-merge-bot bot pushed a commit to openshift-knative/net-istio that referenced this pull request Feb 7, 2024
* Host cluster-local-domain-tls on cluster-local gateway with SNI (knative-extensions#1228)

* Host cluster-local-domain TLS on local listener with SNI

* Use port 8444 for cluster-local TLS traffic

(cherry picked from commit 83db165)

* Update patch to reflect upstream changes
@ReToCode ReToCode added this to the v1.14.0 milestone Apr 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. kind/enhancement lgtm Indicates that a PR is ready to be merged. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[cluster-local-domain-tls] host cluster-local certificates with SNI in net-istio
3 participants