400: Bad request; unsupported request method. Can't connect the Pipelines SDK to Kubeflow Pipelines with Kind for Kubeflow 1.9 #2844

Closed
4 of 7 tasks
ESKYoung opened this issue Aug 17, 2024 · 8 comments
@ESKYoung

Validation Checklist

Version

1.9

Describe your issue

After deploying Kubeflow 1.9 onto the Kind cluster as detailed in the README, I'm having trouble submitting an example pipeline from the documentation via the Python SDK using the "Full Kubeflow (from outside cluster)" guide. I'm running this locally on an Intel iMac on macOS 13 with Docker Desktop 4.33.0.

I can manually upload the pipeline to the Central Dashboard, and it runs successfully. However, when I run through the "Full Kubeflow (from outside cluster)" guide, listing pipelines/experiments returns:

{'next_page_token': None, 'pipelines': None, 'total_size': None}

even though there's an existing pipeline/experiment that was manually generated. The "Full Kubeflow (from outside cluster)" docs don't include a namespace, but adding this (kubeflow-user-example-com) has the same effect.

As soon as I switch to kfp.Client.create_run_from_pipeline_package I get the following 400 message:

kfp_server_api.exceptions.ApiException: (400)
Reason: Bad Request
HTTP response headers: HTTPHeaderDict({'date': 'Sat, 17 Aug 2024 06:27:21 GMT', 'content-length': '767', 'content-type': 'text/html; charset=utf-8', 'x-envoy-upstream-service-time': '3', 'server': 'istio-envoy'})
HTTP response body: <!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8">
    <meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">
    <title>dex</title>
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <link href="../static/main.css" rel="stylesheet">
    <link href="../theme/styles.css" rel="stylesheet">
    <link rel="icon" href="../theme/favicon.png">
  </head>

  <body class="theme-body">
    <div class="theme-navbar">
      <div class="theme-navbar__logo-wrap">
        <img class="theme-navbar__logo" src="../theme/logo.png">
      </div>
    </div>

    <div class="dex-container">


<div class="theme-panel">
  <h2 class="theme-heading">Bad Request</h2>
  <p>Unsupported request method.</p>
</div>

    </div>
  </body>
</html>
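The response above carries `content-type: text/html` and a `<title>dex</title>` page, which suggests the request was answered by the Dex login service rather than by the Kubeflow Pipelines API. A minimal sketch of a check for that situation follows. The helper name `looks_like_dex_page` is hypothetical and not part of the KFP SDK, and the heuristics are assumptions drawn only from the headers and body shown here.

```python
import re


def looks_like_dex_page(headers: dict, body: str) -> bool:
    """Heuristic: True if an HTTP response looks like a Dex HTML page.

    Hypothetical helper (not part of kfp). It only inspects the content
    type and the HTML <title>, as seen in the response above.
    """
    # Header names are case-insensitive, so normalise them first.
    lowered = {str(k).lower(): str(v) for k, v in headers.items()}
    content_type = lowered.get("content-type", "")
    if "text/html" not in content_type.lower():
        return False
    # Look for <title>dex</title>, ignoring case and surrounding whitespace.
    return re.search(r"<title>\s*dex\s*</title>", body, re.IGNORECASE) is not None
```

If the check returns True for the headers and body of a failed call, the session cookie was probably not accepted and the request never reached the Pipelines API server.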

All pods are running, but occasionally I see the kubeflow-m2m-oidc-configurator appear, go into CrashLoopBackOff, and then disappear; its logs are:

Wait until resource RequestAuthentication m2m-token-issuer in namespace istio-system is ready.
Resource RequestAuthentication m2m-token-issuer in namespace istio-system is ready.
Patch RequestAuthentication with JWKS if required.
Getting RequestAuthentication object.
Issuer Url in RequestAuthentication: https://kubernetes.default.svc.cluster.local
Current Jwks (escaped):

Jwks Uri from Well Known OpenID Configuration:
curl: option --url: blank argument where content is expected
curl: try 'curl --help' or 'curl --manual' for more information
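The last lines of this log suggest that the JWKS URI lookup came back empty and that the blank value was then passed to `curl --url`. The configurator's actual script is not shown in this issue, so the following is only a sketch, in Python, of the kind of guard such a step might need. `extract_jwks_uri` is a hypothetical name, and the only assumption about the discovery document is the standard `jwks_uri` field from OpenID Connect Discovery.

```python
import json


def extract_jwks_uri(discovery_document: str) -> str:
    """Return the `jwks_uri` from an OpenID discovery document.

    Raises ValueError instead of returning a blank value, so a caller
    never issues a request against an empty URL.
    """
    try:
        config = json.loads(discovery_document)
    except json.JSONDecodeError as exc:
        raise ValueError("discovery document is not valid JSON") from exc

    if not isinstance(config, dict):
        raise ValueError("discovery document is not a JSON object")

    jwks_uri = config.get("jwks_uri")
    if not isinstance(jwks_uri, str) or not jwks_uri.strip():
        raise ValueError("discovery document has no usable jwks_uri")
    return jwks_uri.strip()
```

Failing early with a clear message would make the cause of the crash loop easier to spot than the `curl` usage error.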

Steps to reproduce the issue

  1. Install Kind

  2. Install Kubeflow 1.9 with a single command

  3. Wait for all pods to be in a running state

  4. Connect to the cluster using the port forwarding command listed in the README

  5. Login using the default username (user@example.com), and password (12341234), and verify the Central Dashboard looks good

  6. Run a slightly modified (includes namespace) version of the "Full Kubeflow (from outside cluster)" guide using the example pipeline:

    from kfp import dsl, compiler
    from kfp.client import Client
    
    import re
    import requests
    from urllib.parse import urlsplit
    
    
    def get_istio_auth_session(url: str, username: str, password: str) -> dict:
        """
        Determine if the specified URL is secured by Dex and try to obtain a session cookie.
        WARNING: only Dex `staticPasswords` and `LDAP` authentication are currently supported
                 (we default to using `staticPasswords` if both are enabled)
    
        :param url: Kubeflow server URL, including protocol
        :param username: Dex `staticPasswords` or `LDAP` username
        :param password: Dex `staticPasswords` or `LDAP` password
        :return: auth session information
        """
        # define the default return object
        auth_session = {
            "endpoint_url": url,  # KF endpoint URL
            "redirect_url": None,  # KF redirect URL, if applicable
            "dex_login_url": None,  # Dex login URL (for POST of credentials)
            "is_secured": None,  # True if KF endpoint is secured
            "session_cookie": None
            # Resulting session cookies in the form "key1=value1; key2=value2"
        }
    
        # use a persistent session (for cookies)
        with requests.Session() as s:
    
            ################
            # Determine if Endpoint is Secured
            ################
            resp = s.get(url, allow_redirects=True)
            if resp.status_code != 200:
                raise RuntimeError(
                    f"HTTP status code '{resp.status_code}' for GET against: {url}"
                )
    
            auth_session["redirect_url"] = resp.url
    
            # if we were NOT redirected, then the endpoint is UNSECURED
            if len(resp.history) == 0:
                auth_session["is_secured"] = False
                return auth_session
            else:
                auth_session["is_secured"] = True
    
            ################
            # Get Dex Login URL
            ################
            redirect_url_obj = urlsplit(auth_session["redirect_url"])
    
            # if we are at `/auth?=xxxx` path, we need to select an auth type
            if re.search(r"/auth$", redirect_url_obj.path):
                #######
                # TIP: choose the default auth type by including ONE of the following
                #######
    
                # OPTION 1: set "staticPasswords" as default auth type
                redirect_url_obj = redirect_url_obj._replace(
                    path=re.sub(r"/auth$", "/auth/local", redirect_url_obj.path)
                )
                # OPTION 2: set "ldap" as default auth type
                # redirect_url_obj = redirect_url_obj._replace(
                #     path=re.sub(r"/auth$", "/auth/ldap", redirect_url_obj.path)
                # )
    
            # if we are at `/auth/xxxx/login` path, then no further action is needed (we can use it for login POST)
            if re.search(r"/auth/.*/login$", redirect_url_obj.path):
                auth_session["dex_login_url"] = redirect_url_obj.geturl()
    
            # else, we need to be redirected to the actual login page
            else:
                # this GET should redirect us to the `/auth/xxxx/login` path
                resp = s.get(redirect_url_obj.geturl(), allow_redirects=True)
                if resp.status_code != 200:
                    raise RuntimeError(
                        f"HTTP status code '{resp.status_code}' for GET against: {redirect_url_obj.geturl()}"
                    )
    
                # set the login url
                auth_session["dex_login_url"] = resp.url
    
            ################
            # Attempt Dex Login
            ################
            resp = s.post(
                auth_session["dex_login_url"],
                data={"login": username, "password": password},
                allow_redirects=True
            )
            if len(resp.history) == 0:
                raise RuntimeError(
                    f"Login credentials were probably invalid - "
                    f"No redirect after POST to: {auth_session['dex_login_url']}"
                )
    
            # store the session cookies in a "key1=value1; key2=value2" string
            auth_session["session_cookie"] = "; ".join(
                [f"{c.name}={c.value}" for c in s.cookies])
    
        return auth_session
    
    
    @dsl.component
    def say_hello(name: str) -> str:
        hello_text = f'Hello, {name}!'
        print(hello_text)
        return hello_text
    
    
    @dsl.pipeline
    def hello_pipeline(recipient: str) -> str:
        hello_task = say_hello(name=recipient)
        return hello_task.output
    
    
    compiler.Compiler().compile(hello_pipeline, 'pipeline.yaml')
    
    KUBEFLOW_ENDPOINT = "http://localhost:8080"
    KUBEFLOW_USERNAME = "user@example.com"
    KUBEFLOW_PASSWORD = "12341234"
    
    auth_session = get_istio_auth_session(
        url=KUBEFLOW_ENDPOINT,
        username=KUBEFLOW_USERNAME,
        password=KUBEFLOW_PASSWORD
    )
    
    client = Client(host=f"{KUBEFLOW_ENDPOINT}/pipeline", cookies=auth_session["session_cookie"], namespace="kubeflow-user-example-com")
    print(client.list_experiments(namespace="kubeflow-user-example-com"))
    run = client.create_run_from_pipeline_package(
        'pipeline.yaml',
        arguments={
            'recipient': 'World',
        },
    )

    This should fail as described.

  7. Manually upload pipeline.yaml generated from Step 6, and run it in the Central Dashboard — this should work

  8. Re-run Step 6; this should still fail, and client.list_experiments should still return with Nones despite Step 7 creating an experiment.

Put here any screenshots or videos (optional)

No response

@juliusvonkohout
Member

juliusvonkohout commented Aug 18, 2024

Hello, the website documentation is outdated. CC @diegolovison

Please check the following

The instructions are in

  1. https://github.com/kubeflow/manifests/blob/master/.github/workflows/pipeline_run_from_notebook.yaml for JupyterLab in your namespace (inside the cluster) to the KFP API server.

  2. The instructions for internal AND external access, with the same power as the web interface, via oauth2-proxy and tokens are here for KFP onsole.cloud.google.com/, here for workbenches/notebooks https://github.com/kubeflow/manifests/blob/master/.github/workflows/notebook_controller_m2m_test.yaml and here for KServe https://github.com/kubeflow/manifests/blob/master/.github/workflows/kserve_m2m_test.yaml. It might be enough to just explain the general approach for KFP.

Afterwards, all insecure/dangerous username+password based programmatic access documentation should be removed. These are only hacky workarounds anyway, which try to emulate a web browser instead of providing a proper solution @diegolovison.

@thesuperzapper
Member

@juliusvonkohout I just want to say, there is nothing inherently wrong with using a username and password as a token.

There are many situations in which using a full JWT (either from the Kubernetes API server, or any other issuer trusted by oauth2-proxy), may not be feasible, or worth the effort.
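As a rough illustration of the token-based alternative mentioned here, the sketch below reads a JWT from a file and builds a bearer `Authorization` header. The helper name `bearer_headers_from_file` and the idea of keeping the token in a file are illustrative assumptions. Whether a given token is trusted depends on how oauth2-proxy and Istio are configured in the cluster.

```python
from pathlib import Path


def bearer_headers_from_file(token_path: str) -> dict:
    """Build an Authorization header from a token stored in a file.

    Hypothetical helper: the file is assumed to contain a single JWT,
    for example a Kubernetes service account token.
    """
    # Tokens written by tools often end in a newline, so strip whitespace.
    token = Path(token_path).read_text(encoding="utf-8").strip()
    if not token:
        raise ValueError(f"token file is empty: {token_path}")
    return {"Authorization": f"Bearer {token}"}
```

The KFP SDK `Client` also accepts an `existing_token` argument, which may be a simpler route than building headers by hand when the SDK is the only caller.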

@juliusvonkohout
Member

juliusvonkohout commented Aug 19, 2024

@thesuperzapper what do you propose then?

It is still a hacky script that tries to use user sessions for machine-to-machine communication. We now have oauth2-proxy as a proper solution.
I also want to rely on oauth2-proxy because some users replace Dex, or have two-factor authentication for user sessions.

@thesuperzapper
Member

thesuperzapper commented Aug 19, 2024

@juliusvonkohout The most logical thing is to show how to use both options on the website.

Either way, in the immediate future we need to provide a fix for users who are using the Dex flow, as the current script no longer works because we introduced an OIDC "approve grant" page into the login flow. Luckily, I have a new script ready that will work for 1.9.0, and we are proposing to add it to the website in:

PS: We should remove the strange "grant prompt" in 1.9.1, because it makes no sense when we control both oauth2-proxy and Dex. We can do this by setting prompt = "none" in the oauth2-proxy configs.

PPS: If users just want to try out the updated code, see the preview docs site here.
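The extra "approve grant" step described in this comment could be handled roughly as follows. This is not the script from the preview docs. It is a sketch that assumes the approval page is served at a path ending in `/approval` and accepts a form field `approval=approve`, both of which should be checked against the Dex version in use. The helper name `is_dex_approval_url` is hypothetical.

```python
import re
from urllib.parse import urlsplit


def is_dex_approval_url(url: str) -> bool:
    """True if the URL path ends in `/approval` (the assumed grant page)."""
    return re.search(r"/approval$", urlsplit(url).path) is not None


# Assumed usage inside a login flow built on `requests`, right after the
# credentials POST (`resp` is that response, `s` is the requests.Session):
#
#     if is_dex_approval_url(resp.url):
#         resp = s.post(
#             resp.url,
#             data={"approval": "approve"},  # assumed form field
#             allow_redirects=True,
#         )
```

Checking only the path keeps the test independent of the query string, which carries a per-request identifier.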

@juliusvonkohout
Member

Sounds like a plan. CC @kromanow94

@ESKYoung
Author

Thanks @juliusvonkohout, and @thesuperzapper for the useful discussions, and pointers!

> PPS: If users just want to try out the updated code, see the preview docs site here.

I've tried this out, and it works like a charm — many thanks!

@juliusvonkohout
Member

Alright, then I will close it after kubeflow/website#3795, since I somehow cannot link them.

juliusvonkohout self-assigned this Aug 21, 2024

@juliusvonkohout
Member

The PR has been merged :-)
