Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds SAML SP #3

Merged
merged 6 commits into from
Apr 5, 2024
Merged
Show file tree
Hide file tree
Changes from 5 commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 4 additions & 0 deletions .github/workflows/ci.yml
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,10 @@ jobs:
uses: actions/setup-python@v5
with:
python-version: 3.12
- name: apt update
run: sudo apt update
- name: Install libxmlsec1
run: sudo apt-get install -y libxmlsec1-dev
- name: Install dependencies
run: |
python -m pip install --upgrade pip pipenv
Expand Down
1 change: 1 addition & 0 deletions Aptfile
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
libxmlsec1-dev
17 changes: 17 additions & 0 deletions Dockerfile
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
# syntax=docker/dockerfile:1
FROM python:3.12

RUN pip install --no-cache-dir --upgrade pip pipenv

RUN apt-get update && apt-get upgrade -y && apt-get install -y git libxmlsec1-dev

ENV FLASK_RUN_HOST=0.0.0.0
ENV FLASK_ENV="development"

WORKDIR /app
COPY . .
RUN pipenv install --dev --system --clear --deploy

CMD ["make", "run-dev"]

EXPOSE 5000
30 changes: 25 additions & 5 deletions Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -16,26 +16,30 @@ update: install ## update all Python dependencies
## ---- Unit test commands ---- ##

test: ## run tests and print a coverage report
pipenv run coverage run --source=cdnauth -m pytest -vv
FLASK_ENV=testing pipenv run coverage run --source=cdnauth -m pytest -vv
pipenv run coverage report -m
pipenv run coverage html

coveralls: test
pipenv run coverage lcov -o ./coverage/lcov.info

## ---- Code quality and safety commands ---- ##

# linting commands
lint: black mypy ruff safety ## run all linters
lint: black ruff safety ## run all linters

black:
pipenv run black --check --diff .

mypy:
pipenv run mypy .
black-apply: # apply changes with 'black'
pipenv run black .

ruff:
pipenv run ruff check .

ruff-apply: # resolve 'fixable errors' with 'ruff'
pipenv run ruff check --fix .

safety:
pipenv check
pipenv verify
Expand All @@ -45,6 +49,22 @@ safety:

run-dev: ## run the flask app in dev
FLASK_ENV=development pipenv run flask --app cdnauth run --debug
# example of how to use SSL in dev. Very useful if you override /etc/hosts to have localhost
# act as touchstone registered SP
# FLASK_ENV=development pipenv run flask --app cdnauth run --debug --cert=adhoc


run-prod: ## run the flask app in a prod-like mode
FLASK_ENV=production pipenv run gunicorn cdnauth:app --log-file -
FLASK_ENV=production pipenv run gunicorn --bind 0.0.0.0 cdnauth:app --log-level debug --log-file -


## ---- Useful commands for managing the app when using a container for dev ---- ##

build: ## build local container
docker build -t cdnauth-local .

app-bash: build ## bash shell in app container with linked file system to local directory
docker run -it -v.:/app -p 5000:5000 -p 8000:8000 --env-file .env cdnauth-local bash
# example of how to use SSL in dev. Very useful if you override /etc/hosts to have localhost
# act as touchstone registered SP
# docker run -it -v.:/app -p 443:5000 --env-file .env cdnauth-local bash
4 changes: 3 additions & 1 deletion Pipfile
Original file line number Diff line number Diff line change
Expand Up @@ -6,10 +6,12 @@ name = "pypi"
[packages]
flask = "*"
gunicorn = "*"
python3-saml = "*"
lxml = "==4.9.4"
pyjwt = "*"

[dev-packages]
black = "*"
mypy = "*"
ruff = "*"
safety = "*"
pytest = "*"
Expand Down
746 changes: 575 additions & 171 deletions Pipfile.lock

Large diffs are not rendered by default.

112 changes: 105 additions & 7 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -40,17 +40,115 @@ Not all links are publicly accessible.

## How to run this app locally

- Pre-requisite: python 3.12 (if you use `asdf`, `asdf install python 3.12.1`)
- switch to python 3.12 (`asdf shell python 3.12.1` or equivalent)
- confirm you are using python 3.12 (`python --version`)
- pipenv in your python 3.12 install: `pip install pipenv`
- note: failing to do this may not result in a noticeable problem, but this is still good practice as it will ensure you are using a `pipenv` within the correct version of python instead of one from your default version. Do this every time you install a new python version.
- note: a python virtual environment will be created via `pipenv` when following the next steps if one does not yet exist. If you run into issues at any point, you can remove that virtualenv with `pipenv --rm` from the project root and start over.
This application expects to be developed in docker. [There are dependency issues with libxmlxec1 on macos](https://github.com/SAML-Toolkits/python3-saml/issues/356) at this time and rather than each of us fighting with that, we will develop inside the docker container so it can be solved programatically and repeatably.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before I read this and the make app-bash approach, I struggled to get things installed as well. Ultimately I did get it installed, where this github issue workaround worked well for me.

Related, I also needed to pip install setuptools, as it sounds like disutils does not ship with python 3.12+.

With those two things done, make install worked as expected, and I'm able to run make run-dev and fire up the flask app locally (outside of a docker container).

All that said, I think the docker approach also works well! Nice to have options for developer's choice.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Glad (sort of) you were able to get it running locally in addition to trying out the docker approach. The upstream requirement issues are starting to resolve themselves and this should be much more easily installable locally soon(TM).


Additionally, `lxml` versions newer than 4.9.4 [crash python3-saml](https://github.com/SAML-Toolkits/python3-saml/issues/389)

### General development workflow

- Pre-requisite: docker

- `make app-bash` will build a new container and drop you into a shell. This will be your main interaction point from which you will run other commands. Code is automatically synced between your local environment and the container, so this only needs to be done once per session (or when changing settings in `local.env`).

Within the docker container:

- `make` to see useful commands for this application
- Install local dependencies: `make install`
- Install local dependencies: `make install` (this is done automatically when starting the bash shell, but if you make changes you can run this again without rebuilding and it should be fine)
- Run tests: `make test`
- Run linters: `make lint`
- Run dev server: `make run-dev`
- access localhost:5000 and localhost:5000/ping
- (optional) run prod-mode server locally: `make run-prod`
- access localhost:8000 and localhost:8000/ping

## Required ENV
Copy link

@ghukill ghukill Mar 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor nit that this markdown renders a little funky (github rendered form). Took me awhile to build all the correct env vars as the newlines appear to be dropped. But seeing it here in the code view, I see the intent and how it would have been easier to discern.

That said, had all the info I needed to get it up and running.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With new render, easy to parse!


These values should be set in a `.env` file in the project root for development, and in Heroku config when deployed.

### Application settings

`FLASK_APP` = cdnauth
`FLASK_ENV` = production/development/testing as appropriate

See [Flask docs](https://flask.palletsprojects.com/en/2.3.x/config/#SECRET_KEY) for information on how to properly generate a secret key. Should be unique for each deployment (stage/prod/local).
`SECRET_KEY` = generate a long random string. Used for session security.

`COOKIE_NAME` = This needs to be the same value in this app and the lambda
`COOKIE_DOMAIN` = This needs to match the domain the app and cdn are running in. The app and lambda _must_ run in the same domain. NOTE: in development we set this to `False` due to how cookies work with localhost. Setting the domain to `localhost` is rejected by most browsers. Not setting a value works as expected with localhost.

`JWT_SECRET` = This must be a long random string and be set to the same value for this app and our lambda


### Identity Provider (IdP) Settings

See [our dev docs](https://mitlibraries.github.io/guides/authentication/touchstone_saml.html#configuring-the-application) for how to obtain the IDP settings
`IDP_CERT` = standard IST IDP setting
`IDP_ENTITY_ID` = standard IST IDP setting
`IDP_SSO_URL` = standard IST IDP setting

### Service Provider (SP) settings

Note: See [our dev docs](https://mitlibraries.github.io/guides/authentication/touchstone_saml.html#generating-a-self-signed-certificate-for-touchstone) for information on how to generate SP key/cert. They should be unique for each deployment and backed up to our shared LastPass.

`SP_ACS_URL` = route in this app that handles the response from IDP. domain name of app + /saml/?acs
`SP_CERT` = obtained from self signed cert generated for this app. Note: remove all spaces/linebreaks as well as the "BEGIN" and "END" lines from file for ENV setting.
`SP_ENTITY_ID` = domain name of app + /saml
`SP_KEY` = obtained from self signed key generated for this app
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just noting that SP_CERT and SP_KEY were multiline after I ran through the -- very helpful -- instructions in page linked above. I realized I wasn't immediately sure how to set a multiline env var in a .env file. Probably better ways to do it, but I popped them into python as a multiline string, and then could access the string variable which included the \n characters, which allowed me to set them like:

SP_CERT="-----BEGIN CERTIFICATE-----\nabc123...\ndef456...\n-----END CERTIFICATE-----"
SP_KEY="-----BEGIN PRIVATE KEY-----\nabc123...\ndef456...\n-----END PRIVATE KEY-----"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this should likely be clarified better in the docs... I just manually remove the line breaks to create a single line string. I feel like a make stringify_cert_or_key command that can spit out what works from an input key or cert would be useful. I've just been too lazy to write one.

`SP_SECURITY_ASSERTIONS_ENCRYPTED` (optional) = Boolean. Defaults to `True` in production and `False` in development.
`URN_UID` (optional) = where in the SAML response to get the user info from. Default values are set to work with Touchstone in production and our test IdP in development.

### Running a local Identity Provider (IdP)

Touchstone is our production IdP, but cannot be used for development work.

If you are working on a feature and you want to test the full authentication process, using a local IdP can help. It won't be exactly the same as Touchstone so you'll want to test closely in staging, but it often is helpful enough to be worth using.

#### Using Simple SAML IdP in a Container

The docker composer file `idp-compose.yaml` is configured to allow an SP (this app!) to connect.

To start the IdP

```bash
docker compose -f idp-compose.yaml up
```

The IdP comes pre-configured with test users:

```text
name: user1
password: password

name: user2
password: password
```

If you need to access the IdP admin interface, the credentials are:

```text
name: admin
password: secret
```

Your `.env` file will need to be updated to have the following values for IdP related settings:

```yaml
IDP_CERT=MIICmjCCAYICCQDX5sKPsYV3+jANBgkqhkiG9w0BAQsFADAPMQ0wCwYDVQQDDAR0ZXN0MB4XDTE5MTIyMzA5MDI1MVoXDTIwMDEyMjA5MDI1MVowDzENMAsGA1UEAwwEdGVzdDCCASIwDQYJKoZIhvcNAQEBBQADggEPADCCAQoCggEBAMdtDJ278DQTp84O5Nq5F8s5YOR34GFOGI2Swb/3pU7X7918lVljiKv7WVM65S59nJSyXV+fa15qoXLfsdRnq3yw0hTSTs2YDX+jl98kK3ksk3rROfYh1LIgByj4/4NeNpExgeB6rQk5Ay7YS+ARmMzEjXa0favHxu5BOdB2y6WvRQyjPS2lirT/PKWBZc04QZepsZ56+W7bd557tdedcYdY/nKI1qmSQClG2qgslzgqFOv1KCOw43a3mcK/TiiD8IXyLMJNC6OFW3xTL/BG6SOZ3dQ9rjQOBga+6GIaQsDjC4Xp7Kx+FkSvgaw0sJV8gt1mlZy+27Sza6d+hHD2pWECAwEAATANBgkqhkiG9w0BAQsFAAOCAQEAm2fk1+gd08FQxK7TL04O8EK1f0bzaGGUxWzlh98a3Dm8+OPhVQRi/KLsFHliLC86lsZQKunYdDB+qd0KUk2oqDG6tstG/htmRYD/S/jNmt8gyPAVi11dHUqW3IvQgJLwxZtoAv6PNs188hvT1WK3VWJ4YgFKYi5XQYnR5sv69Vsr91lYAxyrIlMKahjSW1jTD3ByRfAQghsSLk6fV0OyJHyhuF1TxOVBVf8XOdaqfmvD90JGIPGtfMLPUX4m35qaGAU48PwCL7L3cRHYs9wZWc0ifXZcBENLtHYCLi5txR8c5lyHB9d3AQHzKHMFNjLswn5HsckKg83RH7+eVqHqGw==
IDP_ENTITY_ID=http://localhost:8080/simplesaml/saml2/idp/metadata.php
IDP_SSO_URL=http://localhost:8080/simplesaml/saml2/idp/SSOService.php
```

Note: It's unclear if that IdP cert is fully stable, but so far it has survived a few container rebuilds. If it stops working, remove it from this README and you can get the proper value after the IdP is running from the metadata at:

<http://localhost:8080/simplesaml/saml2/idp/metadata.php?output=xhtml>

Remember while these are the IdP settings to change in `.env`, you will still need to configure the rest of this application appropriately including the SP related config.

If using `make app-bash` followed by `make run-dev`, these values are likely what you want to use.

```yaml
SP_ACS_URL=http://localhost:5000/saml/?acs
SP_ENTITY_ID=http://localhost:5000/saml
```

You should generate a cert/key combo to populate `SP_CERT` and `SP_KEY`. See `Service Provider (SP) settings` above for details.
17 changes: 9 additions & 8 deletions cdnauth/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -3,26 +3,27 @@

from flask import Flask

from cdnauth import auth, cdn, debug

app = Flask(__name__, instance_relative_config=True)
app.register_blueprint(auth.bp)
app.register_blueprint(debug.bp)
app.register_blueprint(cdn.bp)


flask_env = os.getenv("FLASK_ENV")

if flask_env == "development":
app.config.from_object("cdnauth.config.DevelopmentConfig")
elif flask_env == "production":
app.config.from_object("cdnauth.config.Config")
else:
elif flask_env == "testing":
app.config.from_object("cdnauth.config.TestingConfig")
else:
app.config.from_object("cdnauth.config.Config")
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is interesting! Never seen app.config.from_object(...). But I dig it. Seems like it streamlines updating an object with worrying about getter/setter considerations.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I must admit I copied that from our old bd_auth app (which I also wrote but don't remember doing) and I suspect strongly that it copied it from ebooks https://github.com/MITLibraries/ebooks/blob/main/ebooks/__init__.py#L18-L23. It's convenient that the various configs inherit from the base one so you can read everything from env by default but in testing just hard code the values you want to be consistent in a fairly clean way. You only have to set the differences from the base is another way to say that I guess.



logging.basicConfig(level=logging.INFO)


@app.route("/")
def root():
return "Hallo!"


@app.route("/ping")
def ping():
return "pong"
119 changes: 119 additions & 0 deletions cdnauth/auth.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,119 @@
import json
from functools import wraps
from urllib.parse import urljoin, urlparse

from flask import (
Blueprint,
current_app,
make_response,
redirect,
request,
session,
url_for,
)

from onelogin.saml2.auth import OneLogin_Saml2_Auth


def is_safe_url(target):
ref_url = urlparse(request.host_url)
test_url = urlparse(urljoin(request.host_url, target))
return test_url.scheme in ("http", "https") and ref_url.netloc == test_url.netloc


def login_required(f):
@wraps(f)
def decorated_function(*args, **kwargs):
if "samlNameId" not in session:
return redirect(url_for("auth.saml", sso=True, next=request.url))

return f(*args, **kwargs)

return decorated_function


def load_saml_settings():
json_settings = {}
with open("saml/settings.json", "r") as json_file:
json_settings = json.load(json_file)
json_settings["debug"] = current_app.config["DEBUG"]
json_settings["sp"]["entityId"] = current_app.config["SP_ENTITY_ID"]
json_settings["sp"]["assertionConsumerService"]["url"] = current_app.config[
"SP_ACS_URL"
]
json_settings["sp"]["x509cert"] = current_app.config["SP_CERT"]
json_settings["sp"]["privateKey"] = current_app.config["SP_KEY"]
json_settings["idp"]["entityId"] = current_app.config["IDP_ENTITY_ID"]
json_settings["idp"]["singleSignOnService"]["url"] = current_app.config[
"IDP_SSO_URL"
]
json_settings["idp"]["x509cert"] = current_app.config["IDP_CERT"]
json_settings["security"]["wantAssertionsEncrypted"] = current_app.config[
"SP_SECURITY_ASSERTIONS_ENCRYPTED"
]

return json_settings


def prepare_flask_request(request):
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feeling like maybe a docstring here would be helpful. I had to poke around a bit to understand that the variable req which is set from the output of this function (around line 76), is a dictionary in the expected form that the python3-saml class OneLogin_Saml2_Auth expects for init.

Otherwise, I wondered a bit why we'd collect things from the request object that are readily available.

url_data = urlparse(request.url)
return {
"https": "on" if request.scheme == "https" else "off",
"http_host": request.host,
"server_port": url_data.port,
"script_name": request.path,
"get_data": request.args.copy(),
"post_data": request.form.copy(),
}


bp = Blueprint("auth", __name__, url_prefix="/saml")


@bp.route("/", methods=("GET", "POST"))
def saml():
saml_settings = load_saml_settings()
req = prepare_flask_request(request)
auth = OneLogin_Saml2_Auth(req, saml_settings)
errors = []
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't think this errors = [] is needed? Believe that errors = auth.get_errors() below will init the list, and it's only used under that logic branch.

next_page = request.args.get("next")
if not next_page or is_safe_url(next_page) is False:
next_page = ""

if "sso" in request.args:
return redirect(auth.login(return_to=next_page))

elif "acs" in request.args:
auth.process_response()
errors = auth.get_errors()
if not auth.is_authenticated():
# TODO: return something helpful to the user.
# That said, this should never happen.
pass
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this because the lambda edge before this would ensure that any requests that make it this far are authenticated? Or am I thinking about this wrong?

I'd propose one of two things given this TODO:

  1. a docstring for the route as a whole, explaining this situation where a non-authenticated request makes it this far
  2. maybe raise an explicit exception here making it clear to those just reading code that this state really shouldn't be encountered

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I non-authenticated request should never get this far because Touchstone does not send non-authenticated users back here... if they fail authentication they stay in Touchstone... and if they are not authenticated we send them to Touchstone. This code block is from the example python3-saml provides so I suspect it was just to show what is going on more than making sense to actually do anything. I'll try to make it a lot clearer in the code what is going on. I suspect maybe just logging an exception if this block ever gets run so we'll know I was completely wrong and we need to write code or not :)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missed this on the first review (and same for IdP response parsing error below): are we okay with a pass for these conditionals, where this route will then effectively return None by default?

I'm assuming a human wouldn't interact with this route, just the authentication handshaking... I think if a Flask route returns None, the actual HTTP response is a 204 - no content response. Assuming that's okay for whatever is interacting with this route?

If not, we could put a fall through return at the end of the route function with a more explicit response; though unsure if that's helpful or if we know what that response should be (e.g. 500 error + explanation for why).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I (sort of) noted that concern myself in the latest commit message. I'll open a ticket before merging this work and link it back to this discussion thread to address this. Your suggestion to maybe fallback to some sort of error page if the more specific expected responses aren't found makes a ton of sense. I stared at it for a bit and wasn't sure how I wanted to handle it other than wanting to handle it better, so I appreciate that option being shared.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if len(errors) == 0:
session["samlUserdata"] = auth.get_attributes()
session["samlNameId"] = session["samlUserdata"][
current_app.config["URN_UID"]
][0]
session["samlSessionIndex"] = auth.get_session_index()
return redirect(request.form["RelayState"])
else:
print("Errors: %s", errors)
print("Last error reason: %s", auth.get_last_error_reason())
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unsure if it's something we want to tackle now, but we might want to consider moving these to logging statements. Perhaps the overhead here is setting up logging globally for the full Flask app.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh god I have no idea why these are print statements. I suspect I was debugging and forgot to clean up after I got it all working. Oops :)



@bp.route("/metadata/")
def metadata():
saml_settings = load_saml_settings()
req = prepare_flask_request(request)
auth = OneLogin_Saml2_Auth(req, saml_settings)
settings = auth.get_settings()
metadata = settings.get_sp_metadata()
errors = settings.validate_metadata(metadata)

if len(errors) == 0:
resp = make_response(metadata, 200)
resp.headers["Content-Type"] = "text/xml"
else:
resp = make_response(", ".join(errors), 500)
return resp
Loading