-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adds SAML SP #3
Adds SAML SP #3
Conversation
Why are these changes being introduced: * A SAML SP is method we use to interact with MIT Touchstone authentication Relevant ticket(s): * https://mitlibraries.atlassian.net/browse/GDT-121 How does this address that need: * Borrowing heavily from our previous application `bd_auth`, this implements a SAML SP via `python3-saml` * a new make command `make app-bash` was created to streamline local development of this application while avoiding a problematic dependency What is missing: * unit tests. We need to set this app aside for a bit, but want to push forward with the Touchstone registration so we will eliminate that delay once we return to this. We should add unit tests to this work when we move forward with implementing the business logic. Document any side effects to this change: * Development is now expected to be in docker as libxmlsec1 dependency is complicated (at best) to get installed reliably locally * mypy linter was removed as we aren't using types in this application
See https://elements.heroku.com/buildpacks/heroku/heroku-buildpack-apt for more information on how this works
71983cf
to
9aeaca8
Compare
Why are these changes being introduced: * Users directed to this application for authentication will be redirected to Touchstone and then be granted a JWT token and redirected back to the originally requested resource Relevant ticket(s): * https://mitlibraries.atlassian.net/browse/GDT-122 * https://mitlibraries.atlassian.net/browse/GDT-101 How does this address that need: * Accessing cdn auth with no cdn_resource parameter or a cdn_resource parameter that is not in one of our CDNs is rejected as invalid * A user with an invalid, expired, or missing JWT token is redirected to Touchstone * Valid users coming back from Touchstone will have a valid JWT token set as a domain cookie * Users with a valid JWT token and valid cdn_resource are redirected to the resource (which is then intercepted by the cdn auth lambda to validate the token before sending them the requested file) * A preconfigued local IdP can be run in a container for development purposes (only SP key/cert need to be generated manually and set in .env prior to running `make app-bash`) Document any side effects to this change: * Adding tests for the SAML reponse handling are still required, but tests exist for all of the core business logic. This has been opened as https://mitlibraries.atlassian.net/browse/GDT-250. For now, it is expected that we will manually confirm SAML functionality in our staging environment prior to moving changes into production.
9aeaca8
to
a56b7b9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, I think it looks good! Easy to follow and poke around. Also like the blueprints approach.
I left a few comments: a couple of format-y nit picks, some notes on getting it up and running locally for me, and a couple suggestsions of docstrings. But all optional, nothing blocking.
I see in the PR comments that typing is not applied to this repository. If we did want to go back and add typing, that could also be a good time to drop in a couple docstrings.
- Run tests: `make test` | ||
- Run linters: `make lint` | ||
- Run dev server: `make run-dev` | ||
- access localhost:5000 and localhost:5000/ping | ||
- (optional) run prod-mode server locally: `make run-prod` | ||
- access localhost:8000 and localhost:8000/ping | ||
|
||
## Required ENV |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor nit that this markdown renders a little funky (github rendered form). Took me awhile to build all the correct env vars as the newlines appear to be dropped. But seeing it here in the code view, I see the intent and how it would have been easier to discern.
That said, had all the info I needed to get it up and running.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With new render, easy to parse!
- pipenv in your python 3.12 install: `pip install pipenv` | ||
- note: failing to do this may not result in a noticeable problem, but this is still good practice as it will ensure you are using a `pipenv` within the correct version of python instead of one from your default version. Do this every time you install a new python version. | ||
- note: a python virtual environment will be created via `pipenv` when following the next steps if one does not yet exist. If you run into issues at any point, you can remove that virtualenv with `pipenv --rm` from the project root and start over. | ||
This application expects to be developed in docker. [There are dependency issues with libxmlxec1 on macos](https://github.com/SAML-Toolkits/python3-saml/issues/356) at this time and rather than each of us fighting with that, we will develop inside the docker container so it can be solved programatically and repeatably. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before I read this and the make app-bash
approach, I struggled to get things installed as well. Ultimately I did get it installed, where this github issue workaround worked well for me.
Related, I also needed to pip install setuptools
, as it sounds like disutils
does not ship with python 3.12+.
With those two things done, make install
worked as expected, and I'm able to run make run-dev
and fire up the flask app locally (outside of a docker container).
All that said, I think the docker approach also works well! Nice to have options for developer's choice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Glad (sort of) you were able to get it running locally in addition to trying out the docker approach. The upstream requirement issues are starting to resolve themselves and this should be much more easily installable locally soon(TM).
README.md
Outdated
`SP_CERT` = obtained from self signed cert generated for this app. Note: remove all spaces/linebreaks as well as the "BEGIN" and "END" lines from file for ENV setting. | ||
`SP_ENTITY_ID` = domain name of app + /saml | ||
`SP_KEY` = obtained from self signed key generated for this app |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just noting that SP_CERT
and SP_KEY
were multiline after I ran through the -- very helpful -- instructions in page linked above. I realized I wasn't immediately sure how to set a multiline env var in a .env
file. Probably better ways to do it, but I popped them into python as a multiline string, and then could access the string variable which included the \n
characters, which allowed me to set them like:
SP_CERT="-----BEGIN CERTIFICATE-----\nabc123...\ndef456...\n-----END CERTIFICATE-----"
SP_KEY="-----BEGIN PRIVATE KEY-----\nabc123...\ndef456...\n-----END PRIVATE KEY-----"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah this should likely be clarified better in the docs... I just manually remove the line breaks to create a single line string. I feel like a make stringify_cert_or_key
command that can spit out what works from an input key or cert would be useful. I've just been too lazy to write one.
@bp.route("/") | ||
@login_required | ||
def item(): | ||
return render_template("debug.html") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dig it! This is really handy to have.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is an interesting side effect the way this works. The cookie setting only happens in the cdn
module instead of the auth
module so while we can debug the session like this, we can't actually debug the cookies because they don't get set. I think eventually we should see if we can move the cookie setting logic into auth
. That added enough complexity to not include that refactor in this initial version because you set cookies as part of the response which is not the domain if auth
so I left it alone for now.
return json_settings | ||
|
||
|
||
def prepare_flask_request(request): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Feeling like maybe a docstring here would be helpful. I had to poke around a bit to understand that the variable req
which is set from the output of this function (around line 76), is a dictionary in the expected form that the python3-saml
class OneLogin_Saml2_Auth
expects for init
.
Otherwise, I wondered a bit why we'd collect things from the request
object that are readily available.
cdnauth/auth.py
Outdated
print("Errors: %s", errors) | ||
print("Last error reason: %s", auth.get_last_error_reason()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unsure if it's something we want to tackle now, but we might want to consider moving these to logging statements. Perhaps the overhead here is setting up logging globally for the full Flask app.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh god I have no idea why these are print statements. I suspect I was debugging and forgot to clean up after I got it all working. Oops :)
cdnauth/cdn.py
Outdated
# valid_redirect(url) ensures the redirect is to a domain we trust to prevent | ||
# our app being used to trick users into clicking on malicious links | ||
def valid_redirect(url): | ||
domain = urlparse(url).netloc | ||
if domain in valid_domains(): | ||
return True | ||
return False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Small nit: these could be """...."""
docstrings for the function. Same below for valid_domains()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's good input. I just read the PEP on that and agree it makes sense to move this into the method as a docstring . It still feels backwards to document inside the method instead of before the method as Python is not how I think :)
cdnauth/cdn.py
Outdated
# valid_domains() is a list of domains we trust for redirect purposes | ||
def valid_domains(): | ||
return [ | ||
"cdn.libraries.mit.edu", | ||
"cdn.dev1.mitlibrary.net", | ||
"cdn.stage.mitlibrary.net", | ||
] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might propose setting this as a constant at the top of the file, something along the lines of:
VALID_DOMAINS = [...]
Or, even as a property on the Config
class (probably only if accessed from multiple files/modules).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like moving it to Config
so we could set it but allow overriding via ENV. It's actually convenient in development to allow redirects to example.com
(or similar) to allow focusing on business logic and not actually invoking a CDN call.
app.config.from_object("cdnauth.config.TestingConfig") | ||
else: | ||
app.config.from_object("cdnauth.config.Config") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is interesting! Never seen app.config.from_object(...)
. But I dig it. Seems like it streamlines updating an object with worrying about getter/setter considerations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I must admit I copied that from our old bd_auth
app (which I also wrote but don't remember doing) and I suspect strongly that it copied it from ebooks
https://github.com/MITLibraries/ebooks/blob/main/ebooks/__init__.py#L18-L23. It's convenient that the various configs inherit from the base one so you can read everything from env by default but in testing just hard code the values you want to be consistent in a fairly clean way. You only have to set the differences from the base is another way to say that I guess.
<!DOCTYPE html> | ||
<html> | ||
<head> | ||
<meta http-equiv="refresh" content="5; url={{ cdn_resource }}" /> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably a silly question, but does this <meta>
tag force a refresh of the page, with that CDN link as the URL, thereby prompting a download? If so, cool!
Also, for people like me unfamiliar with this behavior, might be nice to have a small HTML comment above this that notes this line should force a page refresh.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's good input. I think everyone in EngX would immediately understand what this does because we live on the HTML side so much more than DataEng. I'll see if I can find a way to better document that (possibly with a comment to link to external docs that explain it clearly). For now it has a 5 second delay before the refresh... once we launch we'll likely set this to an immediate redirect
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding this. And, while unsure if necessary in code, thanks for the context about this template page serving as a way to navigate away from the Touchstone page; makes sense.
Why are these changes being introduced: * Code Review feedback Relevant ticket(s): * https://mitlibraries.atlassian.net/browse/GDT-122 How does this address that need: * Adds docstrings to complex methods * Adds useful logging, including some errors * Fixes formating of README * Moves list of `VALID_DOMAINS` to an ENV overridable config option Document any side effects to this change: * There are two known states we don't handle well still. They are noted by `TODO`s in the code (not a great practice) and will receive a Jira ticket to consider how best to handle these possible but unlikely states.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approved! Thanks for the comments and docstrings, I think helpful for a quick footing jumping in.
My only question -- and non blocking -- was around the /saml
route and whether or not pass
+ default return of None
are sufficient. But given testing, most likely is!
- Run tests: `make test` | ||
- Run linters: `make lint` | ||
- Run dev server: `make run-dev` | ||
- access localhost:5000 and localhost:5000/ping | ||
- (optional) run prod-mode server locally: `make run-prod` | ||
- access localhost:8000 and localhost:8000/ping | ||
|
||
## Required ENV |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With new render, easy to parse!
- `SP_CERT` = obtained from self signed cert generated for this app. | ||
- Note: remove all spaces/linebreaks as well as the "BEGIN" and "END" lines from file for ENV setting. | ||
- `SP_ENTITY_ID` = domain name of app + /saml | ||
- `SP_KEY` = obtained from self signed key generated for this app | ||
- Note: remove all spaces/linebreaks as well as the "BEGIN" and "END" lines from file for ENV setting. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding these notes.
|
||
|
||
def prepare_flask_request(request): | ||
"""Return a dictionary in a format that OneLogin_Saml2_Auth uses during init""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 to docstring
"""This route handles two important situations. | ||
|
||
The first, "sso" path, redirects a user to the IdP and sets a value of where to redirect | ||
the user to after they come back to this application. "next_page" is the resource in our | ||
CDN they originally requested but were redirected to this auth app because they were not | ||
authenticated. Keeping track of what they wanted is essential to the functionality of | ||
this solution. | ||
|
||
The second, "acs" path is handles the response back from the IDP. We should never encounter | ||
the path with "acs" in which a user is not authenticated, as they would not be redirected | ||
back from the IDP to us and not be authenticated. | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Super helpful, thanks for adding.
# That said, this should never happen. | ||
# TODO: return a useful error to the user | ||
logging.error("User returned from IdP with no authentication") | ||
logging.error(f"Errors: {errors}") | ||
pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missed this on the first review (and same for IdP response parsing error below): are we okay with a pass
for these conditionals, where this route will then effectively return None
by default?
I'm assuming a human wouldn't interact with this route, just the authentication handshaking... I think if a Flask route returns None
, the actual HTTP response is a 204 - no content
response. Assuming that's okay for whatever is interacting with this route?
If not, we could put a fall through return
at the end of the route function with a more explicit response; though unsure if that's helpful or if we know what that response should be (e.g. 500
error + explanation for why).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I (sort of) noted that concern myself in the latest commit message. I'll open a ticket before merging this work and link it back to this discussion thread to address this. Your suggestion to maybe fallback to some sort of error page if the more specific expected responses aren't found makes a ton of sense. I stared at it for a bit and wasn't sure how I wanted to handle it other than wanting to handle it better, so I appreciate that option being shared.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
VALID_DOMAINS = os.getenv( | ||
"VALID_DOMAINS", | ||
default="cdn.libraries.mit.edu,cdn.dev1.mitlibrary.net,cdn.stage.mitlibrary.net", | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 - seeing the associated README
docs for this env var as well.
<!DOCTYPE html> | ||
<html> | ||
<head> | ||
<meta http-equiv="refresh" content="5; url={{ cdn_resource }}" /> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding this. And, while unsure if necessary in code, thanks for the context about this template page serving as a way to navigate away from the Touchstone page; makes sense.
Why are these changes being introduced:
Relevant ticket(s):
How does this address that need:
bd_auth
, this implements a SAML SP viapython3-saml
make app-bash
was created to streamline local development of this application while avoiding a problematic dependency.env
file (see README for the env values to set these to)make app-bash
make run-dev
http://localhost:5000/?cdn_resource=https://cdn.libraries.mit.edu/fakefile
. You should be prompted to authenticate (credentials in README) and then redirect to a "file not found" page for the CDN. You cannot access actually restricted files in the CDN when running locally like this as we pass JWT tokens as domain cookies which cannot be read. You could set a valid domain cookie by modifying /etc/hosts and running your local dev environment on something liketotallyrealserver.libraries.mit.edu
. You'd need to pull the appropriate JWT secret from SSM.What is missing:
Document any side effects to this change: