Make current version of base compatible with auth, and move auth into base #286

Merged
merged 56 commits into from
Jul 10, 2024
9c8b3b0
Add rollback db on exc
mnazzaro Jun 7, 2024
43b8bfb
Add reconfig db for temp db option in auth
mnazzaro Jun 10, 2024
8df5c45
Add default
mnazzaro Jun 10, 2024
d8f4ed2
Configure db option in models
mnazzaro Jun 11, 2024
a506c79
Return engines
mnazzaro Jun 12, 2024
b566cf7
max overflow for sqlite
mnazzaro Jun 12, 2024
2c7aebf
Remove all the other args
mnazzaro Jun 12, 2024
a4c5d3f
Add tapir policy classes
mnazzaro Jun 12, 2024
6dbcec4
Extra sauce is tapir policy classes
mnazzaro Jun 12, 2024
c37a266
Add server defaults to tapir_users models
mnazzaro Jun 12, 2024
7a3ee21
Small db tweaks, maybe more to come
mnazzaro Jun 12, 2024
8da11a0
Defaults for arXiv_demographics
mnazzaro Jun 12, 2024
fae99dc
Missed veto status col
mnazzaro Jun 12, 2024
4592fe2
Missed veto status col
mnazzaro Jun 12, 2024
3f63f6b
Now defaults for tapir_nicknames
mnazzaro Jun 12, 2024
537bfe4
Add groups property to arXiv_demographics
mnazzaro Jun 12, 2024
a747fc4
defaults for t_arXiv_paper_owners
mnazzaro Jun 17, 2024
734e461
defaults for arXiv_categories
mnazzaro Jun 17, 2024
5967b97
Default for doc dated
mnazzaro Jun 17, 2024
349de8b
Raise again on transaction exception
mnazzaro Jun 17, 2024
8574917
add session relationship to db in tapirsessionsaudit
mnazzaro Jun 17, 2024
c578449
Defaults for TapirSession
mnazzaro Jun 18, 2024
6bfd2f3
don't autoincrement tapir session audit pk/fk
mnazzaro Jun 18, 2024
0cf05de
don't autoincrement tapir session audit pk/fk
mnazzaro Jun 18, 2024
adb431b
Extra inheritance screwing stuff up again
mnazzaro Jun 18, 2024
5b76916
No redis, tests pass
mnazzaro Jun 20, 2024
2e571e2
Remove extra dir level
mnazzaro Jun 20, 2024
1e09f1d
Remove endorsements for Authorizations
mnazzaro Jun 20, 2024
9fe2bda
Remove extra poetry stuff and update README
mnazzaro Jun 20, 2024
91eedce
Remove extra imports and other extra stuff
mnazzaro Jun 20, 2024
dd7b426
Add retry dependency
mnazzaro Jun 20, 2024
0983313
Add pyjwt dep
mnazzaro Jun 20, 2024
a7aa47d
Forgot to lock
mnazzaro Jun 20, 2024
895ac46
What's the diff
mnazzaro Jun 20, 2024
7c9ed91
Tests passing locally
mnazzaro Jun 24, 2024
1aed597
Switch to temporary_db
mnazzaro Jun 24, 2024
c3579f3
Add back SessionStore for the sake of a timely refactor of admin-weba…
mnazzaro Jun 24, 2024
3c1ba52
Merge branch 'develop' into ARXIVCE-1890-auth
mnazzaro Jun 24, 2024
491b8e2
Fix poetry mishap
mnazzaro Jun 24, 2024
658d7ea
Remove auto engine set up because it breaks tests all the time
mnazzaro Jun 24, 2024
7c22141
Tests pass
mnazzaro Jun 24, 2024
f5cc87a
Add some defaults to arXiv_submissions
mnazzaro Jun 25, 2024
c7bf39d
Fix definitions for submission classifier tables
mnazzaro Jun 25, 2024
c31919f
Add default for admin log table
mnazzaro Jun 25, 2024
70902c9
Remove other weird inheritance stuff
mnazzaro Jun 25, 2024
9db91c6
Typo in join clause
mnazzaro Jun 25, 2024
176bb14
Move extra relationship structure from arxiv_db to our model definiti…
mnazzaro Jun 26, 2024
6a77622
Add relationship between oreq and oreq audit
mnazzaro Jun 26, 2024
8bdfeb6
Add documents relationship to oreqs from arXiv_db
mnazzaro Jun 26, 2024
b8b65ea
Needed to add FK's to join table
mnazzaro Jun 26, 2024
836e2d2
Add audit relationship for endorse req
mnazzaro Jun 26, 2024
698d33c
More relationships from arxiv db
mnazzaro Jun 26, 2024
e115779
get rid of bad reference
mnazzaro Jun 26, 2024
3eae8cb
Solidify some of the relationships, replacing backref with back_popul…
mnazzaro Jun 27, 2024
4055511
Fix some typos. Extra s at the end of a bunch of relationship mappings
mnazzaro Jun 27, 2024
fa0ce7f
Add relationship between endorsement requests and endorsements tables
mnazzaro Jul 3, 2024
57 changes: 57 additions & 0 deletions arxiv/auth/README.md
@@ -0,0 +1,57 @@
# ``arxiv-auth`` Library

This package provides a Flask add-on and other code for working with
authenticated arXiv users in arXiv services.

Housing these components in arxiv-base ensures that users
and sessions are represented and manipulated consistently. The login+logout, user
accounts (TBD), API client registry (TBD), and authenticator (TBD) services all
rely on this package.

# Quick start
To check whether a request comes from an authenticated arXiv user, do the
following:

1. Add arxiv-base to your dependencies.
2. Install :class:`arxiv.auth.auth.Auth` onto your application. This registers a
function that Flask calls before each request and that adds an instance of
:class:`arxiv.auth.domain.Session` at ``flask.request.auth`` if the client is
authenticated.
3. Add the required settings to the Flask ``app.config`` to set up
:class:`arxiv.auth.auth.Auth` and related classes.

Here's an example of how you might do #2 and #3:
```
from flask import Flask, request
from arxiv.base import Base
from arxiv.auth import auth

app = Flask(__name__)
Base(app)

# config settings required to use legacy auth
app.config['CLASSIC_SESSION_HASH'] = '{hash_private_secret}'
app.config['CLASSIC_DB_URI'] = '{your_sqlalchemy_db_uri_to_legacy_db}'
app.config['SESSION_DURATION'] = 36000
app.config['CLASSIC_COOKIE_NAME'] = 'tapir_session'
# Auth attaches the session at request.auth only when this is set;
# otherwise it uses request.session.
app.config['AUTH_UPDATED_SESSION_REF'] = True

auth.Auth(app)  # <- Install the Auth to get auth checks and request.auth

@app.route("/")
def are_you_logged_in():
    if request.auth is not None:
        return "<p>Hello, You are logged in.</p>"
    else:
        return "<p>Hello unknown client.</p>"
```
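
In the `Auth` class added in this PR, the request attribute is named `auth` only when `AUTH_UPDATED_SESSION_REF` is set in the config; otherwise it is named `session`. The following is a minimal sketch of a lookup that works under either setting. `session_attribute` and `get_session` are hypothetical helpers that are not part of arxiv-base, and a `SimpleNamespace` stands in for `flask.request` so the block runs without Flask.

```python
from types import SimpleNamespace
from typing import Any, Mapping, Optional


def session_attribute(config: Mapping[str, Any]) -> str:
    """Name of the request attribute that holds the session (hypothetical helper)."""
    return "auth" if config.get("AUTH_UPDATED_SESSION_REF") else "session"


def get_session(request: Any, config: Mapping[str, Any]) -> Optional[Any]:
    """Return the attached session, or None if nothing is attached."""
    return getattr(request, session_attribute(config), None)


# Stand-ins for flask.request and app.config, for illustration only.
fake_request = SimpleNamespace(auth="a-session-object")
print(get_session(fake_request, {"AUTH_UPDATED_SESSION_REF": True}))
print(get_session(fake_request, {}))
```

With the flag set, the lookup reads `request.auth`; without it, the lookup reads `request.session`, which the stand-in request does not have, so the second call returns `None`.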

# Middleware

During NG, there was middleware for arxiv-auth that could be used with NGINX to
do the authentication there. As of 2023 it is not in use.

See :class:`arxiv.auth.auth.middleware.AuthMiddleware`

If you are not deploying this application in the cloud behind NGINX (and
therefore will not support sessions from the distributed store), you do not
need the auth middleware.
Empty file added arxiv/auth/__init__.py
Empty file.
10 changes: 10 additions & 0 deletions arxiv/auth/app.py
@@ -0,0 +1,10 @@
"""test app."""

import sys
sys.path.append('./arxiv')

from flask import Flask
from arxiv_users import auth, legacy

app = Flask('test')
legacy.create_all()
199 changes: 199 additions & 0 deletions arxiv/auth/auth/__init__.py
@@ -0,0 +1,199 @@
"""Provides tools for working with authenticated user/client sessions."""

from typing import Optional, Union, Any, List
import os

from werkzeug.datastructures.structures import MultiDict

from flask import Flask, request, Response
from retry import retry

from ...db import transaction
from ..legacy import util
from ..legacy.cookies import parse_cookie
from .. import domain, legacy

import logging

logger = logging.getLogger(__name__)


class Auth(object):
    """
    Attaches session and authentication information to the request.

    Set env var or `Flask.config` `ARXIV_AUTH_DEBUG` to True to get
    additional debugging in the logs. Only use this for short term debugging of
    configs. This may be used in production but should not be left on in production.

    Intended for use in a Flask application factory, for example:

    .. code-block:: python

       from flask import Flask, request
       from arxiv.auth.auth import Auth
       from someapp import routes


       def create_web_app() -> Flask:
           app = Flask('someapp')
           app.config.from_pyfile('config.py')
           Auth(app)   # Registers the before_request auth check

           @app.route("/hello")
           def hello():
               if request.auth:
                   return f"Hello {request.auth.user.name}!"
               else:
                   return "Hello world! (not authenticated)"

           return app

    """

    def __init__(self, app: Optional[Flask] = None) -> None:
        """
        Initialize ``app`` with `Auth`.

        Parameters
        ----------
        app : :class:`Flask`

        """
        if app is not None:
            self.init_app(app)
            if self.app.config.get('AUTH_UPDATED_SESSION_REF'):
                self.auth_session_name = "auth"
            else:
                self.auth_session_name = "session"

    @retry(legacy.exceptions.Unavailable, tries=3, delay=0.5, backoff=2)
    def _get_legacy_session(self,
                            cookie_value: str) -> Optional[domain.Session]:
        """
        Attempt to load a legacy auth session.

        Returns
        -------
        :class:`domain.Session` or None

        """
        if cookie_value is None:
            return None
        try:
            with transaction():
                return legacy.sessions.load(cookie_value)
        except legacy.exceptions.UnknownSession as e:
            logger.debug('No legacy session available: %s', e)
        except legacy.exceptions.InvalidCookie as e:
            logger.debug('Invalid legacy cookie: %s', e)
        except legacy.exceptions.SessionExpired as e:
            logger.debug('Legacy session is expired: %s', e)
        return None

    def init_app(self, app: Flask) -> None:
        """
        Attach :meth:`.load_session` to the Flask app.

        Parameters
        ----------
        app : :class:`Flask`

        """
        self.app = app
        app.config['arxiv_auth.Auth'] = self

        if app.config.get('ARXIV_AUTH_DEBUG') or os.getenv('ARXIV_AUTH_DEBUG'):
            self.auth_debug()
            logger.debug("ARXIV_AUTH_DEBUG is set and auth debug messages to logging are turned on")

        self.app.before_request(self.load_session)
        self.app.config.setdefault('DEFAULT_LOGOUT_REDIRECT_URL',
                                   'https://arxiv.org')
        self.app.config.setdefault('DEFAULT_LOGIN_REDIRECT_URL',
                                   'https://arxiv.org')


    def load_session(self) -> Optional[Response]:
        """Look for an active session, and attach it to the request.

        The typical scenario will involve the
        :class:`.middleware.AuthMiddleware` unpacking a session token and
        adding it to the WSGI request environ.

        As a fallback, if the legacy database is available, this method will
        also attempt to load an active legacy session.

        """
        # Check the WSGI request environ for the key, which is where the auth
        # middleware puts any unpacked auth information from the request OR any
        # exceptions that need to be raised within the request context.
        req_auth: Optional[Union[domain.Session, Exception]] = \
            request.environ.get(self.auth_session_name)

        # Middleware may pass an exception; it must be raised here, within the
        # request context, to be handled correctly.
        if isinstance(req_auth, Exception):
            logger.debug('Middleware passed an exception: %s', req_auth)
            raise req_auth

        if not req_auth:
            if util.is_configured():
                req_auth = self.first_valid(self.legacy_cookies())
            else:
                logger.warning('No legacy DB, will not check tapir auth.')

        # Attach auth to the request so others can access it easily, e.g. as
        # request.auth.
        setattr(request, self.auth_session_name, req_auth)
        return None

    def first_valid(self, cookies: List[str]) -> Optional[domain.Session]:
        """First valid legacy session or None if there are none."""
        first = next(filter(bool,
                            map(self._get_legacy_session,
                                cookies)), None)

        if first is None:
            logger.debug("Out of %d cookies, no legacy cookie found", len(cookies))
        else:
            logger.debug("Out of %d cookies, found a good legacy cookie", len(cookies))

        return first

    def legacy_cookies(self) -> List[str]:
        """Gets list of legacy cookies.

        Duplicate cookies occur due to the browser sending the
        cookies for both arxiv.org and sub.arxiv.org. If this is being
        served at sub.arxiv.org, there is no response that will cause
        the browser to alter its cookie store for arxiv.org. Duplicate
        cookies must be handled gracefully for the domain and
        subdomain to coexist.

        The standard way to avoid this problem is to append part of
        the domain's name to the cookie key but this needs to work
        even if the configuration is not ideal.

        """
        # By default, werkzeug uses a dict-based struct that supports only a
        # single value per key. This isn't really up to speed with RFC 6265.
        # Luckily we can just pass in an alternate struct to parse_cookie()
        # that can cope with multiple values.
        raw_cookie = request.environ.get('HTTP_COOKIE', None)
        if raw_cookie is None:
            return []
        cookies = parse_cookie(raw_cookie, cls=MultiDict)
        return cookies.getlist(self.app.config['CLASSIC_COOKIE_NAME'])

    def auth_debug(self) -> None:
        """Sets several auth loggers to DEBUG.

        This is useful to get an idea of what is going on with auth.
        """
        logger.setLevel(logging.DEBUG)
        legacy.sessions.logger.setLevel(logging.DEBUG)
        legacy.authenticate.logger.setLevel(logging.DEBUG)
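
The duplicate-cookie handling described in `legacy_cookies` and the selection done by `first_valid` can be illustrated without werkzeug or a database. The following is a minimal sketch, not the library's implementation. `cookie_values` is a hypothetical stand-in for `parse_cookie(..., cls=MultiDict).getlist(...)`, the `SESSIONS` dict is a hypothetical stand-in for the legacy session lookup, and the parsing is deliberately simplified (no quoting or decoding).

```python
from typing import Callable, List, Optional


def cookie_values(raw_cookie: str, name: str) -> List[str]:
    """Return every value sent for ``name``, in header order."""
    values = []
    for part in raw_cookie.split(";"):
        key, sep, value = part.strip().partition("=")
        if sep and key == name:
            values.append(value)
    return values


def first_valid(values: List[str],
                load: Callable[[str], Optional[str]]) -> Optional[str]:
    """Return the first truthy result of ``load``, or None if there is none."""
    # map() and filter() are lazy, so load() is not called again once a
    # value has resolved to a session.
    return next(filter(bool, map(load, values)), None)


# Hypothetical session table: only "bbb" maps to a live session.
SESSIONS = {"bbb": "session-for-bbb"}

# The browser sends the same cookie name twice, once per domain.
header = "tapir_session=aaa; other=1; tapir_session=bbb"
values = cookie_values(header, "tapir_session")
print(values)
print(first_valid(values, SESSIONS.get))
```

Keeping every value for the cookie name, rather than only the first or last, is what lets a stale cookie from one domain coexist with a valid one from the other.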