Authentication (GDPR friendly)

Jani Tammi edited this page Sep 16, 2020 · 9 revisions

This site aims not to get involved with GDPR and its details, and avoids handling personal data. The site still needs a way to authenticate users for access to certain resources, at the level of whether they are associated with the university or not. In addition, a way to grant teacher privileges is needed (for uploading virtual machine image files), but this requires consent from the teacher to store the UID (and only the UID).

Solution Model

Observe domain-SSO authentication state and replicate that into an expiring site-specific session.

Solution features

  • Outsources identification and authentication to the domain-SSO services.
  • Does not have access to any other user data except the uid.
  • For teachers (in order to identify a teacher), the uid is listed in the teacher database table, along with a created timestamp (when the row was inserted) and a status (either active or inactive).
  • For students, no records of any kind are maintained. No access records, usage statistics, or anything else will contain even a trace of a student uid.
  • The authenticated state is simply used when accessing the site to restrict access to those virtual machine images that contain software licensed to UTU students only.

PII and UID

"Personally identifiable information (PII) is any data that can be used to identify a specific individual." By this definition, the simple act of writing a uid into the teacher table (with no other personal data) is technically enough to require application of GDPR rules.

2020-09-14: This solution currently complies with GDPR with respect to teachers, who can get registry extracts (their teacher table row, all three columns) and request data removal via the common support email.

Future Alternative

Make the uid unidentifiable by one-way hashing it. In this model, the teacher table contains hashed values of uids, and each time the SSO API is consulted, the returned uid is hashed and compared against the stored hash values to determine whether the given user is a teacher.

  • If the hashed value falls into the wrong hands, it is useless for anything.
  • It will become more difficult to remove a particular teacher's role. We need a tool to match a hashed uid, because normal database administration tools will not let us determine which row should be modified.
  • SHA-256 is sufficient for the UID: hashlib.sha256(uid.encode('utf-8')).hexdigest()
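
The hashed lookup, and the kind of tool needed to find a teacher's row, could look roughly like the sketch below. It is only an illustration: the table name teacher comes from this page, but the column names (uid_hash, created, status), the helper names and the rule that only rows with status 'active' grant the teacher role are assumptions.

```python
import hashlib
import sqlite3


def hash_uid(uid: str) -> str:
    """One-way hash of a uid, using the SHA-256 call given above."""
    return hashlib.sha256(uid.encode("utf-8")).hexdigest()


def create_teacher_table(conn: sqlite3.Connection) -> None:
    # Hypothetical schema: same three columns as the current teacher table,
    # except that the uid column stores the hash instead of the plain uid.
    conn.execute(
        "CREATE TABLE teacher ("
        " uid_hash TEXT PRIMARY KEY,"
        " created TEXT DEFAULT CURRENT_TIMESTAMP,"
        " status TEXT DEFAULT 'active')"
    )


def add_teacher(conn: sqlite3.Connection, uid: str) -> None:
    conn.execute("INSERT INTO teacher (uid_hash) VALUES (?)", (hash_uid(uid),))


def is_teacher(conn: sqlite3.Connection, uid: str) -> bool:
    # Hash the uid returned by the SSO API and compare against stored hashes.
    # Assumption: only rows with status 'active' count.
    row = conn.execute(
        "SELECT 1 FROM teacher WHERE uid_hash = ? AND status = 'active'",
        (hash_uid(uid),),
    ).fetchone()
    return row is not None


def deactivate_teacher(conn: sqlite3.Connection, uid: str) -> bool:
    # The "tool" for locating a row: the administrator supplies the plain uid,
    # which is hashed to find the matching row. Returns True if a row matched.
    cur = conn.execute(
        "UPDATE teacher SET status = 'inactive' WHERE uid_hash = ?",
        (hash_uid(uid),),
    )
    return cur.rowcount == 1
```

With this layout, removing a teacher's role only requires knowing the plain uid at the time of the request; nothing in the table itself reveals it.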

Single Sign-On Integration

The solution relies on just three resources:

  1. utu.fi domain-wide SSO authentication cookie
    which contains a session hash value.
  2. SSO REST API endpoints
    that manage the session (login/logout) and return the role of the session.
    • api/sso (GET) - responds with role JSON { 'role': '[anonymous|student|teacher]' }.
    • api/sso/login (GET) - Landing/redirection handler for domain SSO login goto service. When accessed, queries the SSO REST API again and updates the Flask application session, then redirects the client to destination URL.
    • api/sso/logout (GET) - Discards the Flask session and sets the role to anonymous.
  3. SSO Javascript module
    Code that uses the REST API to add login/logout functionality to the web pages.

User Authentication Model

The above can provide simple "is SSO authenticated or not" information, but because the SSO session validation REST API also provides the uid, we can attach additional privileges to identified users. In this solution, only the teacher role is stored. If a uid has been inserted into the teacher database table, the user with that uid is treated as a teacher.

For this site, three "roles" are observed:

Role      | Identified by
anonymous | SSO session REST API returns False
student   | SSO session REST API returns True, but the uid is not listed in the teacher database table.
teacher   | SSO session REST API returns True and the uid is listed in the teacher database table.
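
The role table above reduces to a small decision function. The sketch below is illustrative only; the function name resolve_role and the teacher_uids collection (standing in for a lookup against the teacher database table) are assumptions.

```python
def resolve_role(sso_authenticated: bool, uid, teacher_uids) -> str:
    """Map the SSO validation result and the teacher table to a role.

    sso_authenticated: True/False as returned by the SSO session REST API.
    uid:               uid string from the SSO API, or None.
    teacher_uids:      hypothetical set of uids listed in the teacher table.
    """
    if not sso_authenticated or uid is None:
        return "anonymous"
    if uid in teacher_uids:
        return "teacher"
    return "student"
```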

Middleware Implementation

Within the pages of vm.utu.fi, the authentication state and the role are maintained in session storage. Variable UID either contains a value (authenticated) or does not (not authenticated). Variable ROLE contains one of the three possible values listed in the table above. All possible combinations are listed in the table below:

UID          | ROLE
None         | anonymous
(uid string) | student
(uid string) | teacher

The session has an expiration time, which can be defined in the application instance config (instance/application.config) with the parameter SESSION_LIFETIME (defaults to 60 minutes if undefined). A value of 30 minutes, used by Nettiopsu and a few other services, is perhaps the value this service should also use. Note that each page load resets the session expiration. Thus, the session expires after the set number of minutes of inactivity.
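
The sliding expiration described above can be sketched as follows. This is not the site's actual code: the config is modelled as a plain dict (the real file format is not shown on this page), and the class and function names are hypothetical. Only SESSION_LIFETIME and its 60-minute default come from the text.

```python
from datetime import datetime, timedelta

DEFAULT_SESSION_LIFETIME = 60  # minutes, used when SESSION_LIFETIME is undefined


def session_lifetime(config: dict) -> timedelta:
    """Read SESSION_LIFETIME (in minutes) from a config mapping."""
    minutes = int(config.get("SESSION_LIFETIME", DEFAULT_SESSION_LIFETIME))
    return timedelta(minutes=minutes)


class SlidingSession:
    """Session whose expiry is pushed forward on every page load."""

    def __init__(self, lifetime: timedelta, now: datetime):
        self.lifetime = lifetime
        self.expires_at = now + lifetime

    def is_expired(self, now: datetime) -> bool:
        return now >= self.expires_at

    def touch(self, now: datetime) -> None:
        # Called on each page load: restart the inactivity timer.
        self.expires_at = now + self.lifetime
```

Passing the current time in explicitly keeps the sketch easy to test; in a Flask application the same effect is usually obtained from the framework's own session settings rather than a hand-written class.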

If the session does not exist (or has expired), the middleware queries the SSO session REST API when creating a new session. This way the user is automatically logged in if a valid SSO session exists.

If the user wishes to log out (this doesn't really make any sense, but has been implemented for the sake of symmetry), the middleware sets the session values UID = None and ROLE = anonymous. The user will remain logged out / unauthorised until this session expires due to inactivity, after which the system will recreate a session as usual, again automatically logging the user in if a valid domain SSO session exists. There are a few pages, such as download.html, where some of the data elements should be reloaded to reflect the new unauthenticated state. The easiest way is of course to simply reload the whole page, but if the data elements support reloading, perhaps that should be triggered instead.
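
The session creation and logout behaviour described above can be sketched like this. The session is modelled as a plain dict with the UID and ROLE variables from this page; the function names and the query_sso callable (standing in for the call to the SSO session REST API plus the teacher table lookup) are hypothetical.

```python
ANONYMOUS = {"UID": None, "ROLE": "anonymous"}


def new_session(query_sso) -> dict:
    """Create session values from the current domain SSO state.

    query_sso is a hypothetical callable returning (uid, role),
    with uid None when no valid SSO session exists.
    """
    uid, role = query_sso()
    if uid is None:
        return dict(ANONYMOUS)
    return {"UID": uid, "ROLE": role}


def load_session(session, expired: bool, query_sso) -> dict:
    """Per-request middleware step.

    Only a missing or expired session triggers an SSO query, which is
    why a logged-out user stays anonymous until the session expires.
    """
    if session is None or expired:
        return new_session(query_sso)
    return session


def logout(session: dict) -> dict:
    """Logout keeps the session but resets it to the anonymous state."""
    session["UID"] = None
    session["ROLE"] = "anonymous"
    return session
```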

Interaction Flows

User arrives at vm.utu.fi without a valid SSO session:

[Diagram: SSO Diagrams - Role check]

User clicks "Login":

[Diagram: SSO Diagrams - Login]

User clicks "Logout":

[Diagram: SSO Diagrams - Logout]

UTU SSO Login

Just like other domain services, this site will direct the browser to the SSO login service https://sso.utu.fi/sso/XUI/#login/, which will prompt the user with a login form and create a new SSO session.

Important for services such as this is the goto URL parameter, which is used to redirect the browser back to the service after a successful login. Apparently this URL parameter is appended with &, without the usual ? character that marks the beginning of URL parameters. Its value is simply the URI (with or without its own URL parameters) in URL encoded format. In our case, this would be &goto=https%3A%2F%2Fvm.utu.fi%3A443%2Fsso%2flogin%3Fdestination%3D plus the encoded name of the page from which the login action was initiated.
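
Building the login URL could be sketched as below. The two base URLs are taken from the text above; the function name and the choice to encode the page name first and then the whole return URI are assumptions, since this page does not say whether the page name is encoded once or twice. Note that urllib.parse.quote emits uppercase hex digits (%2F rather than %2f); percent-encoding is case-insensitive, so both forms decode to the same URI.

```python
from urllib.parse import quote

SSO_LOGIN_URL = "https://sso.utu.fi/sso/XUI/#login/"
# Return address on this site, as given in the text above.
LOGIN_LANDING_URL = "https://vm.utu.fi:443/sso/login?destination="


def build_sso_login_url(page: str) -> str:
    """Return the SSO login URL that redirects back to `page` after login."""
    # Assumption: encode the page name for the inner query string first...
    return_uri = LOGIN_LANDING_URL + quote(page, safe="")
    # ...then URL-encode the whole return URI as the goto value.
    # The parameter is appended with "&", without a leading "?".
    return SSO_LOGIN_URL + "&goto=" + quote(return_uri, safe="")
```

For a simple page name such as download.html, which contains no characters that need escaping, the two encoding passes make no difference to the page name itself.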

User Groups

This section documents some of the early planning for this site and no longer reflects the current solution. In the early stages, we attempted to classify and list the different types of users that may access the site:

  1. Teachers and staff
    Privileges to upload images
  2. Students of Turku University
    This group could be subdivided into faculties, but for now, such granularity is not considered necessary.
  3. Open University Students
    These students may not have a record in SSO or LDAP services (pending confirmation from IT-Services), but should still be allowed to access resources that may be under campus licenses.
  4. Education Export Students
    Students of another university, to which courses have been exported. There are plans to have all these students registered in SSO, but when that actually gets done is anyone's guess... For these students, we may have to resort to one-time passwords.
  5. Everyone else
    This service should not be considered closed to the public. In fact, I myself have ambitions to compose and release fully self-contained self-study packages in the form of a preprogrammed virtual machine instance. These should be freely downloadable by anyone.

The implemented solution now groups these as follows:

  • Teachers group(s): 1
    All authenticated users whose uid appears in the teacher table.
  • Students group(s): 2, 3, 4
    All authenticated users.
  • Anonymous group(s): 5
    Non-authenticated user.

If challenges arise, they are expected to be related to identifying the type of student at a finer granularity... But for now, we will proceed with this approach and evaluate needs as they surface, instead of solving problems that do not yet exist.