Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reconcile Tutor and OEP-45 #34446

Open
kdmccormick opened this issue Jan 28, 2022 · 5 comments
Open

Reconcile Tutor and OEP-45 #34446

kdmccormick opened this issue Jan 28, 2022 · 5 comments
Assignees
Labels
discovery Pre-work to determine if an idea is feasible

Comments

@kdmccormick
Copy link
Member

kdmccormick commented Jan 28, 2022

Context

OEP-45 makes declarations about how Open edX services should be structured & packaged as to simplify configuration and deployment. Tutor is aligned with some of these guidelines, such as:

  • packaging services as container images
  • using idomatic, cache-friendly, Ansible-free Dockerfiles
  • separation of (a) settings that are required to be specified by operator vs. (b) settings that have production-ready defaults

but is at odds with others, such as:

  • configuration via YAML (OEP-45) instead of Python settings files (Tutor)
  • placement of build scripts in service repos (OEP-45) instead of plugin repos (Tutor)

As the official deployment strategy and soon-to-be official developer stack, Tutor ought to be a faithful implementation of any framework that the Open edX community sets around configuration and deployment.

Acceptance

Either:

  1. change Tutor to follow OEP-45,
  2. change OEP-45 to encompass Tutor,
  3. do a little bit of both 1 and 2, or
  4. revoke OEP-45.
@regisb
Copy link
Contributor

regisb commented Feb 4, 2022

I think that OEP-45 makes sense for the most part, but I disagree with some of it. More specifically, defining settings in YAML files: I'm just not sure what we gain by moving values from *.py modules to *.yaml files. What we lose is the ability to inherit/import values from one base configuration to another. This makes it more difficult for applications to override existing behaviour. At the end of the day, we will always need python modules to override complex settings, such as LOGGING handlers, MIDDLEWARES and INSTALLED_APPS. If we need to define different yaml files for development, production and testing, then we will end up with a lot of duplication.

I would love to have better build and management scripts in the upstream repositories. For instance, I tried multiple times to get rid of the custom openedx-assets script. But sometimes the lack of convenient management scripts in the upstream repos makes it difficult to do thinks in a user-friendly way.

It would be worth it to make Tutor comply more with OEP-45, but to achieve that we will also have to make quite a few changes in the upstream repositories. Here is what the process would look like:

  1. Identify some complex/weird piece of the tutor codebase. For instance: the jobs.py module, the waffle toggle command, the ad-hoc MinIO storage class for user tasks.
  2. Push changes upstream that make these pieces of code irrelevant.
  3. Make the new feature work on Tutor nightly.

OEP-45 is based on good common sense. By making Tutor comply more with common sense, we will get closer to reconciliation with OEP-45.

@jmbowman
Copy link
Contributor

We discussed OEP-45 again in Arch Hour this morning, as we're considering having Arbi-BOM dive into implementation soon if we still agree it's a good path forward. The topic of YAML vs. Python files came up, the stated advantages of YAML were:

  • Easier to define both required settings and overrides in the same file, as the file can be loaded into the settings twice. The first time would provide any required settings (which could then be used in the Python files to derive other settings related to them), while the second time would allow overriding settings that have defaults (including settings derived from the required ones). It's harder to do something like this with a Python file, which can typically only be imported once.
  • Easier to write a validator for the settings (which we haven't done yet, but may want to in the future)
  • The YAML data can be provided as a relatively arbitrary path or URL, whereas Python files have to reside somewhere in the PYTHONPATH of the local filesystem.

That said, we also said that we should consider Python files instead if we find other reasons that make them work better in practice (which is entirely possible given that it's the Django standard). I will say that in your example of complex settings, edX/2U explicitly moved away from layered Python overrides to YAML files with some duplication because even 1-2 layers of overrides rapidly overwhelmed the ability of even senior developers to reason about how the layering worked and which settings were actually in effect.

And regarding changes in the upstream repositories...I think we're ready to start doing that now. If there are specific changes that would help simplify things immediately, feel free to suggest them for tackling early in the process.

@regisb
Copy link
Contributor

regisb commented Jun 16, 2022

I always say that there are only two things that are difficult about Open edX: one is static assets, and the other is settings.

While I think it's great that 2U is thinking about improving the design of LMS/CMS settings, the choice of YAML for storing settings should be just an implementation detail that is specific to 2U. If people want to store settings in YAML, JSON, TOML, environment variables, Ansible Vault or Consul, then that's great; it's not my concern, and it shouldn't be 2U's either.

The current implementation of the production.py settings forces Tutor to make use of lms.yml/cms.yml files. The alternative would be to bypass the default production setting files and to re-write production.py settings from scratch -- but that seemed too difficult to me when I started work on Tutor. So we went with minimal lms/cms.yml files.

What we should be focusing on is to provide better default settings, both in development and production. Better defaults would make everyone's lives much easier, both at 2U and outside. To understand the difficulties we have with settings, you just need to look at the implementation of the Tutor settings and wonder: why do we have to override so many values?

Let's have a look at a few examples:

  1. Source: in development, we need to override 9 LMS settings when we modify the LMS host.
  2. Source: in production, we need to modify 3 standard settings depending on whether we use SSL/TLS.
  3. Source: the default values of these common LMS settings could probably be modified for everyone.
  4. Source: when we modify the Mongodb host, we need to modify it in many different places.

In some of these examples (though not all) the common thread is that we need to load "base/required" settings and use them to infer other values. You want to load these base settings from YAML, and I'm fine with that -- but don't force this choice on everyone. Instead, provide a generic mechanism to allow anyone to load settings from whichever source they like. Here's what an actual implementation would look like in lms/common.py:

# Provide good defaults
ENABLE_SSL = False
MONGODB_HOST = "mongodb"
...

try:
    # 2U's common_private.py would load values from yaml files
    from .common_private import *
except ImportError:
    logger.warning("Private setting module could not be loaded")

# Infer other settings
if ENABLE_SSL:
    SESSION_COOKIE_SAMESITE = "None"
...

@jmbowman
Copy link
Contributor

How about we just modify the OEP to accept more formats for the config file? It would still be specified by an <APPNAME>_CFG_PATH environment variable but could be a YAML file, a Python file on the PYTHONPATH, or perhaps even JSON, TOML, dotenv, etc. There are settings loaders in https://github.com/rochacbruno/dynaconf/tree/master/dynaconf/loaders that might be useful for this. The built-in settings file would then just look at the file extension (or lack thereof) and pick the appropriate loader for the specified file.

Most of the examples you gave feel like demonstrations of "we aren't handling derived settings well". There's https://github.com/openedx/edx-platform/blob/master/openedx/core/lib/derived.py , but not many people know about it and the resulting settings declarations look a little odd. I think the approach of loading the custom settings file twice (that I mentioned in my last comment) would probably work better. The file would only need to be parsed once, but the parsed values would be used to set/override settings both near the beginning of the core settings file (to specify any required settings like host and ENABLE_SSL which other settings depend on) and again near the end (in case you want to specify a value for a derived setting that doesn't match the default formula).

I'm not sure how you'd handle mutation of complex settings like LOGGING, though (vs. just re-declaring the whole thing). It feels like that would require a specific provision of an optional Python file to load at the end of settings parsing which would be passed the existing settings module to mutate (this is roughly what that derived.py module I mentioned above does). We could add that, but experience would lead me to slap a big "you probably don't really want to do this" disclaimer on that feature.

@regisb
Copy link
Contributor

regisb commented Jun 16, 2022

How about we just modify the OEP to accept more formats for the config file?

My suggestion is that OEP-45 should completely ignore the format of the config file. Django is great for loading settings coming from different sources (proof being that a tool like dynaconf exists) and Open edX should not even try to improve on that.

It would still be specified by an _CFG_PATH environment variable

To require the use of a single config file is an opinionated choice that is not just unnecessary: it's also extra work for people who want to do things differently.

There are settings loaders in dynaconf that might be useful for this.

Dynaconf should not be a mandatory tool to run any Open edX IDA. In many cases it would be overkill.

Most of the examples you gave feel like demonstrations of "we aren't handling derived settings well".

It's not so much that we don't handle derived settings well, but more that we are not providing good defaults.

We should really be focusing on better defaults. All the rest is implementation-specific.

@kdmccormick kdmccormick added the discovery Pre-work to determine if an idea is feasible label Jun 2, 2023
@kdmccormick kdmccormick transferred this issue from openedx-unsupported/wg-developer-experience Mar 28, 2024
@kdmccormick kdmccormick self-assigned this May 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
discovery Pre-work to determine if an idea is feasible
Projects
No open projects
Development

No branches or pull requests

3 participants