Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clean up auth manager model #42747

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

vincbeck
Copy link
Contributor

@vincbeck vincbeck commented Oct 4, 2024

Many things do no quite make sense today in auth managers:

  • is_authorized_dag("GET") means "is the user authorized to access all DAGs". After discussion with Ash here, I realized it should mean "is the user authorized to list DAGs". We should not care whether the user has access to all DAGs.

  • get_permitted_dag_ids takes a list of methods as parameter. Example: auth_manager.get_permitted_dag_ids(methods=["GET", "PUT"]) would return the list of DAGs the user can read or edit. We need either one of the other but not both.

This PR creates breaking change in the auth manage interface.

This is a draft PR for now to confirm the direction, if agreed, I'll create a news fragment, update tests and handle the backward compatibility between core Airflow and providers.


^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.

query = select(ParseImportError)
query = apply_sorting(query, order_by, to_replace, allowed_sort_attrs)

can_read_all_dags = get_auth_manager().is_authorized_dag(method="GET")

if not can_read_all_dags:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to keep something like this unfortunately - otherwise the performance of the query plummets if there are a large number of dags in the deployment.

(We didn't originally have this in the first few releases of per dag access control, and had to add it)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:'(

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. This could be mostly optimisation. I think it's a viable case to have a special case for "ALL" Dags. Simillarly in the future when we get "multi-team" support, we will have to introduce "ALL DAGS FOR TEAM" as a single value - we do not want to keep list of dag_ids that the user can have access to.

In fact that brings me to another case. In case of non-FAB we do not have the notion of allowing access to particular DAGs at all. This is all driven by "whatever the auth manager returns". So we could introduce it as explicit API of auth manager:

  • get_permitted_dags = "List of dags I have access to" | ALL | NONE | TEAM (TEAM in the future). That will automatically allow Auth Manager to handle it in the right "optimized" way

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is fair. I could do that. That way we would use get_permitted_dags to figure whether a user has access to all DAGs and we could use is_authorized_dag("GET") to check whether a user is allowed to list DAGs

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated the PR to implement this proposal. Could you take a look please?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Friendly reminder :) Is this the direction we want to go?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Works for me.

@vincbeck vincbeck force-pushed the vincbeck/auth_manager_clean branch 4 times, most recently from 5f895a8 to b26b4ea Compare October 8, 2024 14:21
@vincbeck vincbeck added the legacy api Whether legacy API changes should be allowed in PR label Oct 8, 2024
@vincbeck vincbeck closed this Oct 8, 2024
@vincbeck vincbeck reopened this Oct 8, 2024
@vincbeck vincbeck force-pushed the vincbeck/auth_manager_clean branch 5 times, most recently from 764919e to 460ea9b Compare October 8, 2024 22:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:API Airflow's REST/HTTP API area:providers area:webserver Webserver related Issues legacy api Whether legacy API changes should be allowed in PR provider:fab
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants