Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CORE-137 Optimize filter resources #1577

Merged
merged 7 commits into from
Nov 8, 2024
Merged

CORE-137 Optimize filter resources #1577

merged 7 commits into from
Nov 8, 2024

Conversation

dvoet
Copy link
Collaborator

@dvoet dvoet commented Nov 2, 2024

Ticket: https://broadworkbench.atlassian.net/browse/CORE-137

What:

The query implemented in PostgresAccessPolicyDAO.filterPrivateResources has been highlighted by query insights as far and away the biggest load by time on the database. It is called on average 1.1 million times per day! It is the query behind the api to list user resources. Around the beginning of October, we also noticed in dev that this query was causing stability issues because the AoU user had access to too many google-project resources. Those were cleaned up but the vulnerability remains.

How:

There were 2 main observations:

  1. including actions associated to roles for each resource in the query results multiplied the number of rows returned, for the AoU case there were millions. But the role to action mapping is static from Sam config and is already loaded in memory. Moving all this extra data around is wasteful and is probably leading to excess garbage collection.
  2. removing the joins to figure out auth domain information sped up the AoU queries. Prior versions of this query loaded auth domains separately especially because not all resource types are affected by them. Revert to that.

Overall, the workhorse query is much smaller in terms of joins.


PR checklist

  • I've followed the instructions if I've made any changes to the API, especially if they're breaking changes
  • I've filled out the Security Risk Assessment (requires Broad Internal network access) and attached the result to the JIRA ticket

Comment on lines -1782 to +1781
private val publicResourcesCache: java.util.Map[ResourceTypeName, Seq[FilterResourcesResult]] =
Collections.synchronizedMap(new PassiveExpiringMap(1, TimeUnit.HOURS))
private val publicResourcesCache: ConcurrentMap[ResourceTypeName, Seq[FilterResourcesResult]] =
Caffeine.newBuilder().expireAfterWrite(1, TimeUnit.HOURS).build[ResourceTypeName, Seq[FilterResourcesResult]]().asMap()
Copy link
Collaborator Author

@dvoet dvoet Nov 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I switched to a concurrent hashmap implementation because synchronized maps do not allow concurrent reads. Even if the reads are fast, synchronized can lead to jam ups and interference between different resource types.

@dvoet dvoet changed the title Filter resources better CORE-137 Optimize filter resources Nov 5, 2024
@dvoet dvoet marked this pull request as ready for review November 5, 2024 16:07
@dvoet dvoet requested review from a team, davidangb and marctalbott and removed request for a team November 5, 2024 16:07
@davidangb
Copy link
Contributor

including actions associated to roles for each resource in the query results multiplied the number of rows …. But the role to action mapping is static from Sam config and is already loaded in memory.

If the role:action association exists both as static config and in the db, does this change lead to a possible inconsistency?

  1. A user is granted a role on some resource; the role's actions are persisted to the db for this user
  2. A DSP engineer modifies the role:action mapping in the static config
  3. This list-resources API now returns the updated actions for the role, but some other APIs still rely on the db for actions and return a different result

Copy link
Contributor

@davidangb davidangb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks ok to me. I have to admit I am too unfamiliar with it to be very confident. I get the high-level design change and that sounds good.

Does the contract test failure indicate a real change in functionality?

@@ -116,7 +116,10 @@ class ResourceService(
samRequestContext
)
}
} yield upsertedPolicy
} yield {
logger.info(s"Upserted policy $fullyQualifiedPolicyId")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this intentional or a leftover from development?

@dvoet
Copy link
Collaborator Author

dvoet commented Nov 5, 2024

If the role:action association exists both as static config and in the db, does this change lead to a possible inconsistency?

  1. A user is granted a role on some resource; the role's actions are persisted to the db for this user
  2. A DSP engineer modifies the role:action mapping in the static config
  3. This list-resources API now returns the updated actions for the role, but some other APIs still rely on the db for actions and return a different result

No, step 1 in your list is not how it works. The role is assigned to the user, not the role's actions. The role:action association stored in the database is only update on startup to match what is in config.

@davidangb
Copy link
Contributor

No, step 1 in your list is not how it works. The role is assigned to the user, not the role's actions. The role:action association stored in the database is only update on startup to match what is in config.

Got it - thank you!

@dvoet dvoet requested a review from a team as a code owner November 8, 2024 19:31
Copy link

sonarqubecloud bot commented Nov 8, 2024

@dvoet dvoet merged commit ab2829d into develop Nov 8, 2024
26 checks passed
@dvoet dvoet deleted the filterResources_better branch November 8, 2024 21:34
dvoet added a commit that referenced this pull request Nov 12, 2024
dvoet added a commit that referenced this pull request Nov 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants