Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore(federation): borrow selection keys #6074

Merged
merged 18 commits into from
Nov 11, 2024

Conversation

goto-bus-stop
Copy link
Member

@goto-bus-stop goto-bus-stop commented Sep 27, 2024

This PR changes the implementation of SelectionMap so it does not have owned key data.

I'm opening this already so I can share it but I would like to improve the PR text a bit after our standup 😛 I don't think we'll have time to review this in the next days anyways!

Previously, to use SelectionMap, each selection type provided a key() -> SelectionKey method, returning some owned data, in practice two Arcs. Keys are copied frequently. However, key data must always already exist in the selection itself, so this is quite wasteful.

Selection keys now borrow from their Selection. They are Copy, so there is no overhead1 in creating or dropping them. They are meant to be created whenever they're necessary, not stored.

Because keys are now "free" to create, there is no need to cache them in Field, FragmentSpread, and InlineFragment structures. As a result, there is now also no need for the Field/FieldData split. The Field/FieldData data structures existed to disallow arbitrary mutable access to FieldData that may affect the cached key in the Field wrapper. This limitation is lifted, as keys are now only used by SelectionMap: you can freely mutate a Field or a FieldSelection, as long as it is not in a SelectionMap. SelectionMap only hands out mutable access through FieldSelectionValue, which enforces the mutability constraints.

I found this to improve performance across the board by 1-3%. That's not super impressive, but not bad either for something that affects every plan. I think this is also a stepping stone towards having mutable selections/selection sets so we won't need to copy so much.

Review guidance

There's a lot of code changes here. I tried to group the commit history into grokkable chunks. Note I didn't keep it compiling between commits necessarily 🙈

  • Change the SelectionMap implementation to not store SelectionKeys
    • This uses hashbrown's HashTable and a Vec instead of IndexMap. The implementation is similar to IndexMap, except that keys are derived as necessary.
    • The main work is in 42489d1
  • Change SelectionKey to be borrowed
    • Remove the iter() and iter_mut() iterators from SelectionMap: de5ca95
      • iter_mut is not possible to implement with borrowed keys because it would have to yield (SelectionKey<'a>, SelectionValue<'a>), i.e. an immutable and a mutable reference to the same selection. The methods are unnecessary as you can trivially derive the key from the value.
    • The actual migration: 6a8924d
    • There is an OwnedSelectionKey type that's used in selection set merging, the way that's written requires an owned key structure, that can be adjusted in the future but I decided to keep it like this also because we recently had to revert a change to this code
  • Combining the selection "Data" and wrapper structures:
    • Adjust InlineFragment::can_rebase_on and InlineFragmentData::can_rebase_on so the names won't collide after the merge 68f3ece
    • Combine Field and FieldData into a single type Field: 4e43243
    • InlineFragment: c2da90c
    • FragmentSpread: 1cf8157

Footnotes

  1. I mean, it's still some memory being moved around, but it's not very substantial.

Copy link
Contributor

@goto-bus-stop, please consider creating a changeset entry in /.changesets/. These instructions describe the process and tooling.

@router-perf
Copy link

router-perf bot commented Sep 27, 2024

CI performance tests

  • connectors-const - Connectors stress test that runs with a constant number of users
  • const - Basic stress test that runs with a constant number of users
  • demand-control-instrumented - A copy of the step test, but with demand control monitoring and metrics enabled
  • demand-control-uninstrumented - A copy of the step test, but with demand control monitoring enabled
  • enhanced-signature - Enhanced signature enabled
  • events - Stress test for events with a lot of users and deduplication ENABLED
  • events_big_cap_high_rate - Stress test for events with a lot of users, deduplication enabled and high rate event with a big queue capacity
  • events_big_cap_high_rate_callback - Stress test for events with a lot of users, deduplication enabled and high rate event with a big queue capacity using callback mode
  • events_callback - Stress test for events with a lot of users and deduplication ENABLED in callback mode
  • events_without_dedup - Stress test for events with a lot of users and deduplication DISABLED
  • events_without_dedup_callback - Stress test for events with a lot of users and deduplication DISABLED using callback mode
  • extended-reference-mode - Extended reference mode enabled
  • large-request - Stress test with a 1 MB request payload
  • no-tracing - Basic stress test, no tracing
  • reload - Reload test over a long period of time at a constant rate of users
  • step-jemalloc-tuning - Clone of the basic stress test for jemalloc tuning
  • step-local-metrics - Field stats that are generated from the router rather than FTV1
  • step-with-prometheus - A copy of the step test with the Prometheus metrics exporter enabled
  • step - Basic stress test that steps up the number of users over time
  • xlarge-request - Stress test with 10 MB request payload
  • xxlarge-request - Stress test with 100 MB request payload

}

impl HasSelectionKey for FragmentSpread {
fn key(&self) -> SelectionKey<'_> {
if is_deferred_selection(&self.directives) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a tradeoff right now: we call key() much more frequently, and for inline fragments and fragment spreads it has to search through the directive list. In many cases it'll be free (because the directive list is None), but in some cases it will have to walk the list and do a few string comparisons. It could potentially be optimized by storing a .has_defer boolean in DirectiveList, but I didn't see it on flame graphs yet, so that could be overengineering it.

@@ -206,7 +206,7 @@ impl SelectionSet {
));
};
fields
.entry(other_key)
.entry(other_key.to_owned_key())
Copy link
Member Author

@goto-bus-stop goto-bus-stop Sep 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not easy to rewrite to borrowed keys, because it creates intermediate maps grouping selections that have the same key. We can probably rewrite merging without the intermediate maps entirely later. Note the previous version was also cloning keys here, so it's not a regression.

Comment on lines +714 to +728
pub(crate) fn schema(&self) -> &ValidFederationSchema {
&self.schema
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something to consider: We could derive these via the derive_getters crate. This would as a single transitive dependency to the router (i.e. just the added crate).

apollo-federation/src/operation/selection_map.rs Outdated Show resolved Hide resolved
Comment on lines +390 to +403
/// Add selections from another selection map to this one. If there are key collisions, the
/// selections are *overwritten*.
pub(crate) fn extend(&mut self, other: SelectionMap) {
for selection in other.into_values() {
self.insert(selection);
}
}

/// Add selections from another selection map to this one. If there are key collisions, the
/// selections are *overwritten*.
pub(crate) fn extend_ref(&mut self, other: &SelectionMap) {
for selection in other.values() {
self.insert(selection.clone());
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason to have this over implementing the extend trait directly?

@goto-bus-stop
Copy link
Member Author

I'll update this to use hashbrown::HashTable over hashbrown::raw::RawTable. Drafting until then

@goto-bus-stop goto-bus-stop marked this pull request as draft October 2, 2024 14:47
@goto-bus-stop goto-bus-stop force-pushed the renee/selection-map-fleeting-key branch from d9e239f to 2613cd1 Compare October 14, 2024 13:48
@svc-apollo-docs
Copy link
Collaborator

svc-apollo-docs commented Oct 14, 2024

✅ Docs Preview Ready

No new or changed pages found.

@goto-bus-stop goto-bus-stop force-pushed the renee/selection-map-fleeting-key branch from 2613cd1 to 9c9702d Compare October 14, 2024 14:10
@goto-bus-stop
Copy link
Member Author

Rebased, tried to rewrite commit history to make a little more sense, will write a new PR description and fix linting before undrafting for review.

@goto-bus-stop goto-bus-stop force-pushed the renee/selection-map-fleeting-key branch from e7432d8 to 354beb2 Compare October 15, 2024 13:32
@goto-bus-stop goto-bus-stop marked this pull request as ready for review October 15, 2024 13:42
@dariuszkuc
Copy link
Member

Very nice cleanup!

Copy link
Member

@abernix abernix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

license addition LGTM.

@goto-bus-stop goto-bus-stop merged commit ffec3f9 into dev Nov 11, 2024
15 checks passed
@goto-bus-stop goto-bus-stop deleted the renee/selection-map-fleeting-key branch November 11, 2024 09:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants