Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

resetStore and refetchQueries can cause to refetch unmounted/inactive queries in StrictMode #11914

Open
Cellule opened this issue Jun 28, 2024 · 18 comments · May be fixed by #11925
Open

resetStore and refetchQueries can cause to refetch unmounted/inactive queries in StrictMode #11914

Cellule opened this issue Jun 28, 2024 · 18 comments · May be fixed by #11925
Labels

Comments

@Cellule
Copy link

Cellule commented Jun 28, 2024

Issue Description

We recently upgraded to version 3
Right now we use resetStore for a particular scenario.
All graphql requests sent have an "organizationId" header associated.
Depending on that "organizationId" the user will see different content.
When the user changes to another organization we navigate to an almost empty page for the transition (to unmount all organization specific queries)
Then we call resetStore to refetch top-level queries that "might" be affected by the change of context.
Once everything resolves, we navigate back to the normal app.

This used to work fine for the last 5 years, then we updated to apollo client version 3.

After some digging we found out that after calling resetStore some queries that are unmounted (confirmed 100%) end up being refetched.
I debugged through the resetStore method, but I found no calls to refetch those queries.
So right now I have no idea what could cause those unmounted queries to be requested once again.

The biggest problem I have right now is that because those queries are called from the wrong context, it causes another mechanism to automatically redirect to the previous organization (so direct links work in any organization)
I have a lot of fail safe in place and it's fine 90% of the time, but when the network is slower those fail safe are not enough and it leads to the user no being able to change organization.

Any guidance on what could cause an unmounted query to hit the network once more after calling resetStore would be appreciated
Or tips on how to find out

Link to Reproduction

https://codesandbox.io/p/devbox/jolly-wilbur-n7m7jz?layout=%257B%2522sidebarPanel%2522%253A%2522EXPLORER%2522%252C%2522rootPanelGroup%2522%253A%257B%2522direction%2522%253A%2522horizontal%2522%252C%2522contentType%2522%253A%2522UNKNOWN%2522%252C%2522type%2522%253A%2522PANEL_GROUP%2522%252C%2522id%2522%253A%2522ROOT_LAYOUT%2522%252C%2522panels%2522%253A%255B%257B%2522type%2522%253A%2522PANEL_GROUP%2522%252C%2522contentType%2522%253A%2522UNKNOWN%2522%252C%2522direction%2522%253A%2522vertical%2522%252C%2522id%2522%253A%2522clwtppgmj00073p6i6i7ie1cc%2522%252C%2522sizes%2522%253A%255B70%252C30%255D%252C%2522panels%2522%253A%255B%257B%2522type%2522%253A%2522PANEL_GROUP%2522%252C%2522contentType%2522%253A%2522EDITOR%2522%252C%2522direction%2522%253A%2522horizontal%2522%252C%2522id%2522%253A%2522EDITOR%2522%252C%2522panels%2522%253A%255B%257B%2522type%2522%253A%2522PANEL%2522%252C%2522contentType%2522%253A%2522EDITOR%2522%252C%2522id%2522%253A%2522clwtppgmj00023p6iumaf2w8i%2522%257D%255D%257D%252C%257B%2522type%2522%253A%2522PANEL_GROUP%2522%252C%2522contentType%2522%253A%2522SHELLS%2522%252C%2522direction%2522%253A%2522horizontal%2522%252C%2522id%2522%253A%2522SHELLS%2522%252C%2522panels%2522%253A%255B%257B%2522type%2522%253A%2522PANEL%2522%252C%2522contentType%2522%253A%2522SHELLS%2522%252C%2522id%2522%253A%2522clwtppgmj00043p6igepxit5p%2522%257D%255D%252C%2522sizes%2522%253A%255B100%255D%257D%255D%257D%252C%257B%2522type%2522%253A%2522PANEL_GROUP%2522%252C%2522contentType%2522%253A%2522DEVTOOLS%2522%252C%2522direction%2522%253A%2522vertical%2522%252C%2522id%2522%253A%2522DEVTOOLS%2522%252C%2522panels%2522%253A%255B%257B%2522type%2522%253A%2522PANEL%2522%252C%2522contentType%2522%253A%2522DEVTOOLS%2522%252C%2522id%2522%253A%2522clwtppgmj00063p6ixzewo28y%2522%257D%255D%252C%2522sizes%2522%253A%255B100%255D%257D%255D%252C%2522sizes%2522%253A%255B69.49598743432026%252C30.504012565679744%255D%257D%252C%2522tabbedPanels%2522%253A%257B%2522clwtppgmj00023p6iumaf2w8i%2522%253A%257B%2522id%2522%253A%2522clwtppgmj00023p6iumaf2w8i%2522%252C%2522activeTabId%2522%253A%2522clwtpsi76003n3p6i7vuk56vl%2522%252C%2522tabs%2522%253A%255B%257B%2522id%2522%253A%2522clwtppgmj00013p6ipicylgfm%2522%252C%2522mode%2522%253A%2522permanent%2522%252C%2522type%2522%253A%2522FILE%2522%252C%2522filepath%2522%253A%2522%252FREADME.md%2522%252C%2522state%2522%253A%2522IDLE%2522%257D%252C%257B%2522id%2522%253A%2522clwtpsi76003n3p6i7vuk56vl%2522%252C%2522mode%2522%253A%2522permanent%2522%252C%2522type%2522%253A%2522FILE%2522%252C%2522initialSelections%2522%253A%255B%257B%2522startLineNumber%2522%253A25%252C%2522startColumn%2522%253A27%252C%2522endLineNumber%2522%253A25%252C%2522endColumn%2522%253A27%257D%255D%252C%2522filepath%2522%253A%2522%252Fsrc%252Flink.js%2522%252C%2522state%2522%253A%2522IDLE%2522%257D%252C%257B%2522type%2522%253A%2522FILE%2522%252C%2522filepath%2522%253A%2522%252F.codesandbox%252Ftasks.json%2522%252C%2522id%2522%253A%2522clwumzut3002e3p6gxrqqie21%2522%252C%2522mode%2522%253A%2522permanent%2522%252C%2522state%2522%253A%2522IDLE%2522%257D%255D%257D%252C%2522clwtppgmj00063p6ixzewo28y%2522%253A%257B%2522id%2522%253A%2522clwtppgmj00063p6ixzewo28y%2522%252C%2522activeTabId%2522%253A%2522clxyzoh8p000l3j6jh401n3i3%2522%252C%2522tabs%2522%253A%255B%257B%2522id%2522%253A%2522clwtppgmj00053p6iqkbdkq6f%2522%252C%2522mode%2522%253A%2522permanent%2522%252C%2522type%2522%253A%2522TASK_PORT%2522%252C%2522taskId%2522%253A%2522start%2522%252C%2522port%2522%253A3000%252C%2522path%2522%253A%2522%252F%2522%257D%252C%257B%2522type%2522%253A%2522SETUP_TASKS%2522%252C%2522id%2522%253A%2522clxyzmn6r000o3j6jbsttjmyy%2522%252C%2522mode%2522%253A%2522permanent%2522%257D%252C%257B%2522type%2522%253A%2522UNASSIGNED_PORT%2522%252C%2522port%2522%253A2222%252C%2522id%2522%253A%2522clxyzn4b5001p3j6jw7ckv2w6%2522%252C%2522mode%2522%253A%2522permanent%2522%257D%252C%257B%2522type%2522%253A%2522TASK_PORT%2522%252C%2522port%2522%253A3000%252C%2522taskId%2522%253A%2522start%2522%252C%2522id%2522%253A%2522clxyzoh8p000l3j6jh401n3i3%2522%252C%2522mode%2522%253A%2522permanent%2522%257D%255D%257D%252C%2522clwtppgmj00043p6igepxit5p%2522%253A%257B%2522id%2522%253A%2522clwtppgmj00043p6igepxit5p%2522%252C%2522activeTabId%2522%253A%2522clwtppgmj00033p6iw9cgtvcz%2522%252C%2522tabs%2522%253A%255B%257B%2522id%2522%253A%2522clwtppgmj00033p6iw9cgtvcz%2522%252C%2522mode%2522%253A%2522permanent%2522%252C%2522type%2522%253A%2522TASK_LOG%2522%252C%2522taskId%2522%253A%2522start%2522%257D%252C%257B%2522id%2522%253A%2522clwtqf2io00a13p6m5m1c3s1n%2522%252C%2522mode%2522%253A%2522permanent%2522%252C%2522type%2522%253A%2522TERMINAL%2522%252C%2522shellId%2522%253A%2522clwtqf2lx0074dke2ga51ahaj%2522%257D%255D%257D%257D%252C%2522showDevtools%2522%253Atrue%252C%2522showShells%2522%253Atrue%252C%2522showSidebar%2522%253Atrue%252C%2522sidebarPanelSize%2522%253A15%257D

Reproduction Steps

  • Go to "Home"
  • Go to "Home2"
  • Click "Reset Store" button
  • You'll notice the query "AllPeople" (only mounted in Home not Home2) will be sent when it should not

@apollo/client version

3.10.4

@Cellule
Copy link
Author

Cellule commented Jun 28, 2024

Here are my findings so far (no repro still -_-)

I have a query on some page and it is set as "dirty" at some point

if (!this.dirty && !equal(oldDiff && oldDiff.result, diff && diff.result)) {
this.dirty = true;
if (!this.notifyTimeout) {
this.notifyTimeout = setTimeout(() => this.notify(), 0);
}
}

Then the query is unmounted, but remains "dirty"

Later we call resetStore which calls reFetchObservableQueries

public resetStore(): Promise<ApolloQueryResult<any>[] | null> {
return Promise.resolve()
.then(() =>
this.queryManager.clearStore({
discardWatches: false,
})
)
.then(() => Promise.all(this.resetStoreCallbacks.map((fn) => fn())))
.then(() => this.reFetchObservableQueries());
}

In reFetchObservableQueries we call broadcastQueries

this.broadcastQueries();

In broadcastQueries we call notify for ALL queries even inactive ones

public broadcastQueries() {
if (this.onBroadcast) this.onBroadcast();
this.queries.forEach((info) => info.notify());
}

In notify for my problematic query, because it is "dirty" we call the listener which ends up calling reobserverCacheFirst on that query causing to send it to the server

this.listeners.add(
(this.oqListener = () => {
const diff = this.getDiff();
if (diff.fromOptimisticTransaction) {
// If this diff came from an optimistic transaction, deliver the
// current cache data to the ObservableQuery, but don't perform a
// reobservation, since oq.reobserveCacheFirst might make a network
// request, and we never want to trigger network requests in the
// middle of optimistic updates.
oq["observe"]();
} else {
// Otherwise, make the ObservableQuery "reobserve" the latest data
// using a temporary fetch policy of "cache-first", so complete cache
// results have a chance to be delivered without triggering additional
// network requests, even when options.fetchPolicy is "network-only"
// or "cache-and-network". All other fetch policies are preserved by
// this method, and are handled by calling oq.reobserve(). If this
// reobservation is spurious, isDifferentFromLastResult still has a
// chance to catch it before delivery to ObservableQuery subscribers.
reobserveCacheFirst(oq);
}
})

So it seems like at some point we should ignore "inactive" queries, aka observableQuery.hasObservers() is false

I think the right place to do this check would be in QueryInfo.shouldNotify

private shouldNotify() {
if (!this.dirty || !this.listeners.size) {
return false;
}

Otherwise, maybe we should NOT call broadcastQueries in the first place from within reFecthObservableQueries ?

@Cellule
Copy link
Author

Cellule commented Jun 28, 2024

Alright after going through this whole effort, I managed to make my minimal repro work!!

@jerelmiller
Copy link
Member

@Cellule thanks for the bug report!

The reproduction is super helpful! I've cloned your reproduction and started looking into this to understand myself. Its close to the weekend and I'm not sure I'll be able to finish until early next week, but thanks so much for your thorough analysis. This will definitely save me a ton of time. I'll get back to you as soon as I can!

@jerelmiller
Copy link
Member

@Cellule looking at this further, your diagnosis seems correct. I had originally assumed there was a bug holding onto the ObservableQuery in the this.queries map inside QueryManager, but that doesn't appear to be the case (hence my assumption in #11894 (comment)).

I can now see that the only reason you're getting the refetch is because the ObservableQuery instance from the first page hasn't yet been garbage collected, so the notify call that you pointed out in your deep dive is triggering the refetch.

I've updated your reproduction to show that once ObservableQuery is garbage collected, we're able to see only a single network request (try clicking the "reset store" button after you see the log message about it getting garbage collected): https://codesandbox.io/p/devbox/fervent-scooby-c5xwk3. This at least confirms that we aren't holding onto that ObservableQuery instance somewhere else in the client 🎉.

I've discussed this with the team to make sure adding a check for hasObservers wouldn't have any other adverse effects, and we couldn't think of any. I think your suggested fix makes sense. I'd like to dig into the proper place for this, but otherwise I think we should move forward with this fix. If you're interested in submitting a PR, I'd be happy to review, otherwise I'll try and get something up within the next couple days.

Thanks again for your thorough deep dive through the code and the time getting a reproduction! This made my life so much easier ❤️.

@Cellule
Copy link
Author

Cellule commented Jul 2, 2024

@jerelmiller Great to hear!
I was about to open a PR then figured I should write a test for this.
However, I can't figure out how to reach the state where there's an inactive ObservableQuery in the QueryManager.
Actually, the more I review the code, the less I understand how it's possible to have an inactive query in the query manager 😅

@jerelmiller
Copy link
Member

Haha right?!?! Thats why I was so confused why you were able to reproduce this because technically it shouldn't be possible 😆.

From what I understand, its the fact that the ObservableQuery instance still exists in memory and hasn't been garbage collected yet, so its able to respond to updates when broadcastQueries is run. The best way I could think to try and approach a test would be:

If you want to test this at the React level with useQuery:

  • Render a component with one query using useQuery.
  • After the fetch is complete, unmount that component and mount another component with another useQuery (a ternary would probably work just fine here). Make sure these are different queries so that query deduplication doesn't kick in giving you a false positive.
  • Execute client.reFetchObservableQueries
  • assert that only one of the two queries is executed.

If you opt for this approach, we have a handy Profiler utility that makes it easier to assert render-by-render. Here is an example of its usage with useQuery:

it("does not return partial data unexpectedly when one query errors, then another succeeds with overlapping data", async () => {
interface Query1 {
person: {
__typename: "Person";
id: number;
firstName: string;
alwaysFails: boolean;
} | null;
}
interface Query2 {
person: { __typename: "Person"; id: number; lastName: string } | null;
}
interface Variables {
id: number;
}
const user = userEvent.setup();
const query1: TypedDocumentNode<Query1, Variables> = gql`
query PersonQuery1($id: ID!) {
person(id: $id) {
id
firstName
alwaysFails
}
}
`;
const query2: TypedDocumentNode<Query2, Variables> = gql`
query PersonQuery2($id: ID!) {
person(id: $id) {
id
lastName
}
}
`;
const Profiler = createProfiler({
initialSnapshot: {
useQueryResult: null as QueryResult<Query1, Variables> | null,
useLazyQueryResult: null as QueryResult<Query2, Variables> | null,
},
});
const client = new ApolloClient({
link: new MockLink([
{
request: { query: query1, variables: { id: 1 } },
result: {
data: { person: null },
errors: [new GraphQLError("Intentional error")],
},
maxUsageCount: Number.POSITIVE_INFINITY,
delay: 20,
},
{
request: { query: query2, variables: { id: 1 } },
result: {
data: { person: { __typename: "Person", id: 1, lastName: "Doe" } },
},
delay: 20,
},
]),
cache: new InMemoryCache(),
});
function App() {
const useQueryResult = useQuery(query1, {
variables: { id: 1 },
// This is necessary to reproduce the behavior
notifyOnNetworkStatusChange: true,
});
const [execute, useLazyQueryResult] = useLazyQuery(query2, {
variables: { id: 1 },
});
Profiler.replaceSnapshot({ useQueryResult, useLazyQueryResult });
return (
<>
<button onClick={() => execute()}>Run 2nd query</button>
<button
onClick={() => {
// Intentionally use reobserve here as opposed to refetch to
// ensure we check against reported cache results with cache-first
// and notifyOnNetworkStatusChange
useQueryResult.observable.reobserve();
}}
>
Reload 1st query
</button>
</>
);
}
render(<App />, {
wrapper: ({ children }) => (
<ApolloProvider client={client}>
<Profiler>{children}</Profiler>
</ApolloProvider>
),
});
{
const { snapshot } = await Profiler.takeRender();
expect(snapshot.useQueryResult).toMatchObject({
data: undefined,
loading: true,
networkStatus: NetworkStatus.loading,
});
expect(snapshot.useLazyQueryResult).toMatchObject({
called: false,
data: undefined,
loading: false,
networkStatus: NetworkStatus.ready,
});
}
{
const { snapshot } = await Profiler.takeRender();
expect(snapshot.useQueryResult).toMatchObject({
data: undefined,
error: new ApolloError({
graphQLErrors: [new GraphQLError("Intentional error")],
}),
loading: false,
networkStatus: NetworkStatus.error,
});
expect(snapshot.useLazyQueryResult).toMatchObject({
called: false,
data: undefined,
loading: false,
networkStatus: NetworkStatus.ready,
});
}
await act(() => user.click(screen.getByText("Run 2nd query")));
{
const { snapshot } = await Profiler.takeRender();
expect(snapshot.useQueryResult).toMatchObject({
data: undefined,
error: new ApolloError({
graphQLErrors: [new GraphQLError("Intentional error")],
}),
loading: false,
networkStatus: NetworkStatus.error,
});
expect(snapshot.useLazyQueryResult).toMatchObject({
called: true,
data: undefined,
loading: true,
networkStatus: NetworkStatus.loading,
});
}
{
const { snapshot } = await Profiler.takeRender();
expect(snapshot.useQueryResult).toMatchObject({
data: undefined,
error: new ApolloError({
graphQLErrors: [new GraphQLError("Intentional error")],
}),
loading: false,
networkStatus: NetworkStatus.error,
});
// ensure we aren't setting a value on the observable query that contains
// the partial result
expect(
snapshot.useQueryResult?.observable.getCurrentResult(false)
).toEqual({
data: undefined,
error: new ApolloError({
graphQLErrors: [new GraphQLError("Intentional error")],
}),
errors: [new GraphQLError("Intentional error")],
loading: false,
networkStatus: NetworkStatus.error,
partial: true,
});
expect(snapshot.useLazyQueryResult).toMatchObject({
called: true,
data: { person: { __typename: "Person", id: 1, lastName: "Doe" } },
loading: false,
networkStatus: NetworkStatus.ready,
});
}
await act(() => user.click(screen.getByText("Reload 1st query")));
{
const { snapshot } = await Profiler.takeRender();
expect(snapshot.useQueryResult).toMatchObject({
data: undefined,
loading: true,
networkStatus: NetworkStatus.loading,
});
expect(snapshot.useLazyQueryResult).toMatchObject({
called: true,
data: { person: { __typename: "Person", id: 1, lastName: "Doe" } },
loading: false,
networkStatus: NetworkStatus.ready,
});
}
{
const { snapshot } = await Profiler.takeRender();
expect(snapshot.useQueryResult).toMatchObject({
data: undefined,
loading: false,
error: new ApolloError({
graphQLErrors: [new GraphQLError("Intentional error")],
}),
networkStatus: NetworkStatus.error,
});
// ensure we aren't setting a value on the observable query that contains
// the partial result
expect(
snapshot.useQueryResult?.observable.getCurrentResult(false)
).toEqual({
data: undefined,
error: new ApolloError({
graphQLErrors: [new GraphQLError("Intentional error")],
}),
errors: [new GraphQLError("Intentional error")],
loading: false,
networkStatus: NetworkStatus.error,
partial: true,
});
expect(snapshot.useLazyQueryResult).toMatchObject({
called: true,
data: { person: { __typename: "Person", id: 1, lastName: "Doe" } },
loading: false,
networkStatus: NetworkStatus.ready,
});
}
await expect(Profiler).not.toRerender();
});

If you want to test this with core APIs:

  • Create an observable with client.watchQuery
  • Subscribe to it, let the fetch finish, then unsubscribe from it
  • Create a 2nd observable with client.watchQuery. Ensure its a different query so that query deduplication doesn't kick in giving you a false positive.
  • Subscribe to it, let the fetch finish
  • Execute client.reFetchObservableQueries
  • Assert that only one of the two queries is executed.

If you opt for this approach, we have a handy ObservableStream utility that captures each emitted result that you can assert on in a more synchronous way. Here is an example of its usage:

it("handles multiple calls to getCurrentResult without losing data", async () => {
const query = gql`
{
greeting {
message
... on Greeting @defer {
recipient {
name
}
}
}
}
`;
const link = new MockSubscriptionLink();
const client = new ApolloClient({
link,
cache: new InMemoryCache(),
});
const obs = client.watchQuery({ query });
const stream = new ObservableStream(obs);
link.simulateResult({
result: {
data: {
greeting: {
message: "Hello world",
__typename: "Greeting",
},
},
hasNext: true,
},
});
{
const result = await stream.takeNext();
expect(result.data).toEqual({
greeting: {
message: "Hello world",
__typename: "Greeting",
},
});
}
expect(obs.getCurrentResult().data).toEqual({
greeting: {
message: "Hello world",
__typename: "Greeting",
},
});
expect(obs.getCurrentResult().data).toEqual({
greeting: {
message: "Hello world",
__typename: "Greeting",
},
});
link.simulateResult(
{
result: {
incremental: [
{
data: {
recipient: {
name: "Alice",
__typename: "Person",
},
__typename: "Greeting",
},
path: ["greeting"],
},
],
hasNext: false,
},
},
true
);
{
const result = await stream.takeNext();
expect(result.data).toEqual({
greeting: {
message: "Hello world",
recipient: {
name: "Alice",
__typename: "Person",
},
__typename: "Greeting",
},
});
}
expect(obs.getCurrentResult().data).toEqual({
greeting: {
message: "Hello world",
recipient: {
name: "Alice",
__typename: "Person",
},
__typename: "Greeting",
},
});
expect(obs.getCurrentResult().data).toEqual({
greeting: {
message: "Hello world",
recipient: {
name: "Alice",
__typename: "Person",
},
__typename: "Greeting",
},
});
});

If you're struggling with either of these, let me know and I'd be happy to take a stab at creating a failing test for this to use as a base for implementing the change.

@Cellule
Copy link
Author

Cellule commented Jul 2, 2024

So I'm not 100% sure why, but my current investigation leads me to this bit of code where InternalState.useObservableQuery will call this.client.watchQuery twice for the same query (can be seen on page load of my repro).
The first one created seems to be abandoned, but remains in the list of query of the QueryManager.

private useObservableQuery() {
// See if there is an existing observable that was used to fetch the same
// data and if so, use it instead since it will contain the proper queryId
// to fetch the result set. This is used during SSR.
const obsQuery = (this.observable =
(this.renderPromises &&
this.renderPromises.getSSRObservable(this.watchQueryOptions)) ||
this.observable || // Reuse this.observable if possible (and not SSR)
this.client.watchQuery(this.getObsQueryOptions()));
// eslint-disable-next-line react-hooks/rules-of-hooks
this.obsQueryFields = React.useMemo(
() => ({
refetch: obsQuery.refetch.bind(obsQuery),
reobserve: obsQuery.reobserve.bind(obsQuery),
fetchMore: obsQuery.fetchMore.bind(obsQuery),
updateQuery: obsQuery.updateQuery.bind(obsQuery),
startPolling: obsQuery.startPolling.bind(obsQuery),
stopPolling: obsQuery.stopPolling.bind(obsQuery),
subscribeToMore: obsQuery.subscribeToMore.bind(obsQuery),
}),
[obsQuery]
);
const ssrAllowed = !(
this.queryHookOptions.ssr === false || this.queryHookOptions.skip
);
if (this.renderPromises && ssrAllowed) {
this.renderPromises.registerSSRObservable(obsQuery);
if (obsQuery.getCurrentResult().loading) {
// TODO: This is a legacy API which could probably be cleaned up
this.renderPromises.addObservableQueryPromise(obsQuery);
}
}
return obsQuery;
}

So in the end, it seems it's not an issue of query being cleaned up left in the QueryManager. It's a separate query that was never properly registered that is somehow in the list.
When resetStore is called, that "never alive" query is revived.
So maybe a proper fix would be to avoid this duplicate in the first place since I suspect there might be lots of "uninitialized" queries

@jerelmiller
Copy link
Member

jerelmiller commented Jul 2, 2024

That sounds very familiar to #11837 where I noticed something similar. Are you using React's strict mode by chance? I believe the the issue here was that the strict mode caused useState to run twice, which meant useInternalState was creating 2 instances of InternalState, which ran watchQuery twice:

let [state, updateState] = React.useState(createInternalState);

Take a look at the PR description in #11837. Perhaps you might be seeing something similar. Funny enough, the fix for that one was also doing a check for hasObservers 😆

@Cellule
Copy link
Author

Cellule commented Jul 4, 2024

It does look like StrictMode is the offender in the repro above where useRef in useInternalState is empty on both renders causing to create an ObservableQuery twice
It would point to bad cleanup in useInternalState.

However, I'm hitting this issue in production where StrictMode should do nothing.
Let me see if I can repro without StrictMode

Cellule added a commit to Cellule/apollo-client that referenced this issue Jul 4, 2024
@Cellule
Copy link
Author

Cellule commented Jul 4, 2024

My biggest fear is that there's a memory leak somehow.
I haven't been able to confirm it either in a repro nor in our production application, so I suppose that for now the initial patch should do the trick.
I'll try to monitor afterward in case memory leak occurs

As for testing, I managed to write a test that fails without the fix and passes with it, it's not the cleanest approach, but let's discuss this on the PR directly

@jerelmiller
Copy link
Member

Thanks for the PR! I'll take a look when I'm back on Monday. Have a great weekend!

Cellule added a commit to Cellule/apollo-client that referenced this issue Oct 9, 2024
Cellule added a commit to Cellule/apollo-client that referenced this issue Oct 9, 2024
@Cellule
Copy link
Author

Cellule commented Oct 11, 2024

@jerelmiller
Someone on my team, @lorenzopicoli , found another related issue.
In one of our mutations, we're using the refetchQueries option

const [mutation] = useMutation<ICreateAssetMutationResp, ICreateAssetMutationArgs>(
    createAssetMutation,
    {
      ignoreResults: true,
      refetchQueries: [
        // needed because a sub-asset can be created at any level in the sub-assets page
        // called if in /assets/:assetId/subs, to refresh the descendants count in treeView
        "AssetParentTreeView",
        // called if in /assets/:assetId, to refresh the descendants count in sub-asset tree-view
        "AssetDetailsQuery",
      ],
    }
)

I went down through the code and ended up in QueryManager.getObservableQueries
https://github.com/Cellule/apollo-client/blob/823b26e1073ae9c50d7c70078ea99e758975dab6/src/core/QueryManager.ts#L852-L900
It seems if include is an array of strings, it'll return queries matching the name even if they're not active.
I feel this is a mistake, but I can't 100% confirm. I'll add a proposed fix in my PR for you to take a look

@Cellule
Copy link
Author

Cellule commented Oct 11, 2024

I dug a bit more into it.
From what I can see here's what happen (exacerbated by StrictMode, but possible without)

  • useQuery is called
  • client.watchQuery is called creating an observableQuery
  • React.useEffect() (actually useSyncExternalStore) is called to begin subscribing to the subscription
  • Something causes the component to unmount immediately (like StrictMode).
    The useEffect to start the subscription is never called
    Apparently this is possible when mounting/unmounting component relatively fast
  • Another query mount and goes through the process normally
  • InMemoryCache calls broadcastWatch after getting results from the "second" query, this causes the first query to become dirty
  • And bad state reached

@Cellule
Copy link
Author

Cellule commented Oct 15, 2024

So clearly StrictMode is being a huge offender of this problem.
I can see that the useState initializer function runs twice in StrictMode

function createInternalState(previous?: InternalState<TData, TVariables>) {
verifyDocumentType(query, DocumentType.Query);
const internalState: InternalState<TData, TVariables> = {
client,
query,
observable:
// See if there is an existing observable that was used to fetch the same
// data and if so, use it instead since it will contain the proper queryId
// to fetch the result set. This is used during SSR.
(renderPromises &&
renderPromises.getSSRObservable(makeWatchQueryOptions())) ||
client.watchQuery(
getObsQueryOptions(void 0, client, options, makeWatchQueryOptions())
),
resultData: {
// Reuse previousData from previous InternalState (if any) to provide
// continuity of previousData even if/when the query or client changes.
previousData: previous?.resultData.current?.data,
},
};
return internalState as InternalState<TData, TVariables>;
}
let [internalState, updateInternalState] =
React.useState(createInternalState);

https://react.dev/reference/react/useState#caveats
Since there's no proper "cleanup" for the observable that is created in useState, we always end up with 2 observable for every query.
According to this
facebook/react#20090 (comment)
Creating a watcher/listener in useState is an anti-pattern and that's why it causes such problems in StrictMode

I'll try to come up with a better repro without StrictMode, but I do believe we should figure out and fix that issue in StrictMode as well to respect the rules of React

@jerelmiller
Copy link
Member

@Cellule thanks for that confirmation. I suspected strict mode was the cause of at least one of the cases and agree we should try and do a better job here to avoid he double initialization. Do let me know if you're able to reproduce it without strict mode though. I'd be curious how that is allowed to happen, but my guess is the underlying cause is still very similar (e.g. multiple ObservableQuery instances are created)

@Cellule
Copy link
Author

Cellule commented Oct 17, 2024

Now I'm starting to wonder if the main issue was fixed (aka dead queries in production)
When I initially encountered this issue we were on apollo-client 3.10.4, we are currently on 3.10.8 and I can't seem to repro
I reviewed the diff between those 2 versions and it does look like something changed around how useQuery handles its state
v3.10.4...v3.10.8
Or more specifically from this PR #11851

You probably have more context around that particular change, do you think this solved it ?
If yes, then maybe we just have to focus on fixing the useState(createInternalState) in StrictMode

@jerelmiller
Copy link
Member

@Cellule ok thats good news! That change was made to provide better compatibility with the new React Compiler. We were violating a few rules of React that would have made it difficult to integrate with it. Very possible the move away from the ref and into state helps here.

@Cellule Cellule changed the title resetStore can cause to refetch unmounted/inactive queries resetStore and refetchQueries can cause to refetch unmounted/inactive queries in StrictMode Oct 17, 2024
@Cellule
Copy link
Author

Cellule commented Oct 17, 2024

Great, I updated the title of the issue to better represent the issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants