Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Why not use @tanstack/query for the react hooks? #46

Open
Sheraff opened this issue Dec 10, 2023 · 11 comments
Open

Why not use @tanstack/query for the react hooks? #46

Sheraff opened this issue Dec 10, 2023 · 11 comments

Comments

@Sheraff
Copy link

Sheraff commented Dec 10, 2023

I'm working on making a "starter template" for offline-first PWAs and I was wondering if there was a reason to have gone full-custom for the query hooks? The @tanstack/query library is pretty mature and nice to use.

If I were to give the sales pitch, here's what I could say

  • it's framework-agnostic™ (I never tried anything other than the react implementation)
  • it's well typed
  • it's well known, people are likely to be already familiar with it
  • it provides a global cache manager for de-duping simultaneous queries / invalidating
  • it does a deep merge of its results to minimize re-renders
  • it provides a couple utilities (focus manager, online manager) that can be useful for "offline-conscious" apps
  • it has hooks more patterns (mutations, infinite / paginated queries, react suspense)
  • it has devtools
  • it has good documentation

On the other hand

  • it's quite a good amount of code (13.7kB gzipped), though for an app that already loads 2Mb of wasm it should be ok
  • I'm not exactly sure if it's as fast as your current implementation

I tried a quick implementation of useQuery, here's what that could look like:

import { useQuery, useQueryClient, hashKey } from "@tanstack/react-query"
import { useDB, type CtxAsync } from "@vlcn.io/react"
import { useEffect, useState, useRef } from "react"

type DBAsync = CtxAsync["db"]

type UpdateType =
	/** INSERT */
	| 18
	/** UPDATE */
	| 23
	/** DELETE */
	| 9

/**
 * Not really useful, this is just to increase the cache hit rate.
 */
function sqlToKey(sql: string) {
	return sql.replace(/\s+/g, " ")
}

/**
 * Rely on react-query's cacheManager to
 * - know which queries are active
 * - force invalidation of "currently in-use" queries
 *
 * Rely on vlcn RX to
 * - know which tables are used by a query
 * - know when to invalidate queries
 */
export function useCacheManager(dbName: string) {
	const [tables, setTables] = useState<string[]>([])
	const queryMap = useRef(
		new Map<
			string,
			{
				tables: string[]
				updateTypes: UpdateType[]
				queryKey: unknown[]
			}
		>(),
	)

	const client = useQueryClient()
	const ctx = useDB(dbName)

	useEffect(() => {
		/** not the cleanest implementation, could execute less code if it was outside of react */
		const cleanup = tables.map((table) =>
			ctx.rx.onRange([table], (updates) => {
				queryMap.current.forEach((query) => {
					if (!query.tables.includes(table)) return
					if (!updates.some((u) => query.updateTypes.includes(u))) return
					client.invalidateQueries({ queryKey: query.queryKey, exact: true })
				})
			}),
		)
		return () => {
			for (const dispose of cleanup) {
				dispose()
			}
		}
	}, [tables])

	useEffect(() => {
		const cacheManager = client.getQueryCache()
		/** count how many queries are relying on each table */
		const tableCountMap = new Map<string, number>()

		const cacheUnsubscribe = cacheManager.subscribe((event) => {
			if (event.type === "observerAdded") {
				const key = hashKey(event.query.queryKey)
				if (queryMap.current.has(key)) return
				/** add to Map early, so we know if it has been deleted by `observerRemoved` before `usedTables` resolves */
				queryMap.current.set(key, {
					updateTypes: event.query.queryKey[2],
					queryKey: event.query.queryKey,
					tables: [],
				})
				usedTables(ctx.db, event.query.queryKey[1]).then((usedTables) => {
					const query = queryMap.current.get(key)
					if (!query) return
					queryMap.current.set(key, { ...query, tables: usedTables })
					for (const table of usedTables) {
						if (!tableCountMap.has(table)) {
							tableCountMap.set(table, 1)
							setTables((tables) => [...tables, table])
						} else {
							tableCountMap.set(table, tableCountMap.get(table)! + 1)
						}
					}
				})
			} else if (event.type === "observerRemoved") {
				const key = hashKey(event.query.queryKey)
				const query = queryMap.current.get(key)
				if (!query) return
				queryMap.current.delete(key)
				for (const table of query.tables) {
					if (!tableCountMap.has(table)) continue
					const count = tableCountMap.get(table)!
					if (count === 1) {
						tableCountMap.delete(table)
						setTables((tables) => tables.filter((t) => t !== table))
					} else {
						tableCountMap.set(table, count - 1)
					}
				}
			}
		})
		return cacheUnsubscribe
	}, [])
}

let queryId = 0

export function useDbQuery<
	TQueryFnData = unknown,
	// TError = DefaultError, // TODO
	TData = TQueryFnData[],
>({
	dbName,
	query,
	select,
	bindings = [],
	updateTypes = [18, 23, 9],
}: {
	dbName: string
	query: string
	select?: (data: TQueryFnData[]) => TData
	bindings?: ReadonlyArray<string>
	updateTypes?: Array<UpdateType>
}) {
	const ctx = useDB(dbName)
	const queryKey = [dbName, sqlToKey(query), updateTypes, bindings]

	return useQuery({
		queryKey,
		queryFn: async () => {
			const statement = await ctx.db.prepare(query)
			statement.bind(bindings)
			const [releaser, transaction] = await ctx.db.imperativeTx()
			const transactionId = queryId++
			transaction.exec(/*sql*/ `SAVEPOINT use_query_${transactionId};`)
			statement.raw(false)
			try {
				const data = (await statement.all(transaction)) as TQueryFnData[]
				transaction.exec(/*sql*/ `RELEASE use_query_${transactionId};`).then(releaser, releaser)
				return data
			} catch (e) {
				transaction.exec(/*sql*/ `ROLLBACK TO use_query_${transactionId};`).then(releaser, releaser)
				throw e
			}
		},
		select,
	})
}

async function usedTables(db: DBAsync, query: string): Promise<string[]> {
	const sanitized = query.replaceAll("'", "''")
	const rows = await db.execA(/*sql*/ `
		SELECT tbl_name
		FROM tables_used('${sanitized}') AS u
		JOIN sqlite_master ON sqlite_master.name = u.name
		WHERE u.schema = 'main';
	`)
	return rows.map((r) => r[0])
}

If you're curious, here's the "starter template" I'm talking about https://github.com/Sheraff/root

@tantaman
Copy link
Contributor

tantaman commented Dec 11, 2023

I had thought about it but never took the time to look into it.

The one thing I was worried about was that it looks like @tanstack/query requires manual invalidations of queries.

Looks like you solved that problem in useCacheManager though :)

Some ways you could improve your current implementation:

  • db.prepare returns a prepared statement that you must finalize or else you'll leak memory. I don't see where you finalize it currently.
  • db.prepare is actually somewhat expensive and can take 50-75% of a query's total time. You should re-use prepared statements. Your queryFn re-prepares them each time.
  • usedTables exists as a prepared statement on the db class now -- https://github.com/vlcn-io/js/blob/main/packages/crsqlite-wasm/src/DB.ts#L67-L73 so you can just grab a reference to that rather than preparing it youself. It is a rather expensive query in terms of time it takes to prepare. Quick to run after that.

Something that might make all this easier for you is that I've been looking into automatically caching prepared statements in the DB class itself so users don't have to worry about it. This is a common thing done in many language bindings for SQLite.

it's quite a good amount of code (13.7kB gzipped), though for an app that already loads 2Mb of wasm it should be ok

WASM size and JS size are not exactly equivalent. WASM, since it is already byte code and supports streaming compilation, can actually start executing as soon as the final byte finishes downloading. JS, on the other hand, takes quite a while for low-end devices to parse, compile and start.

E.g., 1MB of JS can take 2 seconds to compile and start after it is downloaded. 1MB of WASM starts as soon as download completes.

Old but useful article: https://medium.com/reloading/javascript-start-up-performance-69200f43b201

I wonder if @tanstack/query is actually that large after tree-shaking?

@Sheraff
Copy link
Author

Sheraff commented Dec 11, 2023

Thanks for your response

db.prepare returns a prepared statement that you must finalize or else you'll leak memory.

good catch, will fix this

You should re-use prepared statements. Your queryFn re-prepares them each time.

the queryFn is only re-executed if the query needs to re-run, so at least it's not on every render. But fair point. An in-memory cache would be quite easy to make, and maybe I can listen to react-query's cache manager to purge the statement when the data itself is purged from the cache (this duration is configurable when instantiating react-query, and it's only purged if the query isn't currently in use).

usedTables exists as a prepared statement on the db class now

Yoooo that's dope! I'm still on v0.15 for now, wasn't sure whether 0.16 was stable.

I wonder if @tanstack/query is actually that large after tree-shaking?

I'm not sure I'm testing this right, but I just tried running a Vite build on my repo (the only thing I do with react-query is what is pasted above) with the rollup-plugin-visualizer plugin, and the output is about the same as what is reported on bundlephobia. So that "13.7kB gzipped" is probably about right.

Screenshot 2023-12-11 at 15 58 13

@tantaman
Copy link
Contributor

The SQLite team is considering pushing the statement cache down into SQLite itself -- https://sqlite.org/forum/forumpost/d5939a8557a7a85f

I really hope they do this.

wasn't sure whether 0.16 was stable.

it is as of 0.16.2-next. I need to update some docs before making it the official release.

Notable changes are:

  • Primary key columns must be tagged as NOT NULL
  • site_id is now required so fetching local changes requires a comparison on site_id != crsql_site_id() rather than site_id IS NOT NULL

@Sheraff
Copy link
Author

Sheraff commented Dec 12, 2023

Hey I was trying to make a more production-ready version of this "react-query adapter" above, and I had a question : is there a cost to keep listening for changes (rx.onRange([table], ...) on a table that is not changing? Or is it just the slight memory overhead (listener closure) + an entry in a hashmap somewhere (listener registry)?

@tantaman
Copy link
Contributor

tantaman commented Dec 12, 2023

Minimal overhead. Just the extra memory as you mention.

If all listeners were removed entirely there's potentially a speedup since we can reduce the number of times we have to cross the WASM <-> JS bridge which can be expensive.

To that end, it'd be an improvement if we gathered updates in WASM and only called into JS once on commit of the transaction. Related: #48

@kimmobrunfeldt
Copy link

@Sheraff I was setting up dependencies for a project and also ended up with using @tanstack/react-query setup.

I didn't yet understand why the hook needs to be so complex? I'm quite sure I'm missing something. For example the comment lists how to know which queries are active and how to invalidate those, but why is this required?

/**
 * Rely on react-query's cacheManager to
 * - know which queries are active
 * - force invalidation of "currently in-use" queries
 *
 * Rely on vlcn RX to
 * - know which tables are used by a query
 * - know when to invalidate queries
 */

My first (probably naive) thought was to simply do:

import { useQuery } from '@tanstack/react-query'
import { useDB } from '@vlcn.io/react'
import sql from 'sql-template-tag'

export function Component() {
  const ctx = useDB('dbName')

  const threshold = 10
  const sqlQuery = sql`SELECT * FROM my_table WHERE some_value > ${threshold}`
  const {
    data: rows,
    isLoading,
    isError,
  } = useQuery({
    queryKey: ['sqlQuery', sqlQuery.sql, sqlQuery.values],
    queryFn: async () => {
      // Not sure how to represent ctx as a dependency for the query
      const rows = await ctx.db.execO<{ id: string; name: string }>(
        sqlQuery.sql,
        sqlQuery.values
      )
      return rows
    },
  })

  if (isLoading) return <div>Loading...</div>
  if (isError) return <div>Error</div>

  if (!rows) throw new Error('No data') // should not happen
  return (
    <ul>
      {rows.map((row) => (
        <li key={row.id}>{row.name}</li>
      ))}
    </ul>
  )
}

@Sheraff
Copy link
Author

Sheraff commented Apr 12, 2024

@kimmobrunfeldt 99% of the complexity is for reactive queries. What I wanted to make (and what vlcn.io/react provides out of the box) is queries that will re-render the component when their value changes — either from a local mutation, or from a mutation that happened on a remote copy of the DB and that was synced through the CRDT mechanism.

What you wrote here is completely fine, but there is no reactivity, and if you want the query to re-execute and re-render, you'll have to explicitly call queryClient.incalidateQueries({ queryKey: ['sqlQuery', sqlQuery.sql, sqlQuery.values] }) with the proper SQL query, at the proper time.


Also, even though it's not really the subject, to answer your remark "Not sure how to represent ctx as a dependency for the query": you can add ctx.db.db to your query key, AFAIK it's a unique number representing your DB.

@kimmobrunfeldt
Copy link

@Sheraff thanks for the quick reply. Got it, now everything makes sense 👍

Also, even though it's not really the subject, to answer your remark "Not sure how to represent ctx as a dependency for the query": you can add ctx.db.db to your query key, AFAIK it's a unique number representing your DB.

👏 thank you!

@kimmobrunfeldt
Copy link

By ctx.db.db I assume you meant ctx.db.siteid

@Sheraff
Copy link
Author

Sheraff commented Apr 13, 2024

By ctx.db.db I assume you meant ctx.db.siteid

I meant ctx.db.db but maybe siteid does the job just as well.

@kimmobrunfeldt
Copy link

Ok right, the .db wasn't exposed via types

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants