Update INDEXING.md #1022

Merged
merged 1 commit into from
Oct 2, 2024
15 changes: 13 additions & 2 deletions docs/INDEXING.md
@@ -1,6 +1,17 @@
# Indexing
Tanagra can query the source data directly, but **for improved performance, Tanagra generates indexed tables and queries
them instead**. The indexer config specifies where Tanagra can write generated index tables.

## Need for Indexing

In most cases Tanagra can query the source data directly, but **for improved performance, Tanagra generates indexed
tables and queries them instead**. The indexer config specifies where Tanagra can write generated index tables.

However, there are a few scenarios where indexing is strictly required, such as calculating ancestors for
every item in hierarchies based on the parent-child input data. These steps use Dataflow because they
cannot be reasonably simplified to SQL. In most other cases indexing is done for performance reasons, though
some of those steps (e.g. calculating rollup counts) would be slow enough to be completely unusable without it.
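
To make the hierarchy steps concrete, here is a minimal Python sketch of what "calculating ancestors" and "rollup counts" mean for parent-child input data. This is not Tanagra's Dataflow implementation. The function names `ancestor_pairs` and `rollup_counts` and the sample data are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def ancestor_pairs(parent_child):
    """Return every (ancestor, descendant) pair implied by (parent, child) edges.

    Hypothetical helper: an in-memory transitive closure, for illustration only.
    """
    parents = defaultdict(set)
    for parent, child in parent_child:
        parents[child].add(parent)

    pairs = set()
    for node in parents:
        stack = list(parents[node])
        seen = set()
        while stack:
            ancestor = stack.pop()
            if ancestor in seen:
                continue  # already visited; also guards against cycles
            seen.add(ancestor)
            pairs.add((ancestor, node))
            # Walk further up; .get avoids adding keys while iterating.
            stack.extend(parents.get(ancestor, ()))
    return pairs


def rollup_counts(direct_counts, pairs):
    """Add each node's direct count to all of its ancestors."""
    totals = dict(direct_counts)
    for ancestor, descendant in pairs:
        totals[ancestor] = totals.get(ancestor, 0) + direct_counts.get(descendant, 0)
    return totals


# Assumed sample hierarchy: 1 -> {2, 3}, 2 -> {4}
edges = [(1, 2), (1, 3), (2, 4)]
pairs = ancestor_pairs(edges)
totals = rollup_counts({2: 5, 3: 1, 4: 2}, pairs)
```

The walk up the hierarchy has unbounded depth, which is why it is awkward to express as a single plain SQL query and is easier to precompute once at indexing time.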

Another consideration is that performance directly correlates to cost in many cases, either because indexing
simplifies queries or because it allows the BQ tables to be optimized (e.g. clustering on commonly used columns).

**Generating index tables is part of the deployment process**; it is not managed by the service. There is a basic
command line interface to run the indexing jobs. Currently, this CLI just uses Gradle's application plugin, so the