Skip to content

Internals: backtracking

David Tolnay edited this page Jul 13, 2015 · 8 revisions

The jq language is a generator/backtracking-type functional language. Expressions produce zero, one, or more results. Producing zero results is the same backtracking, and as a result the empty builtin triggers backtracking.

When backtracking, the VM unwinds the stack, freeing garbage as it goes, up to a saved point (a "fork point"), restarting at that point. The resumed instruction knows that a backtrack has occurred and can do a variety of things as a result, such as: produce the next result in some low-level operation (e.g., EACH, for .[], INDEX for .[expr]). Many built-in generators work by catching backtracking and producing the next value until there are no more values to produce, in which case they backtrack.

Fundamental generators in jq:

  • the comma operator (FORK opcode)
  • .[] (iterate values of .) (EACH opcode)
  • range(init;upto) (RANGE opcode)
  • the path() builtin (PATH_BEGIN and PATH_END opcodes)
  • the last expression, the one that outputs values to the calling C program (RET opcode)

Generators can be implemented in terms of existing jq language functionality. For example, the new range(init;upto;by) builtin is a jq-coded function, using only:

  • the comma operator for generation
  • tail-call optimized recursion for iteration
  • sub-functions (yes, jq functions can define sub-functions!)
Clone this wiki locally