X Experimental Benchmarks
This page describes some benchmarks and gives representative timings and "maxrss" (maximum resident set size) statistics.
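The page does not say how the maxrss figures were collected. One possible way to obtain comparable numbers, sketched here on the assumption that an external /usr/bin/time is installed, is to use its verbose mode: the BSD/macOS version accepts -l (and on macOS reports maxrss in bytes), while GNU time accepts -v and reports kilobytes. The function name maxrss_bytes is hypothetical and not part of the original benchmarks.

```bash
#!/usr/bin/env bash
# maxrss_bytes CMD [ARGS...]
# Hypothetical helper: run CMD and print its maximum resident set size.
# Assumes an external /usr/bin/time; the command's own stdout is discarded.
maxrss_bytes() {
  if /usr/bin/time -l true >/dev/null 2>&1; then
    # BSD/macOS time prints a line such as "  11341824  maximum resident set size"
    /usr/bin/time -l "$@" 2>&1 >/dev/null |
      awk '/maximum resident set size/ { printf "%.0f\n", $1 }'
  else
    # GNU time prints "Maximum resident set size (kbytes): 11076"
    /usr/bin/time -v "$@" 2>&1 >/dev/null |
      awk -F': ' '/Maximum resident set size/ { printf "%.0f\n", $2 * 1024 }'
  fi
}
```

For example, `maxrss_bytes jq length jeopardy.json` would print a single number for test (2). Figures gathered this way on different operating systems are only roughly comparable.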
Each "test" consists of a combination of a task, typically a jq program, and some input data (possibly null). The first test, however, uses the md5 program, both to record the md5 value of a particular JSON file and to provide a reference point for comparison.
Each combination of task and input data is assigned a number, given in the form (N); for example, the first "test" is:
(1) md5 jeopardy.json
This page is organized as follows:
- the SOURCES section has one subsection each for DATA and for PROGRAMS;
- the RESULTS section is organized into GROUPS so that the timings within each group are roughly comparable. Groups are identified by a string such as "Mac OS X (High Sierra) 3GHz 16GB RAM".
In the RESULTS section, the version of jq should be specified according to its tag, e.g. jq-1.5 or jq-1.6rc1.
Unless otherwise noted, the version of gojq used is 0.12.6.
SOURCES
DATA
"jeopardy.json" (aka JEOPARDY_QUESTIONS1.json) [54MB]
- https://gitlab.cern.ch/slac_sandbox/ubjson/-/raw/504d419d3e6a4ab87488fcc750bb79c6f5471491/benchmarks/files/jeopardy/jeopardy.json
- https://web.archive.org/web/20180222052746/https://raw.githubusercontent.com/alicemaz/super_jeopardy/master/JEOPARDY_QUESTIONS1.json
- https://drive.google.com/file/d/0BwT5wj_P7BKXb2hfM3d2RHU1ckE/
- gzipped: http://skeeto.s3.amazonaws.com/share/JEOPARDY_QUESTIONS1.json.gz
Description: https://www.reddit.com/r/datasets/comments/1uyd0t/200000_jeopardy_questions_in_a_json_file
"citylots.json" [181MB]
- https://raw.githubusercontent.com/zemirco/sf-city-lots-json/master/citylots.json
- https://drive.google.com/open?id=1dy6PNgsCBol5xBfLUpXXJ6sOnrUZ-QM_
Description: https://github.com/zemirco/sf-city-lots-json
PROGRAMS
"schema.jq"
- https://gist.github.com/pkoppstein/a5abb4ebef3b0f72a6ed (also available at archive.org)
See [*1] for alternative ways to include this module.
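"testzip.jq"
# Build an object by pairing each header with the corresponding element of the input array.
def zip(headers):
  headers as $headers
  | [$headers, .] | transpose | map({(.[0]): .[1]}) | add ;
# Zip a row of n integers with their string forms, giving an object with n keys.
def testzip(n):
  [range(0;n)] as $row
  | $row | zip( $row|map(tostring) ) ;
testzip(1000000) | length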
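To see what zip computes before it is scaled up to a million keys, it can be run on a two-element row. This is a small illustration, assuming jq is on the PATH; the names zip_def and zip_demo are hypothetical.

```bash
#!/usr/bin/env bash
# The zip definition from testzip.jq, kept in a shell variable (hypothetical name).
zip_def='def zip(headers):
  headers as $headers
  | [$headers, .] | transpose | map({(.[0]): .[1]}) | add;'

# Pair the headers ["a","b"] with the row [1,2].
zip_demo() {
  jq -nc "$zip_def"' [1,2] | zip(["a","b"])'
}

# Run the demo only if jq is installed.
if command -v jq >/dev/null 2>&1; then
  zip_demo   # prints {"a":1,"b":2}
fi
```

The benchmark applies the same pairing to a row of 1000000 integers, so the final length is the number of keys in the resulting object.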
RESULTS
(1) md5 jeopardy.json
MD5 (jeopardy.json) = 2075398fa049b1c00223b2279ca5281d
user 0m0.126s
sys 0m0.025s
maxrss 11341824
(2) length jeopardy.json
jq-1.5 length jeopardy.json
216930
user 0m1.144s
sys 0m0.112s
maxrss 223440896
(2 rq) length jeopardy.json
rq 'map(s)=>{s.length}' < jeopardy.json
216930
user 4.76s
sys 0.27s
maxrss 372486144
(2 gojq) length jeopardy.json
216930
real 0.90
user 0.89
sys 0.12
maxrss 234708992
(2 dasel) length jeopardy.json
dasel --length -f jeopardy.json
216930
user 1.04
sys 0.17
maxrss 317427712
(3) schema.jq jeopardy.json
jq-1.5 -L . --arg nullable true 'include "schema"; schema' jeopardy.json > jeopardy.schema.json
user 7.10s
sys 0.13s
maxrss 223383552
jq-1.6 -L . --arg nullable true 'include "schema"; schema' jeopardy.json > jeopardy.schema.json
user 8.94
sys 0.16
maxrss 223395840
gojq -L . --arg nullable true 'include "schema"; schema' jeopardy.json > jeopardy.schema.json
user 13.98
sys 0.57
maxrss 1193697280
(4) null testzip.jq
jq-1.5 -n -f testzip.jq
1000000
user 6.11s
sys 0.35s
maxrss 711286784
(5) . jeopardy.json
jq-1.5 . jeopardy.json | wc -l
1952372
user 4.69s
sys 0.12s
maxrss 223350784
(5 rq) . jeopardy.json
rq --format readable id < jeopardy.json | wc -l
1952372
user 21.38s
sys 2.13s
maxrss 381214720
(6) 'select(length==2)' jeopardy.json # --stream
jq-1.5 --stream 'select(length==2)' jeopardy.json | wc -l
10629570
user 0m8.901s
sys 0m0.087s
maxrss 1359872
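The line count in test (6) reflects how jq's streaming parser represents a document: it emits one [path, leaf] event of length 2 for each leaf value, plus closing events of length 1, which the filter discards. A tiny illustration, assuming jq is on the PATH and using a made-up input in place of jeopardy.json (the function name stream_demo is hypothetical):

```bash
#!/usr/bin/env bash
# Show the length-2 streaming events for a very small document.
stream_demo() {
  # Hypothetical stand-in for jeopardy.json.
  printf '%s\n' '{"a":[1,2]}' | jq -c --stream 'select(length==2)'
}

# Run the demo only if jq is installed.
if command -v jq >/dev/null 2>&1; then
  stream_demo
  # prints:
  # [["a",0],1]
  # [["a",1],2]
fi
```

Because events are emitted as the input is read, the whole document never has to be held in memory, which is consistent with the small maxrss reported for this test.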
(7) null 0
jq-1.5 -n 0
user 0.002924s
sys 0.001339s
maxrss 1187840
jq-1.6rc1 -n 0
user 0.030609s
sys 0.001838s
maxrss 2076672
For both versions, times are based on 1000 iterations using a bash loop, after adjusting for the time taken by the looping itself.
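The exact loop used for test (7) is not shown. One way such per-invocation figures might be obtained is sketched below: time a loop that runs the command, time the same loop running a no-op, and divide the difference by the iteration count. The function name avg_times is hypothetical, and the sketch relies on bash's time keyword and TIMEFORMAT variable.

```bash
#!/usr/bin/env bash
# avg_times N CMD [ARGS...]
# Hypothetical helper: print the average user and sys seconds per run of CMD,
# after subtracting the cost of an empty loop of the same length.
avg_times() {
  local n=$1; shift
  local LC_ALL=C              # keep "." as the decimal point in the timings
  local TIMEFORMAT='%U %S'    # make bash's "time" print just user and sys seconds
  local base run

  # Baseline: the loop alone, running the no-op builtin ":".
  base=$( { time (for ((i = 0; i < n; i++)); do :; done); } 2>&1 )
  # Measurement: the same loop running the command, with its output discarded.
  run=$( { time (for ((i = 0; i < n; i++)); do "$@" >/dev/null 2>&1; done); } 2>&1 )

  awk -v n="$n" -v base="$base" -v run="$run" 'BEGIN {
    split(base, b, " "); split(run, r, " ")
    printf "user %.6f\nsys %.6f\n", (r[1] - b[1]) / n, (r[2] - b[2]) / n
  }'
}
```

For example, `avg_times 1000 jq -n 0` would mirror test (7). With very fast commands the difference can be dominated by noise, so several runs are advisable.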
(8) md5 citylots.json
md5 citylots.json
MD5 (citylots.json) = 158346af5a90253d8b4390bd671eb5c5
user 0.43s
sys 0.06s
maxrss 11333632
(9) length citylots.json
jq-1.5 length citylots.json
2
user 0m6.887s
sys 0m0.772s
maxrss 1375858688
(10) '.features|length' citylots.json
jq-1.5 '.features|length' citylots.json
206560
user 6.23s
sys 0.78s
maxrss 1375899648
(11) schema.jq citylots.json
jq-1.6 -L . --argjson nullable true 'include "schema"; schema' citylots.json > citylots.schema.json
user 58.47
sys 0.97
maxrss 1376256000
maxrss 1375961088
(12) .features[10000].properties.LOT_NUM citylots.json
jq-1.5 '.features[10000].properties.LOT_NUM' citylots.json
"091"
user 6.44s
sys 0.97s
maxrss 1371561984
jq-1.6rc1 '.features[10000].properties.LOT_NUM' citylots.json
"091"
user 5.46
sys 0.73
maxrss 1375936512
jq-1.5 -n --stream 'first(inputs | select(.[0] == ["features",10000,"properties","LOT_NUM"])) | .[1]' citylots.json
"091"
user 0.60s
sys 0.00s
maxrss 2084864
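The streaming invocation above is much cheaper because it matches on the path of each streamed event, and first(...) should stop consuming input once the wanted event has been seen, so the document is never built in memory. The same filter shape can be tried on a tiny input. This sketch assumes jq is on the PATH; the two-feature document and the function name lot_num_demo are made up for illustration.

```bash
#!/usr/bin/env bash
# Extract one leaf value by its path from the stream of [path, leaf] events.
lot_num_demo() {
  # Hypothetical two-feature stand-in for citylots.json.
  printf '%s\n' '{"features":[{"properties":{"LOT_NUM":"001"}},{"properties":{"LOT_NUM":"091"}}]}' |
    jq -n --stream 'first(inputs | select(.[0] == ["features",1,"properties","LOT_NUM"])) | .[1]'
}

# Run the demo only if jq is installed.
if command -v jq >/dev/null 2>&1; then
  lot_num_demo   # prints "091"
fi
```

The -n option matters here: without it, jq would consume the first event as its input before inputs is evaluated.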
"jeopardy.schema.json"
{
"air_date": "string",
"answer": "string",
"category": "string",
"question": "string",
"round": "string",
"show_number": "string",
"value": "string"
}
"citylots.schema.json"
{
"type": "string",
"features": [
{
"geometry": {
"coordinates": [
[
[
"JSON"
]
]
],
"type": "string"
},
"properties": {
"BLKLOT": "string",
"BLOCK_NUM": "string",
"FROM_ST": "string",
"LOT_NUM": "string",
"MAPBLKLOT": "string",
"ODD_EVEN": "string",
"STREET": "string",
"ST_TYPE": "string",
"TO_ST": "string"
},
"type": "string"
}
]
}
[*1]
These examples use jq's -L option to specify that the module file schema.jq is in the present working directory (pwd).
If the module file is not in the pwd, then one possibility would be to specify the directory using the -L option.
An alternative would be to omit the -L option and to specify the directory in the include directive instead, for example:
jq --arg nullable true 'include "schema" {search: "."}; schema' jeopardy.json
For further details about using include, see the jq documentation.
An alternative way to use schema.jq would be to uncomment the very last line (so that it reads schema), and then invoke jq or gojq with the -f option, e.g. as follows:
jq --arg nullable true -f schema.jq INPUTFILE
where INPUTFILE is the input file.