q benchmark (#241)

harelba · Sep 19, 2020 · 9b492b8 · 9b492b8
1 parent 865f591
commit 9b492b8
Showing 18 changed files with 835 additions and 4 deletions.
diff --git a/.gitignore b/.gitignore
@@ -12,3 +12,6 @@ packages
 .idea/
 dist/windows/
 generated-site/
+benchmark_data.tar.gz
+_benchmark_data/
+q.egg-info/
diff --git a/VERSION_BUMP.md b/VERSION_BUMP.md
@@ -0,0 +1,18 @@
+
+# Version bump
+Currently, there are some manual steps needed in order to release a new version:
+
+* Make sure that you're in a branch
+* Change the version in the following three files: `bin/q.py`, `setup.py` and `do-manual-release.sh` and commit them to the branch
+* Merge that branch into master
+* Add a tag for the release version
+* `git push --tags origin master`
+* Create a release in GitHub with the tag you've just created
+
+Pushing to master will trigger a build/release, and will push the artifacts to the new release as assets.
+
+The reason for these manual steps is related to limitations in the way that pyci uploads the binaries to GitHub.
+
+#
+
+TBD - Continue with the flow of wrapping the artifacts with rpm/deb, copying the files to packages-for-q, and updating the web site.
diff --git a/bin/q.py b/bin/q.py
@@ -33,7 +33,7 @@
 
 from collections import OrderedDict
 
-q_version = '2.0.17'
+q_version = '2.0.18'
 
 __all__ = [ 'QTextAsData' ]
 

diff --git a/do-manual-release.sh b/do-manual-release.sh
@@ -2,7 +2,7 @@
 
 set -e
 
-VERSION=2.0.17
+VERSION=2.0.18
 
 if [[ "$TRAVIS_BRANCH" != "master" ]]
 then

diff --git a/requirements.txt b/requirements.txt
@@ -1,2 +1,3 @@
 six==1.11.0
 flake8==3.6.0
+setuptools<45.0.0
diff --git a/setup.py b/setup.py
@@ -2,7 +2,7 @@
 
 from setuptools import setup
 
-q_version = '2.0.17'
+q_version = '2.0.18'
 
 setup(
     name='q',

diff --git a/test/BENCHMARK.md b/test/BENCHMARK.md
@@ -0,0 +1,159 @@
+
+
+NOTE: *Please don't use or publish this benchmark data yet. See below for details*
+
+# Overview
+This is just a preliminary benchmark, originally created for validating performance optimizations and suggestions from users, and for analyzing q's move to python3. After writing it, I thought it might be interesting to test its speed against textql and octosql as well.
+
+The results I'm getting are somewhat surprising, to the point of me questioning them a bit, so it would be great to validate them further before finalizing the benchmark results.
+
+The most surprising results are as follows:
+* python3 vs python2 - A huge improvement (for large files, execution times with python 3 are around 40% of the times for python 2)
+* python3 vs textql (written in golang) - textql seems to become slower than the python3 version of q as the data size grows (both rows and columns)
+
+I would love to validate these results by having other people run the benchmark as well and send me their results. 
+
+If you're interested, follow the instructions and run the benchmark on your machine. After the benchmark is finished, send me the final results file, along with some details about your hardware, and I'll add it to the spreadsheet. <harelba@gmail.com>
+
+I've tried to make running the benchmark as seamless as possible, but there obviously might be errors/issues. Please contact me if you encounter any issue, or just open a ticket.
+
+# Benchmark
+This is an initial version of the benchmark, along with some results. The following is compared:
+* q running on multiple python versions
+* textql 2.0.3
+* octosql v0.3.0
+
+The specific python versions which are being tested are specified in `benchmark-config.sh`.
+
+This is by no means a scientific benchmark. It focuses only on data loading time, which is the only significant factor for comparison (the query itself is a very simple count query). It also does not try to provide any usability comparison between q and textql/octosql, which is an interesting topic in its own right.
+
+## Methodology
+The idea was to measure how sensitive the run time is to the row count and the column count.
+
+* Row counts: 1,10,100,1000,10000,100000,1000000
+* Column counts: 1,5,10,20,50,100
+* Iterations for each combination: 10
+
+File sizes:
+* 1M rows by 100 columns - 976MB (~1GB) - Largest file
+* 1M rows by 50 columns - 477MB
+
+The benchmark executes simple `select count(*) from <file>` queries for each combination, calculating the mean and stddev of each set of iterations. The stddev is used in order to measure the validity of the results.
+
+The graphs below only compare the means of the results. The standard deviations are written into the google sheet itself, and can be viewed there if needed.
+
+Instructions on how to run the benchmark are at the bottom section of this document, after the results section.
+
+## Hardware
+OSX Catalina on a 15" Macbook Pro from Mid 2015, with 16GB of RAM, and an internal Flash Drive of 256GB.
+
+## Results
+(Results are automatically updated from the baseline tab in the google spreadsheet).
+
+Detailed results below.
+
+Summary:
+* All python 3 versions (3.6/3.7/3.8) provide similar results across all scales.
+* python 3.x provides significantly better results than python2. Improvement grows as the file size grows (20% improvement for small files, up to ~70% improvement for the largest file)
+* textql seems to provide faster results than q (py3) for smaller files, up to around 30MB of data. As the size grows further, it becomes slower than q, taking up to 80% more time (74 seconds vs 41 seconds) for the largest file
+* The larger the file, the slower textql becomes relative to q-py3 (up to 80% more time than q for the largest file)
+* octosql is significantly slower than both q and textql, even for small files with a low number of rows and columns
+
+### Data for 1M rows
+
+#### Run time durations for 1M rows and different column counts:
+|   rows  	| columns 	| File Size 	| python 2.7 	| python 3.6 	| python 3.7 	| python 3.8 	| textql 	| octosql 	|
+|:-------:	|:-------:	|:---------:	|:----------:	|:----------:	|:----------:	|:----------:	|:------:	|:-------:	|
+| 1000000 	|    1    	|    17M    	|    5.15    	|    4.24    	|    4.08    	|    3.98    	|  2.90  	|  49.95  	|
+| 1000000 	|    5    	|    37M    	|    10.68   	|    5.37    	|    5.26    	|    5.14    	|  5.88  	|  54.69  	|
+| 1000000 	|    10   	|    89M    	|    17.56   	|    7.25    	|    7.15    	|    7.01    	|  9.69  	|  65.32  	|
+| 1000000 	|    20   	|    192M   	|    30.28   	|    10.96   	|    10.78   	|    10.64   	|  17.34 	|  83.94  	|
+| 1000000 	|    50   	|    477M   	|    71.56   	|    21.98   	|    21.59   	|    21.70   	|  38.57 	|  158.26 	|
+| 1000000 	|   100   	|    986M   	|   131.86   	|    41.71   	|    40.82   	|    41.02   	|  74.62 	|  289.58 	|
+
+#### Comparison between python 3.x and python 2 run times (1M rows):
+(>100% is slower than q-py2, <100% is faster than q-py2)
+
+|   rows    | columns 	| file size 	| q-py2 runtime 	| q-py3.6 vs q-py2 runtime 	| q-py3.7 vs q-py2 runtime 	| q-py3.8 vs q-py2 runtime 	|
+|:-------:	|:-------:	|:---------:	|:-------------:	|:------------------------:	|:------------------------:	|:------------------------:	|
+| 1000000 	|    1    	|    17M    	|    100.00%    	|          82.34%          	|          79.34%          	|          77.36%          	|
+| 1000000 	|    5    	|    37M    	|    100.00%    	|          50.25%          	|          49.22%          	|          48.08%          	|
+| 1000000 	|    10   	|    89M    	|    100.00%    	|          41.30%          	|          40.69%          	|          39.93%          	|
+| 1000000 	|    20   	|    192M   	|    100.00%    	|          36.18%          	|          35.59%          	|          35.14%          	|
+| 1000000 	|    50   	|    477M   	|    100.00%    	|          30.71%          	|          30.17%          	|          30.32%          	|
+| 1000000 	|   100   	|    986M   	|    100.00%    	|          31.63%          	|          30.96%          	|          31.11%          	|
+
+#### textql and octosql comparison against q-py3 run time (1M rows):
+(>100% is slower than q-py3, <100% is faster than q-py3)
+
+|   rows  	| columns 	| file size 	| avg q-py3 runtime 	| textql vs q-py3 runtime 	| octosql vs q-py3 runtime 	|
+|:-------:	|:-------:	|:---------:	|:-----------------:	|:-----------------------:	|:------------------------:	|
+| 1000000 	|    1    	|    17M    	|      100.00%      	|          70.67%         	|         1217.76%         	|
+| 1000000 	|    5    	|    37M    	|      100.00%      	|         111.86%         	|         1040.70%         	|
+| 1000000 	|    10   	|    89M    	|      100.00%      	|         135.80%         	|          915.28%         	|
+| 1000000 	|    20   	|    192M   	|      100.00%      	|         160.67%         	|          777.92%         	|
+| 1000000 	|    50   	|    477M   	|      100.00%      	|         177.26%         	|          727.40%         	|
+| 1000000 	|   100   	|    986M   	|      100.00%      	|         181.19%         	|          703.15%         	|
+
+### Sensitivity to column count 
+Based on the largest file size of 1,000,000 rows.
+
+![Sensitivity to column count](https://docs.google.com/spreadsheets/d/e/2PACX-1vQy9Zm4I322Tdf5uoiFFJx6Oi3Z4AMq7He3fUUtsEQVQIdTGfWgjxFD6k8PAy9wBjvFkqaG26oBgNTP/pubchart?oid=1585602598&format=image)
+
+### Sensitivity to line count (per column count)
+
+#### 1 Column Table
+![1 column table](https://docs.google.com/spreadsheets/d/e/2PACX-1vQy9Zm4I322Tdf5uoiFFJx6Oi3Z4AMq7He3fUUtsEQVQIdTGfWgjxFD6k8PAy9wBjvFkqaG26oBgNTP/pubchart?oid=1119350798&format=image)
+
+#### 5 Column Table
+![5 column table](https://docs.google.com/spreadsheets/d/e/2PACX-1vQy9Zm4I322Tdf5uoiFFJx6Oi3Z4AMq7He3fUUtsEQVQIdTGfWgjxFD6k8PAy9wBjvFkqaG26oBgNTP/pubchart?oid=599223098&format=image)
+
+#### 10 Column Table
+![10 column table](https://docs.google.com/spreadsheets/d/e/2PACX-1vQy9Zm4I322Tdf5uoiFFJx6Oi3Z4AMq7He3fUUtsEQVQIdTGfWgjxFD6k8PAy9wBjvFkqaG26oBgNTP/pubchart?oid=82695414&format=image)
+
+#### 20 Column Table
+![20 column table](https://docs.google.com/spreadsheets/d/e/2PACX-1vQy9Zm4I322Tdf5uoiFFJx6Oi3Z4AMq7He3fUUtsEQVQIdTGfWgjxFD6k8PAy9wBjvFkqaG26oBgNTP/pubchart?oid=1573199483&format=image)
+
+#### 50 Column Table
+![50 column table](https://docs.google.com/spreadsheets/d/e/2PACX-1vQy9Zm4I322Tdf5uoiFFJx6Oi3Z4AMq7He3fUUtsEQVQIdTGfWgjxFD6k8PAy9wBjvFkqaG26oBgNTP/pubchart?oid=448568670&format=image)
+
+#### 100 Column Table
+![100 column table](https://docs.google.com/spreadsheets/d/e/2PACX-1vQy9Zm4I322Tdf5uoiFFJx6Oi3Z4AMq7He3fUUtsEQVQIdTGfWgjxFD6k8PAy9wBjvFkqaG26oBgNTP/pubchart?oid=2101488258&format=image)
+
+## Running the benchmark
+Please note that the initial run generates large files, so you'd need more than 3GB of free space available. All the generated files reside in the `_benchmark_data/` folder.
+
+Part of the preparation flow will download the benchmark data as needed.
+
+### Preparations
+* Prerequisites:
+  * pyenv installed
+  * pyenv-virtualenv installed
+  * [`textql`](https://github.com/dinedal/textql#install)
+  * [`octosql`](https://github.com/cube2222/octosql#installation)
+
+Run `./prepare-benchmark-env`
+
+### Execution
+Run `./run-benchmark <benchmark-id>`.
+
+Benchmark output files will be written to `./benchmark-results/<q-executable>/<benchmark-id>/`.
+
+* `benchmark-id` is the id you want to give the benchmark.
+* `q-executable` is the name of the q executable being used for the benchmark. If none has been provided through `Q_EXECUTABLE`, then the value will be the last commit hash. Note that there is no check of whether the working tree is clean.
+
+The summary of the benchmark will be written to `./benchmark-results/<benchmark-id>/summary.benchmark-results`.
+
+By default, the benchmark will use the source python files inside the project. If you want to run it against one of the standalone binary executables, then set `Q_EXECUTABLE` to the full path of the q binary.
+
+If you are helping by running the benchmark, don't use this parameter for now. Just test against a clean checkout of the code using `./run-benchmark <benchmark-id>`.
+
+## Benchmark Development info
+### Running against the standalone binary
+* `./run-benchmark` can accept a second parameter with the path of the q executable. If it gets this parameter, it will use this path for running q. This provides a way to test the standalone q binaries in the new packaging format. When this parameter is not provided, the benchmark is executed directly from the source code.
+
+### Updating the benchmark markdown document file
+The results should reside in the following [google sheet](https://docs.google.com/spreadsheets/d/1Ljr8YIJwUQ5F4wr6ATga5Aajpu1CvQp1pe52KGrLkbY/edit?usp=sharing). 
+
+Add a new tab to the google sheet, and paste the content of `summary.benchmark-results` into the new sheet.
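
Note (not part of the commit): the Methodology section above describes running each query for 10 iterations and reporting the mean and stddev. The benchmark's own timing code is not shown in this excerpt, so the following is only a minimal Python sketch of that idea. The function name `time_command`, the stand-in command, and the choice of the sample standard deviation (`statistics.stdev`) are assumptions, not the project's implementation.

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, iterations=10):
    """Run cmd `iterations` times; return (mean, stddev) of wall-clock seconds."""
    durations = []
    for _ in range(iterations):
        start = time.perf_counter()
        # check=True raises if the command exits with a non-zero status
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        durations.append(time.perf_counter() - start)
    # statistics.stdev is the sample standard deviation and needs >= 2 samples
    return statistics.mean(durations), statistics.stdev(durations)


if __name__ == "__main__":
    # Stand-in command (hypothetical); a real run would invoke q, textql or
    # octosql with a count query against one of the generated files
    mean, stddev = time_command([sys.executable, "-c", "pass"], iterations=3)
    print(f"mean={mean:.4f}s stddev={stddev:.4f}s")
```

A small stddev relative to the mean suggests the iterations were stable, which is how the document uses it to judge the validity of a result.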
+
diff --git a/test/benchmark-config.sh b/test/benchmark-config.sh
@@ -0,0 +1,3 @@
+#!/bin/bash
+
+BENCHMARK_PYTHON_VERSIONS=(2.7.18 3.6.4 3.7.9 3.8.5)
diff --git a/...3b7418b46594ad256abd9db4a7671cb251e6a/2020-09-17-v2.0.17/octosql_v0.3.0.benchmark-results b/...3b7418b46594ad256abd9db4a7671cb251e6a/2020-09-17-v2.0.17/octosql_v0.3.0.benchmark-results
@@ -0,0 +1,48 @@
+lines	columns	octosql_v0.3.0_mean	octosql_v0.3.0_stddev
+1	1	0.582091641426	0.0235290239617
+10	1	0.596219730377	0.0320124029461
+100	1	0.575977492332	0.0199296245316
+1000	1	0.56785056591	0.00846389017466
+10000	1	1.1466334343	0.00760108698846
+100000	1	5.49565172195	0.131791932977
+1000000	1	49.9513648033	0.443430523063
+lines	columns	octosql_v0.3.0_mean	octosql_v0.3.0_stddev
+1	5	0.582160949707	0.0274409391571
+10	5	0.57046456337	0.0199413000359
+100	5	0.585747480392	0.0372543971623
+1000	5	0.572268772125	0.00384300349763
+10000	5	1.15530762672	0.0117990775856
+100000	5	6.10629923344	0.146711842919
+1000000	5	54.6851765394	0.315486399525
+lines	columns	octosql_v0.3.0_mean	octosql_v0.3.0_stddev
+1	10	0.586222410202	0.0232479065914
+10	10	0.59000480175	0.0186508192447
+100	10	0.581873703003	0.0331332482772
+1000	10	0.569027900696	0.0103675493106
+10000	10	1.40067322254	0.00583352224401
+100000	10	7.30705575943	0.0165839217599
+1000000	10	65.3242264032	0.512552576414
+lines	columns	octosql_v0.3.0_mean	octosql_v0.3.0_stddev
+1	20	0.571048212051	0.0166919396871
+10	20	0.594776701927	0.0368900941023
+100	20	0.561370825768	0.00907051791451
+1000	20	0.577527880669	0.00983965108957
+10000	20	1.90710241795	0.00757011452155
+100000	20	9.8267291069	0.127844155326
+1000000	20	83.9448960066	0.46121344046
+lines	columns	octosql_v0.3.0_mean	octosql_v0.3.0_stddev
+1	50	0.572030115128	0.0253648479103
+10	50	0.56993534565	0.0230474303306
+100	50	0.563336873055	0.00964411866903
+1000	50	0.826378440857	0.00941629472813
+10000	50	3.27872717381	0.126592845956
+100000	50	17.890055728	0.116794666005
+1000000	50	158.262442636	0.826290454446
+lines	columns	octosql_v0.3.0_mean	octosql_v0.3.0_stddev
+1	100	0.569358110428	0.0279801762531
+10	100	0.580981063843	0.0272341107532
+100	100	0.559471726418	0.00668155858429
+1000	100	1.08161640167	0.00698594638512
+10000	100	5.67823712826	0.0123398407167
+100000	100	32.2797194242	0.315508270241
+1000000	100	289.582628798	0.929455236817
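
Note (not part of the commit): the results files repeat their header row before each column-count block, so a plain CSV load would treat those rows as data. The following Python sketch shows one way to read such a file into a dictionary. It assumes the raw file (without the leading `+` of the diff) is tab-separated with four fields per row; `parse_results` and `SAMPLE` are hypothetical names used only for illustration.

```python
import csv
import io


def parse_results(text):
    """Parse a .benchmark-results file into {(lines, columns): (mean, stddev)}."""
    results = {}
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    for row in reader:
        # The header row repeats before every block; skip it and blank rows
        if not row or row[0] == "lines":
            continue
        lines, columns, mean, stddev = row
        results[(int(lines), int(columns))] = (float(mean), float(stddev))
    return results


# A few rows copied from the octosql results above
SAMPLE = (
    "lines\tcolumns\toctosql_v0.3.0_mean\toctosql_v0.3.0_stddev\n"
    "1\t1\t0.582091641426\t0.0235290239617\n"
    "1000000\t1\t49.9513648033\t0.443430523063\n"
    "lines\tcolumns\toctosql_v0.3.0_mean\toctosql_v0.3.0_stddev\n"
    "1\t5\t0.582160949707\t0.0274409391571\n"
)

parsed = parse_results(SAMPLE)
print(parsed[(1000000, 1)])
```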
diff --git a/...18b46594ad256abd9db4a7671cb251e6a/2020-09-17-v2.0.17/q-benchmark-2.7.18.benchmark-results b/...18b46594ad256abd9db4a7671cb251e6a/2020-09-17-v2.0.17/q-benchmark-2.7.18.benchmark-results
@@ -0,0 +1,48 @@
+lines	columns	q-benchmark-2.7.18_mean	q-benchmark-2.7.18_stddev
+1	1	0.106449890137	0.002010027753
+10	1	0.106737875938	0.00224112203891
+100	1	0.107839012146	0.00102954061006
+1000	1	0.113026666641	0.00147361890226
+10000	1	0.160376381874	0.00569766179806
+100000	1	0.608236479759	0.00604026519608
+1000000	1	5.14807910919	0.0584474028762
+lines	columns	q-benchmark-2.7.18_mean	q-benchmark-2.7.18_stddev
+1	5	0.106719517708	0.00236752032369
+10	5	0.107823801041	0.00238873169438
+100	5	0.109785079956	0.0013047675259
+1000	5	0.120395207405	0.00207224422629
+10000	5	0.21783041954	0.00522254475716
+100000	5	1.17115747929	0.0221394865225
+1000000	5	10.6830974817	0.339822977934
+lines	columns	q-benchmark-2.7.18_mean	q-benchmark-2.7.18_stddev
+1	10	0.104981088638	0.00166552032929
+10	10	0.108320140839	0.00204034349199
+100	10	0.112528729439	0.00168376477305
+1000	10	0.13019015789	0.00253773120965
+10000	10	0.284891676903	0.00384009140782
+100000	10	1.84725661278	0.00860738744089
+1000000	10	17.5610994339	0.228322442172
+lines	columns	q-benchmark-2.7.18_mean	q-benchmark-2.7.18_stddev
+1	20	0.106477689743	0.00254429925697
+10	20	0.108580899239	0.00173704653824
+100	20	0.118750286102	0.00247623639866
+1000	20	0.146431708336	0.00249685551944
+10000	20	0.419492387772	0.00248210434668
+100000	20	3.15847921371	0.0550301268026
+1000000	20	30.279082489	0.124978814506
+lines	columns	q-benchmark-2.7.18_mean	q-benchmark-2.7.18_stddev
+1	50	0.105411934853	0.00171651054128
+10	50	0.109102797508	0.00111620290512
+100	50	0.135682177544	0.00196166766665
+1000	50	0.198261427879	0.00396172489054
+10000	50	0.821499919891	0.0111642692132
+100000	50	7.05980975628	0.121182371277
+1000000	50	71.5645889759	5.02009516291
+lines	columns	q-benchmark-2.7.18_mean	q-benchmark-2.7.18_stddev
+1	100	0.10662381649	0.00193146624495
+10	100	0.110662698746	0.00171461379583
+100	100	0.163547992706	0.00166570196628
+1000	100	0.280023741722	0.00337543024145
+10000	100	1.46053376198	0.0221691284465
+100000	100	13.2369835854	0.309375896258
+1000000	100	131.864977288	1.22415449691
diff --git a/...418b46594ad256abd9db4a7671cb251e6a/2020-09-17-v2.0.17/q-benchmark-3.6.4.benchmark-results b/...418b46594ad256abd9db4a7671cb251e6a/2020-09-17-v2.0.17/q-benchmark-3.6.4.benchmark-results
@@ -0,0 +1,48 @@
+lines	columns	q-benchmark-3.6.4_mean	q-benchmark-3.6.4_stddev
+1	1	0.10342762470245362	0.0017673875851759295
+10	1	0.10239293575286865	0.0012505611685910795
+100	1	0.10317318439483643	0.0010581783881541751
+1000	1	0.10687050819396973	0.0014050135772919004
+10000	1	0.1447664737701416	0.001841256227287192
+100000	1	0.5162809371948243	0.006962985088492867
+1000000	1	4.238853335380554	0.04834401143632507
+lines	columns	q-benchmark-3.6.4_mean	q-benchmark-3.6.4_stddev
+1	5	0.10211825370788574	0.0022568191323651568
+10	5	0.1025341272354126	0.0016446470901070106
+100	5	0.1053577184677124	0.0015298114223855884
+1000	5	0.10980842113494874	0.002536098780902228
+10000	5	0.1590113162994385	0.003123074098301634
+100000	5	0.6348223447799682	0.0082691507829872
+1000000	5	5.368562030792236	0.11628913334105236
+lines	columns	q-benchmark-3.6.4_mean	q-benchmark-3.6.4_stddev
+1	10	0.10251858234405517	0.0015963869535345293
+10	10	0.10278875827789306	0.0009920577082124496
+100	10	0.10715732574462891	0.002033320000941064
+1000	10	0.11389360427856446	0.0023603847702423973
+10000	10	0.17806434631347656	0.001114054252191835
+100000	10	0.8252989768981933	0.0037080843359275904
+1000000	10	7.252838873863221	0.029052130546213153
+lines	columns	q-benchmark-3.6.4_mean	q-benchmark-3.6.4_stddev
+1	20	0.10367965698242188	0.003661761341842434
+10	20	0.10489590167999267	0.001977141196109372
+100	20	0.11108210086822509	0.0014801173497056886
+1000	20	0.12110791206359864	0.001648524669420912
+10000	20	0.2178968906402588	0.0019298316207276716
+100000	20	1.1962245225906372	0.010541407803235559
+1000000	20	10.956057572364807	0.12677108174061705
+lines	columns	q-benchmark-3.6.4_mean	q-benchmark-3.6.4_stddev
+1	50	0.10458300113677979	0.0016367630302744722
+10	50	0.10616152286529541	0.002345135740908088
+100	50	0.12375867366790771	0.00238414904864133
+1000	50	0.14462883472442628	0.0022428030896492978
+10000	50	0.34488487243652344	0.004867441221052092
+100000	50	2.3394312858581543	0.02263239858944125
+1000000	50	21.979821610450745	0.09080404939303836
+lines	columns	q-benchmark-3.6.4_mean	q-benchmark-3.6.4_stddev
+1	100	0.10372309684753418	0.0010299126833031144
+10	100	0.10784556865692138	0.0016557634029464607
+100	100	0.14526791572570802	0.0028194506905186724
+1000	100	0.18315494060516357	0.0023585311962114673
+10000	100	0.5586131334304809	0.004808492789681402
+100000	100	4.287398314476013	0.00957500108409644
+1000000	100	41.706851434707644	0.4161526076289425
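
Note (not part of the commit): the percentage columns in the comparison tables of `BENCHMARK.md` appear to be the ratio of a runtime to the baseline runtime. The Python sketch below reproduces two of the python 3.6 vs python 2.7 figures from the raw means in the result files above. The helper name `relative_runtime_pct` is hypothetical, and the spreadsheet's exact formula is not shown in this commit.

```python
def relative_runtime_pct(runtime, baseline):
    """Return runtime as a percentage of baseline (below 100 means faster)."""
    return 100.0 * runtime / baseline


# Means for 1,000,000 rows, copied from the two q result files above
PY27_1COL = 5.14807910919
PY36_1COL = 4.238853335380554
PY27_100COL = 131.864977288
PY36_100COL = 41.706851434707644

print(f"1 column:    {relative_runtime_pct(PY36_1COL, PY27_1COL):.2f}%")
print(f"100 columns: {relative_runtime_pct(PY36_100COL, PY27_100COL):.2f}%")
```

Both values agree with the 82.34% and 31.63% entries in the python 3.x vs python 2 table when rounded to two decimals.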