[utils] add benchmark runner for YCSB #131

KFilipek · 2021-03-03T14:06:12Z

RobsDB Ultra Benchmark Runner

Commit message

This tool lets you define multiple suites, run them one by one,
and parse the output into an easy-to-use form as CSV files.

Description

The idea behind this PR is to provide tools for running multiple suites with YCSB using pmemkv-java.
The original files with documentation were previously placed here.

Details

As a draft, these files currently serve the following purposes:
run_suite.py - parses test_suite.txt, produces a summary, and generates testplan.sh (the file you execute to collect data)
run_workload.sh - runs a specific workload using YCSB; used internally by testplan.sh
parser.py - parses YCSB output and produces CSV output

lukaszstolarczuk

Reviewable status: 0 of 3 files reviewed, 6 unresolved discussions (waiting on @KFilipek)

a discussion (no related file):
pls add licenses

a discussion (no related file):
pls rename commit/PR msg - you're adding a runner, not a benchmark, per se ;)

utils/parser.py, line 39 at r1 (raw file):

                csv_file.write(x + '\n')
                print x
            csv_file.close()

add missing extra line

utils/run_suite.py, line 82 at r1 (raw file):

        #get args if exists
        args = getArgs(splittedLine)

cleanup whitespaces (here and other red marks in the review)

utils/run_workload.sh, line 27 at r1 (raw file):

OLD_PATH=$(pwd)

echo $0 $1 $2 $3 $4 $5 $6 $7 $8 $9 ${10} ${11}

just echo $@ ?

utils/run_workload.sh, line 51 at r1 (raw file):

	then
    	cd $YCSB_PATH
	    ./bin/ycsb load mongodb -s -threads $3 -p hdrhistogram.percentiles=95,99,99.9,99.99 -p recordcount=$5 -p operationcount=$6 -p readproportion=$7 -p updateproportion=$8 -p insertproportion=$9 -P ./workloads/workloada -p mongodb.url=mongodb://localhost:27017/ycsb -p mongodb.writeConcern=$JOURNALING > $OLD_PATH/results/$1/load_$3.log

url should be an env or something..
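One way to act on this, sketched in Python on the assumption that the command line is assembled by the generator script rather than typed into run_workload.sh. The variable name `YCSB_MONGODB_URL` and the helper names are hypothetical; the flags and the default URL are copied from the line quoted above.

```python
import os

# Default taken from the command quoted in this review.
DEFAULT_MONGODB_URL = "mongodb://localhost:27017/ycsb"


def mongodb_url(env=None):
    """Return the MongoDB URL from the environment, or the default."""
    env = os.environ if env is None else env
    # YCSB_MONGODB_URL is a hypothetical variable name, not an existing one.
    return env.get("YCSB_MONGODB_URL", DEFAULT_MONGODB_URL)


def load_command(threads, recordcount, env=None):
    """Build the argument list for the YCSB load phase."""
    return [
        "./bin/ycsb", "load", "mongodb", "-s",
        "-threads", str(threads),
        "-p", "recordcount=%d" % recordcount,
        "-P", "./workloads/workloada",
        "-p", "mongodb.url=" + mongodb_url(env),
    ]


print(" ".join(load_command(4, 1000)))
```

Passing `env` explicitly keeps the helper testable without touching the real process environment.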

igchor

Reviewable status: 0 of 3 files reviewed, 8 unresolved discussions (waiting on @KFilipek)

utils/parser.py, line 17 at r2 (raw file):

                            if record[0] == '[READ]' or record[0] == '[INSERT]' or record[0] == '[UPDATE]' or record[0] == '[OVERALL]': #in case of READ
                                try:
                                    int(record[1])

What is this casting for? Why do we append to trimmed_lines (and hence to parsed_results) only when there is a conversion error?

utils/run_suite.py, line 70 at r2 (raw file):

# open meta file
with open("test_suite.txt", "r") as configfile:

why not use https://docs.python.org/3/library/configparser.html? You wouldn't have to implement all this parsing logic yourself
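A minimal sketch of what that could look like, assuming Python 3 (where the module is named `configparser`) and an INI-style suite file. The section and option names below are hypothetical, since the actual layout of test_suite.txt is not shown in this thread.

```python
import configparser

# Hypothetical suite definition; the real test_suite.txt format may differ.
SUITE_TEXT = """
[DEFAULT]
threads = 1
recordcount = 1000

[workload_a]
operationcount = 5000
readproportion = 0.5
updateproportion = 0.5

[workload_b]
threads = 4
operationcount = 5000
readproportion = 0.95
updateproportion = 0.05
"""


def load_suites(text):
    """Return {suite_name: {option: value}} with DEFAULT values merged in."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    # sections() skips DEFAULT; each section proxy already includes defaults.
    return {name: dict(parser[name]) for name in parser.sections()}


suites = load_suites(SUITE_TEXT)
for name, options in suites.items():
    print(name, options["threads"], options["operationcount"])
```

All values come back as strings, so numeric options would still need an explicit `int()` or `float()` (or `parser.getint()` / `parser.getfloat()`) before use.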

KFilipek

Reviewed 1 of 2 files at r3.
Reviewable status: 0 of 3 files reviewed, 8 unresolved discussions (waiting on @KFilipek and @lukaszstolarczuk)

utils/parser.py, line 39 at r1 (raw file):

Previously, lukaszstolarczuk (Łukasz Stolarczuk) wrote…

add missing extra line

Done.

utils/run_workload.sh, line 27 at r1 (raw file):

Previously, lukaszstolarczuk (Łukasz Stolarczuk) wrote…

just echo $@ ?

Done.

lukaszstolarczuk

Reviewable status: 0 of 3 files reviewed, 7 unresolved discussions (waiting on @igchor, @KFilipek, and @lukaszstolarczuk)

a discussion (no related file):

Previously, lukaszstolarczuk (Łukasz Stolarczuk) wrote…

pls rename commit/PR msg - you're adding a runner, not a benchmark, per se ;)

bump

a discussion (no related file):

Previously, lukaszstolarczuk (Łukasz Stolarczuk) wrote…

pls add licenses

bump

a discussion (no related file):
if you don't want to address some of the issues, please mark them as TODOs within the code

utils/run_suite.py line 82 at r1 (raw file):

Previously, lukaszstolarczuk (Łukasz Stolarczuk) wrote…

cleanup whitespaces (here and other red marks in the review)

this is rather no effort, pls fix

KFilipek

Reviewed 1 of 2 files at r2, 1 of 2 files at r4, 3 of 3 files at r5.
Reviewable status: all files reviewed, 7 unresolved discussions (waiting on @igchor, @KFilipek, and @lukaszstolarczuk)

a discussion (no related file):

Previously, lukaszstolarczuk (Łukasz Stolarczuk) wrote…

bump

Done.

a discussion (no related file):

Previously, lukaszstolarczuk (Łukasz Stolarczuk) wrote…

bump

Done.

utils/parser.py line 17 at r2 (raw file):

What is this casting for? Why do we append to trimmed_lines (and hence to parsed_results) only when there is a conversion error?

Here is an example of YCSB output:

Client config file: client_config
Client class name: eventualConsistency_client.EventualClient
[OVERALL], RunTime(ms), 2157
[OVERALL], Throughput(ops/sec), 46.36068613815485
[TOTAL_GCS_PS_Scavenge], Count, 1
[TOTAL_GC_TIME_PS_Scavenge], Time(ms), 15
[TOTAL_GC_TIME_%_PS_Scavenge], Time(%), 0.6954102920723227
[TOTAL_GCS_PS_MarkSweep], Count, 0
[TOTAL_GC_TIME_PS_MarkSweep], Time(ms), 0
[TOTAL_GC_TIME_%_PS_MarkSweep], Time(%), 0.0
[TOTAL_GCs], Count, 1
[TOTAL_GC_TIME], Time(ms), 15
[TOTAL_GC_TIME_%], Time(%), 0.6954102920723227
[CLEANUP], Operations, 1
[CLEANUP], AverageLatency(us), 7.0
[CLEANUP], MinLatency(us), 7
[CLEANUP], MaxLatency(us), 7
[CLEANUP], 95thPercentileLatency(us), 7
[CLEANUP], 99thPercentileLatency(us), 7
[CLEANUP], 7, 1.0
[INSERT], Operations, 100
[INSERT], AverageLatency(us), 7932.67
[INSERT], MinLatency(us), 1618
[INSERT], MaxLatency(us), 203903
[INSERT], 95thPercentileLatency(us), 9063
[INSERT], 99thPercentileLatency(us), 14543
[INSERT], 1618, 1.0
[INSERT], 2117, 1.0
[INSERT], 2439, 1.0
[INSERT], 2455, 1.0
[INSERT], 2587, 1.0

As I see it, the parser splits each line into tokens like ('[INSERT]', 'Operations', '100'),
then checks whether the second token is text, in order to filter out lines like this one:
[INSERT], 2439, 1.0.

It would be better to check this another way, but anyway.
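For illustration, one such alternative is sketched below in Python 3. It keeps the same filtering idea but moves the numeric check into a small helper, so the "append only on conversion error" logic becomes explicit. The function names are hypothetical and this is not the code from parser.py; the sample lines are taken from the output above.

```python
import csv
import io

# Sections the quoted parser.py line cares about.
SECTIONS = {"[READ]", "[INSERT]", "[UPDATE]", "[OVERALL]"}


def is_number(token):
    """True if the token parses as a number (i.e. a histogram bucket)."""
    try:
        float(token)
        return True
    except ValueError:
        return False


def summary_rows(lines):
    """Keep named metrics and drop histogram lines such as '[INSERT], 2439, 1.0'."""
    rows = []
    for line in lines:
        record = [token.strip() for token in line.split(",")]
        if len(record) != 3 or record[0] not in SECTIONS:
            continue
        if is_number(record[1]):
            continue  # second token is a latency bucket, not a metric name
        rows.append(record)
    return rows


sample = [
    "Client config file: client_config",
    "[OVERALL], RunTime(ms), 2157",
    "[CLEANUP], Operations, 1",
    "[INSERT], Operations, 100",
    "[INSERT], 95thPercentileLatency(us), 9063",
    "[INSERT], 2439, 1.0",
]

rows = summary_rows(sample)
out = io.StringIO()
csv.writer(out).writerows(rows)
print(out.getvalue())
```

Using the `csv` module instead of manual string concatenation also takes care of quoting if a field ever contains a comma.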

utils/run_suite.py line 82 at r1 (raw file):

Previously, lukaszstolarczuk (Łukasz Stolarczuk) wrote…

this is rather no effort, pls fix

Done.

utils/run_suite.py line 70 at r2 (raw file):

Previously, igchor (Igor Chorążewicz) wrote…

why not use https://docs.python.org/3/library/configparser.html? You wouldn't have to implement all this parsing logic yourself

Yes, it can be used.

utils/run_workload.sh line 51 at r1 (raw file):

Previously, lukaszstolarczuk (Łukasz Stolarczuk) wrote…

url should be an env or something..

Which URL do you mean?

lukaszstolarczuk

Reviewable status: all files reviewed, 7 unresolved discussions (waiting on @igchor and @KFilipek)

a discussion (no related file):

Previously, lukaszstolarczuk (Łukasz Stolarczuk) wrote…

if you don't want to address some of the issues, please mark them as TODOs within the code

@KFilipek, please mark all these unresolved issues as TODOs

a discussion (no related file):
it'd actually be nice to put these scripts into some directory - perhaps utils/benchmarks or utils/ycsb or something...

utils/run_suite.py line 70 at r2 (raw file):

Previously, KFilipek (Krzysztof Filipek) wrote…

Yes, it can be used.

pls mark it as TODO (it'd be a nice feature)

utils/run_suite.py line 2 at r5 (raw file):

#!/usr/bin/python2

I believe (according to the SPDX standard) we shouldn't have an empty line here

utils/run_workload.sh line 51 at r1 (raw file):

Previously, KFilipek (Krzysztof Filipek) wrote…

Which URL do you mean?

It's a very old review, I know as much as you do. I'm guessing there was some URL to mongo, perhaps, at some point...?

utils/run_workload.sh line 2 at r5 (raw file):

#!/bin/bash

.

utils/run_workload.sh line 28 at r5 (raw file):

# 14 - pmemkv: path to pool

YCSB_PATH=/home/kfilipek/Development/YCSB/ # TODO(kfilipek): remove hardcoding

heh, pls unset this or something 😄
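A possible shape for removing the hardcoding, sketched in Python on the assumption that the generator script resolves the path and passes it on to the shell script. The helper name is hypothetical, and reusing `YCSB_PATH` as an environment variable name is an assumption rather than existing behaviour. Failing early when the variable is missing avoids silently running against someone's home directory.

```python
import os


def resolve_ycsb_path(env=None):
    """Return the absolute YCSB checkout path, or raise if it is not configured."""
    env = os.environ if env is None else env
    path = env.get("YCSB_PATH")
    if not path:
        raise RuntimeError("YCSB_PATH is not set; point it at the YCSB checkout")
    return os.path.abspath(os.path.expanduser(path))


try:
    print(resolve_ycsb_path())
except RuntimeError as error:
    print("configuration error:", error)
```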

lukaszstolarczuk reviewed Mar 3, 2021

KFilipek force-pushed the utils-ycsb_runner branch 2 times, most recently from a390706 to 1c98d97 Compare March 4, 2021 23:20

igchor reviewed Mar 5, 2021

KFilipek force-pushed the utils-ycsb_runner branch 3 times, most recently from 66d1051 to c1fe545 Compare March 9, 2021 13:29

KFilipek force-pushed the utils-ycsb_runner branch 2 times, most recently from 66d1051 to 5e42372 Compare April 6, 2021 07:05

KFilipek commented Apr 6, 2021

KFilipek marked this pull request as ready for review August 11, 2022 15:45

lukaszstolarczuk suggested changes Aug 12, 2022

KFilipek changed the title ~~[utils] add benchmark for YCSB~~ [utils] add benchmark runner for YCSB Aug 29, 2022

[utils] add benchmark runner for YCSB

2d39b68

This tool lets you define multiple suites, run them one by one, and parse the output into an easy-to-use form as CSV files.

KFilipek force-pushed the utils-ycsb_runner branch from 5e42372 to 2d39b68 Compare August 29, 2022 14:25

KFilipek commented Aug 29, 2022

lukaszstolarczuk suggested changes Aug 30, 2022

KFilipek mentioned this pull request Sep 1, 2022

Extend scripts for running YCSB workloads #188

Open

15 tasks
