Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add goawk to the benchmark #37

Open
ghuls opened this issue Jan 19, 2021 · 14 comments
Open

Add goawk to the benchmark #37

ghuls opened this issue Jan 19, 2021 · 14 comments

Comments

@ghuls
Copy link
Contributor

ghuls commented Jan 19, 2021

Add goawk to the benchmark: https://github.com/benhoyt/goawk

  • I/O-bound AWK scripts (which is most of them) are significantly faster than awk, and on a par with gawk and mawk.
  • CPU-bound AWK scripts are slightly slower than awk, and about twice as slow as gawk and mawk.

Also for mawk you should use 1.3.4 on Linux. The 1.3.3 version is very old and is missing a lot of bug fixes. The last Debian version finally updated to mawk 1.3.4:
https://invisible-island.net/mawk/

It would also be good to use the march flag so more advanced CPU features can be used to have a more fair comparison with frawk:

wget ftp://ftp.invisible-island.net/mawk/mawk.tar.gz
tar xzf mawk.tar.gz

cd mawk-1.3.4-20171017/

# Use  CPU optimalisations.
./configure EXTRA_CFLAGS=-march=native

make
@ezrosent
Copy link
Owner

I'll take a look at both of these. I looked into goawk a while ago and didn't see any areas where it outperformed mawk, so I omitted it (it's also my view that awk scripts generally aren't IO bound if you're reading input form a fairly recent SSD; as is the case for all of the benchmark numbers here).

As for an updated mawk, I'll see about downloading a recent copy from invisible-island and re-running the linux benchmarks.

@ghuls
Copy link
Contributor Author

ghuls commented Jan 20, 2021

I think it depends on the AWK script. In most of my AWK scripts there are not a lot of calculations (mostly some simple counters) and more string manipulations.

In the following example, frawk is relatively slow:

mawk:					1:14.19 = 74.19 seconds
gawk 4.0.2:			4:37.43 = 277.43 seconds
gawk 4.0.2 LANG=C:	1:24.76 = 84.76 seconds
gawk 5.1.0:			3:55.70 = 235.70 seconds
gawk 5.1.0 LANG=C:	1:05.72 = 65.72 seconds
goawk 1.6.1:			4:40.98 = 280.98 seconds
frawk (cranelift):	2:22.60 = 142.60 seconds
frawk (cranelift) direct write to file: 2:36.28 = 156.28 seconds

Remarks:

  • GNU utils (gawk, sort, grep) are much faster when disabling their locale support (for my usecase this is fine as I have exclusively files in ASCII).
  • gawk 5.1.0 LANG=C beating mawk was surprising to me. Seems gawk became much faster recently.
  • goawk: Only tested the released version. There is a recent commit that improves I/O (but only in combination with subprocesses): benhoyt/goawk@89aae73
  • frawk (cranelift): There was a lot of voluntary waiting for I/O (flushing stdout on each print?).
  • frawk (cranelift) direct write to file: Tried writing directly to the file instead of redirecting.
# VCF input file.
wget ftp://ftp.ensembl.org/pub/grch37/release-102/variation/vcf/homo_sapiens/1000GENOMES-phase_3.vcf.gz


# Timing function
timeit ()
{
    if [ $# -ge 1 ]; then
        local info_color=$(printf "${info_color}");
        local reset_colors=$(printf "${reset_colors}");
        command time -f "\nTime output:\n------------\n\n  * Command: ${info_color}%C${reset_colors}\n  * Elapsed wall time: ${info_color}%E${reset_colors} = ${info_color}%e seconds${reset_colors}\n  * Elapsed CPU time:\n     - User: ${info_color}%U${reset_colors}\n     - Sys: ${info_color}%S${reset_colors}\n  * CPU usage: ${info_color}%P${reset_colors}\n  * Context switching:\n     - Voluntarily (e.g.: waiting for I/O operation): ${info_color}%w${reset_colors}\n     - Involuntarily (time slice expired): ${info_color}%c${reset_colors}\n  * Maximum resident set size (RSS: memory) (kiB): ${info_color}%M${reset_colors}\n  * Number of times the process was swapped out of main memory: ${info_color}%W${reset_colors}\n  * Filesystem:\n     - # of inputs: ${info_color}%I${reset_colors}\n     - # of outputs: ${info_color}%O${reset_colors}\n  * Exit status: ${info_color}%x${reset_colors}\n" "${@}";
        return 0;
    else
        printf "${help_color}Usage:${reset_colors}   timeit <command> <args>\n\n${help_color}Purpose:${reset_colors} Get information about run time, CPU usage, memory usage, ... of a command.\n\n";
        return 1;
    fi
}


# mawk 1.3.4
❯ zcat 1000GENOMES-phase_3.vcf.gz | timeit mawk -F '\t' 'BEGIN { OFS="\t"; }
        {
            if ($1 !~ /^#/) {
                if ($1 == "MT") {
                    $1 = "chrM";
                } else {
                    $1 = "chr" $1;
                }
            } else if ( $1 ~ /^##contig=<ID=/ ) {
                sub(/^##contig=<ID=/, "##contig=<ID=chr");
                sub(/^##contig=<ID=chrMT/, "##contig=<ID=chrM");
            } else if ( $0 == "##INFO=<ID=MAF,Number=1,Type=Float,Description=\"Minor Allele Frequency\">") {
                print "##INFO=<ID=AF,Number=1,Type=Float,Description=\"Allele Frequency\">";
            }

            print $0;
        }' > /dev/null

Time output:
------------

  * Command: mawk -F \t BEGIN { OFS="\t"; }
        {
            if ($1 !~ /^#/) {
                if ($1 == "MT") {
                    $1 = "chrM";
                } else {
                    $1 = "chr" $1;
                }
            } else if ( $1 ~ /^##contig=<ID=/ ) {
                sub(/^##contig=<ID=/, "##contig=<ID=chr");
                sub(/^##contig=<ID=chrMT/, "##contig=<ID=chrM");
            } else if ( $0 == "##INFO=<ID=MAF,Number=1,Type=Float,Description=\"Minor Allele Frequency\">") {
                print "##INFO=<ID=AF,Number=1,Type=Float,Description=\"Allele Frequency\">";
            }

            print $0;
        }
  * Elapsed wall time: 1:14.19 = 74.19 seconds
  * Elapsed CPU time:
     - User: 69.42
     - Sys: 4.24
  * CPU usage: 99%
  * Context switching:
     - Voluntarily (e.g.: waiting for I/O operation): 1579
     - Involuntarily (time slice expired): 104
  * Maximum resident set size (RSS: memory) (kiB): 1116
  * Number of times the process was swapped out of main memory: 0
  * Filesystem:
     - # of inputs: 0
     - # of outputs: 0
  * Exit status: 0


# gawk 4.0.2 (system version).
❯ zcat 1000GENOMES-phase_3.vcf.gz | timeit gawk -F '\t' 'BEGIN { OFS="\t"; }
        {
            if ($1 !~ /^#/) {
                if ($1 == "MT") {
                    $1 = "chrM";
                } else {
                    $1 = "chr" $1;
                }
            } else if ( $1 ~ /^##contig=<ID=/ ) {
                sub(/^##contig=<ID=/, "##contig=<ID=chr");
                sub(/^##contig=<ID=chrMT/, "##contig=<ID=chrM");
            } else if ( $0 == "##INFO=<ID=MAF,Number=1,Type=Float,Description=\"Minor Allele Frequency\">") {
                print "##INFO=<ID=AF,Number=1,Type=Float,Description=\"Allele Frequency\">";
            }

            print $0;
        }' > /dev/null

Time output:
------------

  * Command: gawk -F \t BEGIN { OFS="\t"; }
        {
            if ($1 !~ /^#/) {
                if ($1 == "MT") {
                    $1 = "chrM";
                } else {
                    $1 = "chr" $1;
                }
            } else if ( $1 ~ /^##contig=<ID=/ ) {
                sub(/^##contig=<ID=/, "##contig=<ID=chr");
                sub(/^##contig=<ID=chrMT/, "##contig=<ID=chrM");
            } else if ( $0 == "##INFO=<ID=MAF,Number=1,Type=Float,Description=\"Minor Allele Frequency\">") {
                print "##INFO=<ID=AF,Number=1,Type=Float,Description=\"Allele Frequency\">";
            }

            print $0;
        }
  * Elapsed wall time: 4:37.43 = 277.43 seconds
  * Elapsed CPU time:
     - User: 253.50
     - Sys: 11.84
  * CPU usage: 95%
  * Context switching:
     - Voluntarily (e.g.: waiting for I/O operation): 1067
     - Involuntarily (time slice expired): 239711
  * Maximum resident set size (RSS: memory) (kiB): 1540
  * Number of times the process was swapped out of main memory: 0
  * Filesystem:
     - # of inputs: 0
     - # of outputs: 0
  * Exit status: 0


# gawk 4.0.2 (system version) with LANG=C (a lot of coreutils tools are much faster in C mode).
❯ zcat 1000GENOMES-phase_3.vcf.gz | LANG=C timeit gawk -F '\t' 'BEGIN { OFS="\t"; }
        {
            if ($1 !~ /^#/) {
                if ($1 == "MT") {
                    $1 = "chrM";
                } else {
                    $1 = "chr" $1;
                }
            } else if ( $1 ~ /^##contig=<ID=/ ) {
                sub(/^##contig=<ID=/, "##contig=<ID=chr");
                sub(/^##contig=<ID=chrMT/, "##contig=<ID=chrM");
            } else if ( $0 == "##INFO=<ID=MAF,Number=1,Type=Float,Description=\"Minor Allele Frequency\">") {
                print "##INFO=<ID=AF,Number=1,Type=Float,Description=\"Allele Frequency\">";
            }

            print $0;
        }' > /dev/null

Time output:
------------

  * Command: gawk -F \t BEGIN { OFS="\t"; }
        {
            if ($1 !~ /^#/) {
                if ($1 == "MT") {
                    $1 = "chrM";
                } else {
                    $1 = "chr" $1;
                }
            } else if ( $1 ~ /^##contig=<ID=/ ) {
                sub(/^##contig=<ID=/, "##contig=<ID=chr");
                sub(/^##contig=<ID=chrMT/, "##contig=<ID=chrM");
            } else if ( $0 == "##INFO=<ID=MAF,Number=1,Type=Float,Description=\"Minor Allele Frequency\">") {
                print "##INFO=<ID=AF,Number=1,Type=Float,Description=\"Allele Frequency\">";
            }

            print $0;
        }
  * Elapsed wall time: 1:24.76 = 84.76 seconds
  * Elapsed CPU time:
     - User: 80.12
     - Sys: 4.27
  * CPU usage: 99%
  * Context switching:
     - Voluntarily (e.g.: waiting for I/O operation): 588
     - Involuntarily (time slice expired): 73
  * Maximum resident set size (RSS: memory) (kiB): 1412
  * Number of times the process was swapped out of main memory: 0
  * Filesystem:
     - # of inputs: 0
     - # of outputs: 0
  * Exit status: 0

# gawk 5.1.0.
❯ zcat 1000GENOMES-phase_3.vcf.gz | timeit /software/gawk/gawk-5.1.0/gawk -F '\t' 'BEGIN { OFS="\t"; }
        {
            if ($1 !~ /^#/) {
                if ($1 == "MT") {
                    $1 = "chrM";
                } else {
                    $1 = "chr" $1;
                }
            } else if ( $1 ~ /^##contig=<ID=/ ) {
                sub(/^##contig=<ID=/, "##contig=<ID=chr");
                sub(/^##contig=<ID=chrMT/, "##contig=<ID=chrM");
            } else if ( $0 == "##INFO=<ID=MAF,Number=1,Type=Float,Description=\"Minor Allele Frequency\">") {
                print "##INFO=<ID=AF,Number=1,Type=Float,Description=\"Allele Frequency\">";
            }

            print $0;
        }' > /dev/null

Time output:
------------

  * Command: /software/gawk/gawk-5.1.0/gawk -F \t BEGIN { OFS="\t"; }
        {
            if ($1 !~ /^#/) {
                if ($1 == "MT") {
                    $1 = "chrM";
                } else {
                    $1 = "chr" $1;
                }
            } else if ( $1 ~ /^##contig=<ID=/ ) {
                sub(/^##contig=<ID=/, "##contig=<ID=chr");
                sub(/^##contig=<ID=chrMT/, "##contig=<ID=chrM");
            } else if ( $0 == "##INFO=<ID=MAF,Number=1,Type=Float,Description=\"Minor Allele Frequency\">") {
                print "##INFO=<ID=AF,Number=1,Type=Float,Description=\"Allele Frequency\">";
            }

            print $0;
        }
  * Elapsed wall time: 3:55.70 = 235.70 seconds
  * Elapsed CPU time:
     - User: 229.33
     - Sys: 5.20
  * CPU usage: 99%
  * Context switching:
     - Voluntarily (e.g.: waiting for I/O operation): 16
     - Involuntarily (time slice expired): 541
  * Maximum resident set size (RSS: memory) (kiB): 1776
  * Number of times the process was swapped out of main memory: 0
  * Filesystem:
     - # of inputs: 0
     - # of outputs: 0
  * Exit status: 0


# gawk 5.1.0 with LANG=C (a lot of coreutils tools are much faster in C mode).
❯ zcat 1000GENOMES-phase_3.vcf.gz | LANG=C timeit /software/gawk/gawk-5.1.0/gawk -F '\t' 'BEGIN { OFS="\t"; }
        {
            if ($1 !~ /^#/) {
                if ($1 == "MT") {
                    $1 = "chrM";
                } else {
                    $1 = "chr" $1;
                }
            } else if ( $1 ~ /^##contig=<ID=/ ) {
                sub(/^##contig=<ID=/, "##contig=<ID=chr");
                sub(/^##contig=<ID=chrMT/, "##contig=<ID=chrM");
            } else if ( $0 == "##INFO=<ID=MAF,Number=1,Type=Float,Description=\"Minor Allele Frequency\">") {
                print "##INFO=<ID=AF,Number=1,Type=Float,Description=\"Allele Frequency\">";
            }

            print $0;
        }' > /dev/null

Time output:
------------

  * Command: /software/gawk/gawk-5.1.0/gawk -F \t BEGIN { OFS="\t"; }
        {
            if ($1 !~ /^#/) {
                if ($1 == "MT") {
                    $1 = "chrM";
                } else {
                    $1 = "chr" $1;
                }

                # Change MAF with AF for popscle dsc-pileup.
                #sub(/AF=([0-9.])+;/, ";&M&" $8);
                #INFO = $8; sub(";MAF=", ";AF=", INFO); sub(/AF=([0-9.])+;/, "&M&" INFO, INFO); $8=INFO;
            } else if ( $1 ~ /^##contig=<ID=/ ) {
                sub(/^##contig=<ID=/, "##contig=<ID=chr");
                sub(/^##contig=<ID=chrMT/, "##contig=<ID=chrM");
            } else if ( $0 == "##INFO=<ID=MAF,Number=1,Type=Float,Description=\"Minor Allele Frequency\">") {
                print "##INFO=<ID=AF,Number=1,Type=Float,Description=\"Allele Frequency\">";
            }

            print $0;
        }
  * Elapsed wall time: 1:05.72 = 65.72 seconds
  * Elapsed CPU time:
     - User: 60.51
     - Sys: 4.88
  * CPU usage: 99%
  * Context switching:
     - Voluntarily (e.g.: waiting for I/O operation): 1287
     - Involuntarily (time slice expired): 212
  * Maximum resident set size (RSS: memory) (kiB): 1672
  * Number of times the process was swapped out of main memory: 0
  * Filesystem:
     - # of inputs: 0
     - # of outputs: 0
  * Exit status: 0


goawk 1.6.1
❯ zcat 1000GENOMES-phase_3.vcf.gz | timeit /software/goawk/goawk_1.6.1/goawk  -F '\t' 'BEGIN { OFS="\t"; }
        {
            if ($1 !~ /^#/) {
                if ($1 == "MT") {
                    $1 = "chrM";
                } else {
                    $1 = "chr" $1;
                }
            } else if ( $1 ~ /^##contig=<ID=/ ) {
                sub(/^##contig=<ID=/, "##contig=<ID=chr");
                sub(/^##contig=<ID=chrMT/, "##contig=<ID=chrM");
            } else if ( $0 == "##INFO=<ID=MAF,Number=1,Type=Float,Description=\"Minor Allele Frequency\">") {
                print "##INFO=<ID=AF,Number=1,Type=Float,Description=\"Allele Frequency\">";
            }

            print $0;
        }' > /dev/null

Time output:
------------

  * Command: /software/goawk/goawk_1.6.1/goawk -F \t BEGIN { OFS="\t"; }
        {
            if ($1 !~ /^#/) {
                if ($1 == "MT") {
                    $1 = "chrM";
                } else {
                    $1 = "chr" $1;
                }
            } else if ( $1 ~ /^##contig=<ID=/ ) {
                sub(/^##contig=<ID=/, "##contig=<ID=chr");
                sub(/^##contig=<ID=chrMT/, "##contig=<ID=chrM");
            } else if ( $0 == "##INFO=<ID=MAF,Number=1,Type=Float,Description=\"Minor Allele Frequency\">") {
                print "##INFO=<ID=AF,Number=1,Type=Float,Description=\"Allele Frequency\">";
            }

            print $0;
        }
  * Elapsed wall time: 4:40.98 = 280.98 seconds
  * Elapsed CPU time:
     - User: 310.72
     - Sys: 13.64
  * CPU usage: 115%
  * Context switching:
     - Voluntarily (e.g.: waiting for I/O operation): 907923
     - Involuntarily (time slice expired): 1627
  * Maximum resident set size (RSS: memory) (kiB): 12724
  * Number of times the process was swapped out of main memory: 0
  * Filesystem:
     - # of inputs: 0
     - # of outputs: 0
  * Exit status: 0


# frawk
❯ zcat 1000GENOMES-phase_3.vcf.gz | timeit frawk -b cranelift -F '\t' 'BEGIN { OFS="\t"; }
        {
            if ($1 !~ /^#/) {
                if ($1 == "MT") {
                    $1 = "chrM";
                } else {
                    $1 = "chr" $1;
                }
            } else if ( $1 ~ /^##contig=<ID=/ ) {
                sub(/^##contig=<ID=/, "##contig=<ID=chr");
                sub(/^##contig=<ID=chrMT/, "##contig=<ID=chrM");
            } else if ( $0 == "##INFO=<ID=MAF,Number=1,Type=Float,Description=\"Minor Allele Frequency\">") {
                print "##INFO=<ID=AF,Number=1,Type=Float,Description=\"Allele Frequency\">";
            }

            print $0;
        }' > /dev/null

Time output:
------------

  * Command: frawk -b cranelift -F \t BEGIN { OFS="\t"; }
        {
            if ($1 !~ /^#/) {
                if ($1 == "MT") {
                    $1 = "chrM";
                } else {
                    $1 = "chr" $1;
                }
            } else if ( $1 ~ /^##contig=<ID=/ ) {
                sub(/^##contig=<ID=/, "##contig=<ID=chr");
                sub(/^##contig=<ID=chrMT/, "##contig=<ID=chrM");
            } else if ( $0 == "##INFO=<ID=MAF,Number=1,Type=Float,Description=\"Minor Allele Frequency\">") {
                print "##INFO=<ID=AF,Number=1,Type=Float,Description=\"Allele Frequency\">";
            }

            print $0;
        }
  * Elapsed wall time: 2:22.60 = 142.60 seconds
  * Elapsed CPU time:
     - User: 184.24
     - Sys: 38.64
  * CPU usage: 156%
  * Context switching:
     - Voluntarily (e.g.: waiting for I/O operation): 2988807
     - Involuntarily (time slice expired): 812
  * Maximum resident set size (RSS: memory) (kiB): 5732
  * Number of times the process was swapped out of main memory: 0
  * Filesystem:
     - # of inputs: 0
     - # of outputs: 0
  * Exit status: 0


# frawk when writing directly to /dev/null (less waiting for I/O operations at least).
❯ zcat 1000GENOMES-phase_3.vcf.gz | timeit frawk -b cranelift --out-file /dev/null -F '\t' 'BEGIN { OFS="\t"; }
        {
            if ($1 !~ /^#/) {
                if ($1 == "MT") {
                    $1 = "chrM";
                } else {
                    $1 = "chr" $1;
                }
            } else if ( $1 ~ /^##contig=<ID=/ ) {
                sub(/^##contig=<ID=/, "##contig=<ID=chr");
                sub(/^##contig=<ID=chrMT/, "##contig=<ID=chrM");
            } else if ( $0 == "##INFO=<ID=MAF,Number=1,Type=Float,Description=\"Minor Allele Frequency\">") {
                print "##INFO=<ID=AF,Number=1,Type=Float,Description=\"Allele Frequency\">";
            }

            print $0;
        }'

Time output:
------------

  * Command: frawk -b cranelift --out-file /dev/null -F \t BEGIN { OFS="\t"; }
        {
            if ($1 !~ /^#/) {
                if ($1 == "MT") {
                    $1 = "chrM";
                } else {
                    $1 = "chr" $1;
                }
            } else if ( $1 ~ /^##contig=<ID=/ ) {
                sub(/^##contig=<ID=/, "##contig=<ID=chr");
                sub(/^##contig=<ID=chrMT/, "##contig=<ID=chrM");
            } else if ( $0 == "##INFO=<ID=MAF,Number=1,Type=Float,Description=\"Minor Allele Frequency\">") {
                print "##INFO=<ID=AF,Number=1,Type=Float,Description=\"Allele Frequency\">";
            }

            print $0;
        }
  * Elapsed wall time: 2:36.28 = 156.28 seconds
  * Elapsed CPU time:
     - User: 276.02
     - Sys: 35.54
  * CPU usage: 199%
  * Context switching:
     - Voluntarily (e.g.: waiting for I/O operation): 1832
     - Involuntarily (time slice expired): 446
  * Maximum resident set size (RSS: memory) (kiB): 5624
  * Number of times the process was swapped out of main memory: 0
  * Filesystem:
     - # of inputs: 0
     - # of outputs: 0
  * Exit status: 0

@ezrosent
Copy link
Owner

Thanks so much for the benchmark and for providing your own measurements! I'll try my best to use this as a basis for further optimizations; I'm sure it'll be very helpful.

Just to clarify; I am not shocked that there are benchmarks where other awks are faster than frawk. My initial inclination was to only include goawk if it outperformed mawk or gawk on a benchmark, as those two are more commonly used in the first place. What do you think?

A couple more notes:

  • On frawk waiting for IO: frawk does not flush on every print call (unless the output is a tty; which it isn't here), nor does it grab any global locks. I suspect you have found a bug! So that's the first thing I'll look at.
  • I think 'sub' is an area that I haven't really focused on for optimizations; I'm pretty sure there's some low-hanging fruit here.
  • I've generally avoided benchmarks that stress the regex support because it's very easy to cook up benchmarks where Rust's FA-based approach has asymptotic speedups over mawk's backtracking engine. I'd be happy to use this as a basis for a benchmark, however.
  • I have been using gawk 5 for the macos benchmarks, but I haven't been switching the locale; thanks for the tip.

@ghuls
Copy link
Contributor Author

ghuls commented Jan 21, 2021

The sub code is only executed for 24 lines (part of the header) . All other lines will hit the first if block. I had another sub command in there but frawk does not support & for rrepeating the ccaptured group, so I removed it for the benchmark.
I probably will be abletto provide more benchmarkss nnow frawk works on the machines I need it on.

@ezrosent
Copy link
Owner

ezrosent commented Jan 21, 2021

Thanks! Feel free to open a new issue or append them to this one with new workloads. I think there's probably enough to chew on here for the time being in terms of optimization work, but I'm definitely interested in any cases where frawk is slower than alternatives, particularly for large inputs like this one.

And thanks for the point on sub; that means it's probably an issue with assigning into columns $1 = .... That's not something I have spent much time on aside from getting tests to pass (I just don't happen to do this much in my scripts). I'll also take a stab at speeding up regex matching for particularly simple patterns like this one, where we're effectively just comparing the first few bytes of the input.

@ghuls
Copy link
Contributor Author

ghuls commented Jan 21, 2021

Take first half milion lines of the file and place in shared memory (gzip decompression takes around 3 seconds, so the second benchmark showed all ties instead of a win for frawk when piping the output from zcat to AWKs).

zcat 1000GENOMES-phase_3.vcf.gz | head -n 5000000 > /dev/shm/head.5000000.txt

Assigning a value to a column and rebuilding $0 seems slower in frawk (at least with this file). If frawk doesn't need to rebuild $0, it is faster.

# No modification of $0.
❯ for AWK in "${MAWK}" "${GAWK}" "${FRAWK}" ; do echo "${AWK}:"; LANG=C time ${AWK}  -F '\t' 'BEGIN { OFS="\t"; } { count += 1; out = $0 } END { print count, out }' /dev/shm/head.5000000.txt ; echo ; done

mawk:
5000000	1	198997911	rs530588549	T	C,G	.	.	dbSNP_153;TSA=SNV;E_Freq;E_1000G;E_TOPMed;MA=C;MAF=0.0002;MAC=1;AA=T;AFR=0,0;AMR=0.0014,0;EAS=0,0;EUR=0,0;SAS=0,0

0.91user 0.26system 0:01.18elapsed 99%CPU (0avgtext+0avgdata 1028maxresident)k
0inputs+0outputs (0major+313minor)pagefaults 0swaps

/software/gawk/gawk-5.1.0/gawk:
5000000	1	198997911	rs530588549	T	C,G	.	.	dbSNP_153;TSA=SNV;E_Freq;E_1000G;E_TOPMed;MA=C;MAF=0.0002;MAC=1;AA=T;AFR=0,0;AMR=0.0014,0;EAS=0,0;EUR=0,0;SAS=0,0

1.37user 0.22system 0:01.64elapsed 97%CPU (0avgtext+0avgdata 1400maxresident)k
0inputs+0outputs (0major+413minor)pagefaults 0swaps

frawk -b cranelift:
5000000	1	198997911	rs530588549	T	C,G	.	.	dbSNP_153;TSA=SNV;E_Freq;E_1000G;E_TOPMed;MA=C;MAF=0.0002;MAC=1;AA=T;AFR=0,0;AMR=0.0014,0;EAS=0,0;EUR=0,0;SAS=0,0

0.73user 0.22system 0:00.97elapsed 98%CPU (0avgtext+0avgdata 5164maxresident)k
0inputs+0outputs (0major+1356minor)pagefaults 0swaps


# Reassign small colunmn 1 to itself (and rebuild $0).
❯ for AWK in "${MAWK}" "${GAWK}" "${FRAWK}" ; do echo "${AWK}:"; LANG=C time ${AWK}  -F '\t' 'BEGIN { OFS="\t"; } { count += 1; $1 = $1; out = $0 } END { print count, out }' /dev/shm/head.5000000.txt ; echo ; done

mawk:
5000000	1	198997911	rs530588549	T	C,G	.	.	dbSNP_153;TSA=SNV;E_Freq;E_1000G;E_TOPMed;MA=C;MAF=0.0002;MAC=1;AA=T;AFR=0,0;AMR=0.0014,0;EAS=0,0;EUR=0,0;SAS=0,0

3.27user 0.25system 0:03.55elapsed 99%CPU (0avgtext+0avgdata 1032maxresident)k
0inputs+0outputs (0major+339minor)pagefaults 0swaps

/software/gawk/gawk-5.1.0/gawk:
5000000	1	198997911	rs530588549	T	C,G	.	.	dbSNP_153;TSA=SNV;E_Freq;E_1000G;E_TOPMed;MA=C;MAF=0.0002;MAC=1;AA=T;AFR=0,0;AMR=0.0014,0;EAS=0,0;EUR=0,0;SAS=0,0

2.86user 0.22system 0:03.12elapsed 99%CPU (0avgtext+0avgdata 1396maxresident)k
0inputs+0outputs (0major+451minor)pagefaults 0swaps

frawk -b cranelift:
5000000	1	198997911	rs530588549	T	C,G	.	.	dbSNP_153;TSA=SNV;E_Freq;E_1000G;E_TOPMed;MA=C;MAF=0.0002;MAC=1;AA=T;AFR=0,0;AMR=0.0014,0;EAS=0,0;EUR=0,0;SAS=0,0

5.47user 0.20system 0:05.69elapsed 99%CPU (0avgtext+0avgdata 5244maxresident)k
0inputs+0outputs (0major+1533minor)pagefaults 0swaps


# Reassign small colunmn 1 a static value.
❯ for AWK in "${MAWK}" "${GAWK}" "${FRAWK}" ; do echo "${AWK}:"; LANG=C time ${AWK}  -F '\t' 'BEGIN { OFS="\t"; } { count += 1; $1 = "9"; out = $0 } END { print count, out }' /dev/shm/head.5000000.txt ; echo ; done

mawk:
5000000	9	198997911	rs530588549	T	C,G	.	.	dbSNP_153;TSA=SNV;E_Freq;E_1000G;E_TOPMed;MA=C;MAF=0.0002;MAC=1;AA=T;AFR=0,0;AMR=0.0014,0;EAS=0,0;EUR=0,0;SAS=0,0

2.87user 0.27system 0:03.15elapsed 99%CPU (0avgtext+0avgdata 1032maxresident)k
0inputs+0outputs (0major+338minor)pagefaults 0swaps

/staging/leuven/stg_00002/lcb/ghuls/software/gawk/gawk-5.1.0/gawk:
5000000	9	198997911	rs530588549	T	C,G	.	.	dbSNP_153;TSA=SNV;E_Freq;E_1000G;E_TOPMed;MA=C;MAF=0.0002;MAC=1;AA=T;AFR=0,0;AMR=0.0014,0;EAS=0,0;EUR=0,0;SAS=0,0

2.75user 0.21system 0:02.99elapsed 98%CPU (0avgtext+0avgdata 1396maxresident)k
0inputs+0outputs (0major+452minor)pagefaults 0swaps

frawk -b cranelift:
5000000	9	198997911	rs530588549	T	C,G	.	.	dbSNP_153;TSA=SNV;E_Freq;E_1000G;E_TOPMed;MA=C;MAF=0.0002;MAC=1;AA=T;AFR=0,0;AMR=0.0014,0;EAS=0,0;EUR=0,0;SAS=0,0

5.24user 0.24system 0:05.50elapsed 99%CPU (0avgtext+0avgdata 5188maxresident)k
0inputs+0outputs (0major+1525minor)pagefaults 0swaps


# Reassign small colunmn 1 a static value.
❯ for AWK in "${MAWK}" "${GAWK}" "${FRAWK}" ; do echo "${AWK}:"; LANG=C time ${AWK}  -F '\t' 'BEGIN { OFS="\t"; } { count += 1; $1 = "9999999"; out = $0 } END { print count, out }' /dev/shm/head.5000000.txt ; echo ; done

mawk:
5000000	9999999	198997911	rs530588549	T	C,G	.	.	dbSNP_153;TSA=SNV;E_Freq;E_1000G;E_TOPMed;MA=C;MAF=0.0002;MAC=1;AA=T;AFR=0,0;AMR=0.0014,0;EAS=0,0;EUR=0,0;SAS=0,0

3.01user 0.23system 0:03.25elapsed 99%CPU (0avgtext+0avgdata 1028maxresident)k
0inputs+0outputs (0major+337minor)pagefaults 0swaps

/software/gawk/gawk-5.1.0/gawk:
5000000	9999999	198997911	rs530588549	T	C,G	.	.	dbSNP_153;TSA=SNV;E_Freq;E_1000G;E_TOPMed;MA=C;MAF=0.0002;MAC=1;AA=T;AFR=0,0;AMR=0.0014,0;EAS=0,0;EUR=0,0;SAS=0,0

2.72user 0.17system 0:02.94elapsed 98%CPU (0avgtext+0avgdata 1400maxresident)k
0inputs+0outputs (0major+453minor)pagefaults 0swaps

frawk -b cranelift:
5000000	9999999	198997911	rs530588549	T	C,G	.	.	dbSNP_153;TSA=SNV;E_Freq;E_1000G;E_TOPMed;MA=C;MAF=0.0002;MAC=1;AA=T;AFR=0,0;AMR=0.0014,0;EAS=0,0;EUR=0,0;SAS=0,0

5.52user 0.22system 0:05.79elapsed 99%CPU (0avgtext+0avgdata 5188maxresident)k
0inputs+0outputs (0major+1531minor)pagefaults 0swaps


# Reassign small colunmn 1 twice to itself.
❯ for AWK in "${MAWK}" "${GAWK}" "${FRAWK}" ; do echo "${AWK}:"; LANG=C time ${AWK}  -F '\t' 'BEGIN { OFS="\t"; } { count += 1; $1 = $1 $1; out = $0 } END { print count, out }' /dev/shm/head.5000000.txt ; echo ; done

mawk:
5000000	11	198997911	rs530588549	T	C,G	.	.	dbSNP_153;TSA=SNV;E_Freq;E_1000G;E_TOPMed;MA=C;MAF=0.0002;MAC=1;AA=T;AFR=0,0;AMR=0.0014,0;EAS=0,0;EUR=0,0;SAS=0,0

2.96user 0.24system 0:03.21elapsed 99%CPU (0avgtext+0avgdata 1028maxresident)k
0inputs+0outputs (0major+338minor)pagefaults 0swaps

/software/gawk/gawk-5.1.0/gawk:
5000000	11	198997911	rs530588549	T	C,G	.	.	dbSNP_153;TSA=SNV;E_Freq;E_1000G;E_TOPMed;MA=C;MAF=0.0002;MAC=1;AA=T;AFR=0,0;AMR=0.0014,0;EAS=0,0;EUR=0,0;SAS=0,0

3.16user 0.21system 0:03.40elapsed 99%CPU (0avgtext+0avgdata 1396maxresident)k
0inputs+0outputs (0major+450minor)pagefaults 0swaps

frawk -b cranelift:
5000000	11	198997911	rs530588549	T	C,G	.	.	dbSNP_153;TSA=SNV;E_Freq;E_1000G;E_TOPMed;MA=C;MAF=0.0002;MAC=1;AA=T;AFR=0,0;AMR=0.0014,0;EAS=0,0;EUR=0,0;SAS=0,0

5.69user 0.22system 0:05.94elapsed 99%CPU (0avgtext+0avgdata 5256maxresident)k
0inputs+0outputs (0major+1542minor)pagefaults 0swaps


# Reassign large colunmn 8 to itself.
❯ for AWK in "${MAWK}" "${GAWK}" "${FRAWK}" ; do echo "${AWK}:"; LANG=C time ${AWK}  -F '\t' 'BEGIN { OFS="\t"; } { count += 1; $8 = $8; out = $0 } END { print count, out }' /dev/shm/head.5000000.txt ; echo ; done
mawk:
5000000	1	198997911	rs530588549	T	C,G	.	.	dbSNP_153;TSA=SNV;E_Freq;E_1000G;E_TOPMed;MA=C;MAF=0.0002;MAC=1;AA=T;AFR=0,0;AMR=0.0014,0;EAS=0,0;EUR=0,0;SAS=0,0

3.08user 0.25system 0:03.34elapsed 99%CPU (0avgtext+0avgdata 1032maxresident)k
0inputs+0outputs (0major+338minor)pagefaults 0swaps

/software/gawk/gawk-5.1.0/gawk:
5000000	1	198997911	rs530588549	T	C,G	.	.	dbSNP_153;TSA=SNV;E_Freq;E_1000G;E_TOPMed;MA=C;MAF=0.0002;MAC=1;AA=T;AFR=0,0;AMR=0.0014,0;EAS=0,0;EUR=0,0;SAS=0,0

2.87user 0.23system 0:03.14elapsed 98%CPU (0avgtext+0avgdata 1392maxresident)k
0inputs+0outputs (0major+450minor)pagefaults 0swaps

frawk -b cranelift:
5000000	1	198997911	rs530588549	T	C,G	.	.	dbSNP_153;TSA=SNV;E_Freq;E_1000G;E_TOPMed;MA=C;MAF=0.0002;MAC=1;AA=T;AFR=0,0;AMR=0.0014,0;EAS=0,0;EUR=0,0;SAS=0,0

5.41user 0.22system 0:05.66elapsed 99%CPU (0avgtext+0avgdata 5232maxresident)k
0inputs+0outputs (0major+1536minor)pagefaults 0swaps


# Reassign large colunmn 8 twice to itself.
❯ for AWK in "${MAWK}" "${GAWK}" "${FRAWK}" ; do echo "${AWK}:"; LANG=C time ${AWK}  -F '\t' 'BEGIN { OFS="\t"; } { count += 1; $8 = $8 $8; out = $0 } END { print count, out }' /dev/shm/head.5000000.txt ; echo ; done

mawk:
5000000	1	198997911	rs530588549	T	C,G	.	.	dbSNP_153;TSA=SNV;E_Freq;E_1000G;E_TOPMed;MA=C;MAF=0.0002;MAC=1;AA=T;AFR=0,0;AMR=0.0014,0;EAS=0,0;EUR=0,0;SAS=0,0dbSNP_153;TSA=SNV;E_Freq;E_1000G;E_TOPMed;MA=C;MAF=0.0002;MAC=1;AA=T;AFR=0,0;AMR=0.0014,0;EAS=0,0;EUR=0,0;SAS=0,0

3.80user 0.21system 0:04.02elapsed 99%CPU (0avgtext+0avgdata 1032maxresident)k
0inputs+0outputs (0major+355minor)pagefaults 0swaps

/software/gawk/gawk-5.1.0/gawk:
5000000	1	198997911	rs530588549	T	C,G	.	.	dbSNP_153;TSA=SNV;E_Freq;E_1000G;E_TOPMed;MA=C;MAF=0.0002;MAC=1;AA=T;AFR=0,0;AMR=0.0014,0;EAS=0,0;EUR=0,0;SAS=0,0dbSNP_153;TSA=SNV;E_Freq;E_1000G;E_TOPMed;MA=C;MAF=0.0002;MAC=1;AA=T;AFR=0,0;AMR=0.0014,0;EAS=0,0;EUR=0,0;SAS=0,0

3.29user 0.19system 0:03.51elapsed 99%CPU (0avgtext+0avgdata 1396maxresident)k
0inputs+0outputs (0major+451minor)pagefaults 0swaps

frawk -b cranelift:
5000000	1	198997911	rs530588549	T	C,G	.	.	dbSNP_153;TSA=SNV;E_Freq;E_1000G;E_TOPMed;MA=C;MAF=0.0002;MAC=1;AA=T;AFR=0,0;AMR=0.0014,0;EAS=0,0;EUR=0,0;SAS=0,0dbSNP_153;TSA=SNV;E_Freq;E_1000G;E_TOPMed;MA=C;MAF=0.0002;MAC=1;AA=T;AFR=0,0;AMR=0.0014,0;EAS=0,0;EUR=0,0;SAS=0,0

5.74user 0.16system 0:05.93elapsed 99%CPU (0avgtext+0avgdata 5224maxresident)k
0inputs+0outputs (0major+1541minor)pagefaults 0swaps

At least simlple regex matching (with at least a literal in it) seems very fast (like expected when using the regex crate).

❯ for AWK in "${MAWK}" "${GAWK}" "${FRAWK}" ; do echo "${AWK}:"; LANG=C time ${AWK}  -F '\t' 'BEGIN { OFS="\t"; } { if ($3 ~ /^rs/ ) { count += 1; } } END { print count }' /dev/shm/head.5000000.txt ; echo ; done

mawk:
4999932
2.04user 0.27system 0:02.33elapsed 99%CPU (0avgtext+0avgdata 1028maxresident)k
0inputs+0outputs (0major+338minor)pagefaults 0swaps

/software/gawk/gawk-5.1.0/gawk:
4999932
1.36user 0.23system 0:01.66elapsed 96%CPU (0avgtext+0avgdata 1476maxresident)k
0inputs+0outputs (0major+433minor)pagefaults 0swaps

frawk -b cranelift:
4999932
0.87user 0.20system 0:01.07elapsed 99%CPU (0avgtext+0avgdata 5184maxresident)k
0inputs+0outputs (0major+1362minor)pagefaults 0swaps


❯ for AWK in "${MAWK}" "${GAWK}" "${FRAWK}" ; do echo "${AWK}:"; LANG=C time ${AWK}  -F '\t' 'BEGIN { OFS="\t"; } { if ($3 ~ /^rs[0-9]+$/ ) { count += 1; } } END { print count }' /dev/shm/head.5000000.txt ; echo ; done

mawk:
4999932
2.46user 0.28system 0:02.75elapsed 99%CPU (0avgtext+0avgdata 1036maxresident)k
0inputs+0outputs (0major+340minor)pagefaults 0swaps

/software/gawk/gawk-5.1.0/gawk:
4999932
1.44user 0.25system 0:01.76elapsed 96%CPU (0avgtext+0avgdata 1476maxresident)k
0inputs+0outputs (0major+433minor)pagefaults 0swaps

frawk -b cranelift:
4999932
0.95user 0.21system 0:01.18elapsed 99%CPU (0avgtext+0avgdata 5204maxresident)k
0inputs+0outputs (0major+1366minor)pagefaults 0swaps


vsc30366 in r23i27n23 in /s/l/s/lcb/jwout/Resources via 🐏 52GiB/756GiB | 1GiB/2GiB took 20s
❯ for AWK in "${MAWK}" "${GAWK}" "${FRAWK}" ; do echo "${AWK}:"; LANG=C time ${AWK}  -F '\t' 'BEGIN { OFS="\t"; } { if ($8 ~ /MAF/ ) { count += 1; } } END { print count }' /dev/shm/head.5000000.txt ; echo ; done
mawk:
4824871
2.07user 0.28system 0:02.36elapsed 99%CPU (0avgtext+0avgdata 1044maxresident)k
0inputs+0outputs (0major+341minor)pagefaults 0swaps

/software/gawk/gawk-5.1.0/gawk:
4824871
1.99user 0.23system 0:02.25elapsed 98%CPU (0avgtext+0avgdata 1476maxresident)k
0inputs+0outputs (0major+475minor)pagefaults 0swaps

frawk -b cranelift:
4824871
0.90user 0.20system 0:01.12elapsed 98%CPU (0avgtext+0avgdata 5188maxresident)k
0inputs+0outputs (0major+1363minor)pagefaults 0swaps


vsc30366 in r23i27n23 in /s/l/s/lcb/jwout/Resources via 🐏 52GiB/756GiB | 1GiB/2GiB took 31s
❯ for AWK in "${MAWK}" "${GAWK}" "${FRAWK}" ; do echo "${AWK}:"; LANG=C time ${AWK}  -F '\t' 'BEGIN { OFS="\t"; } { if ($8 ~ /MAF=[0-9.]+;/ ) { count += 1; } } END { print count }' /dev/shm/head.5000000.txt ; echo ; done

mawk:
4824871
2.46user 0.21system 0:02.67elapsed 99%CPU (0avgtext+0avgdata 1044maxresident)k
0inputs+0outputs (0major+342minor)pagefaults 0swaps

/software/gawk/gawk-5.1.0/gawk:
4824871
2.11user 0.19system 0:02.33elapsed 98%CPU (0avgtext+0avgdata 1480maxresident)k
0inputs+0outputs (0major+477minor)pagefaults 0swaps

frawk -b cranelift:
4824871
1.08user 0.20system 0:01.29elapsed 99%CPU (0avgtext+0avgdata 5180maxresident)k
0inputs+0outputs (0major+1364minor)pagefaults 0swaps


❯ for AWK in "${MAWK}" "${GAWK}" "${FRAWK}" ; do echo "${AWK}:"; LANG=C time ${AWK}  -F '\t' 'BEGIN { OFS="\t"; } { if ($1 ~ /^1$/ ) { count += 1; } } END { print count }' /dev/shm/head.5000000.txt ; echo ; done

mawk:
4999932
2.00user 0.30system 0:02.31elapsed 99%CPU (0avgtext+0avgdata 1024maxresident)k
0inputs+0outputs (0major+335minor)pagefaults 0swaps

/software/gawk/gawk-5.1.0/gawk:
4999932
1.25user 0.22system 0:01.52elapsed 97%CPU (0avgtext+0avgdata 1468maxresident)k
0inputs+0outputs (0major+431minor)pagefaults 0swaps

frawk -b cranelift:
4999932
0.90user 0.21system 0:01.12elapsed 99%CPU (0avgtext+0avgdata 5176maxresident)k
0inputs+0outputs (0major+1360minor)pagefaults 0swaps


vsc30366 in r23i27n23 in /s/l/s/lcb/jwout/Resources via 🐏 52GiB/756GiB | 1GiB/2GiB took 27s
❯ for AWK in "${MAWK}" "${GAWK}" "${FRAWK}" ; do echo "${AWK}:"; LANG=C time ${AWK}  -F '\t' 'BEGIN { OFS="\t"; } { if ($1 == "1" ) { count += 1; } } END { print count }' /dev/shm/head.5000000.txt ; echo ; done

mawk:
4999932
2.20user 0.24system 0:02.47elapsed 99%CPU (0avgtext+0avgdata 1032maxresident)k
0inputs+0outputs (0major+338minor)pagefaults 0swaps

/software/gawk/gawk-5.1.0/gawk:
4999932
1.27user 0.23system 0:01.53elapsed 98%CPU (0avgtext+0avgdata 1396maxresident)k
0inputs+0outputs (0major+413minor)pagefaults 0swaps

frawk -b cranelift:
4999932
0.80user 0.21system 0:01.01elapsed 99%CPU (0avgtext+0avgdata 5160maxresident)k
0inputs+0outputs (0major+1356minor)pagefaults 0swaps

@ghuls
Copy link
Contributor Author

ghuls commented Jan 21, 2021

With jemalloc the first benchmark is almost a second faster than without it, but still more than a second (almost 1,5 second) slower than mawk/gawk.

cargo +nightly build --release --no-default-features --features use_jemalloc,allow_avx2,unstable
❯ for AWK in "${MAWK}" "${GAWK}" "${FRAWK}" ; do echo "${AWK}:"; LANG=C time ${AWK}  -F '\t' 'BEGIN { OFS="\t"; } { count += 1; $1 = $1; out = $0 } END { print count, out }' /dev/shm/head.5000000.txt ; echo ; done

mawk:
5000000	1	198997911	rs530588549	T	C,G	.	.	dbSNP_153;TSA=SNV;E_Freq;E_1000G;E_TOPMed;MA=C;MAF=0.0002;MAC=1;AA=T;AFR=0,0;AMR=0.0014,0;EAS=0,0;EUR=0,0;SAS=0,0

3.40user 0.25system 0:03.67elapsed 99%CPU (0avgtext+0avgdata 1032maxresident)k
0inputs+0outputs (0major+339minor)pagefaults 0swaps

/software/gawk/gawk-5.1.0/gawk:
5000000	1	198997911	rs530588549	T	C,G	.	.	dbSNP_153;TSA=SNV;E_Freq;E_1000G;E_TOPMed;MA=C;MAF=0.0002;MAC=1;AA=T;AFR=0,0;AMR=0.0014,0;EAS=0,0;EUR=0,0;SAS=0,0

2.93user 0.26system 0:03.24elapsed 98%CPU (0avgtext+0avgdata 1400maxresident)k
0inputs+0outputs (0major+452minor)pagefaults 0swaps

frawk -b cranelift:
5000000	1	198997911	rs530588549	T	C,G	.	.	dbSNP_153;TSA=SNV;E_Freq;E_1000G;E_TOPMed;MA=C;MAF=0.0002;MAC=1;AA=T;AFR=0,0;AMR=0.0014,0;EAS=0,0;EUR=0,0;SAS=0,0

4.47user 0.23system 0:04.73elapsed 99%CPU (0avgtext+0avgdata 16936maxresident)k
0inputs+0outputs (0major+1448minor)pagefaults 0swaps


❯ for AWK in "${MAWK}" "${GAWK}" "${FRAWK}" ; do echo "${AWK}:"; LANG=C time ${AWK}  -F '\t' 'BEGIN { OFS="\t"; } { count += 1; $8 = $8; out = $0 } END { print count, out }' /dev/shm/head.5000000.txt ; echo ; done

mawk:
5000000	1	198997911	rs530588549	T	C,G	.	.	dbSNP_153;TSA=SNV;E_Freq;E_1000G;E_TOPMed;MA=C;MAF=0.0002;MAC=1;AA=T;AFR=0,0;AMR=0.0014,0;EAS=0,0;EUR=0,0;SAS=0,0

2.94user 0.20system 0:03.15elapsed 99%CPU (0avgtext+0avgdata 1036maxresident)k
0inputs+0outputs (0major+340minor)pagefaults 0swaps

/software/gawk/gawk-5.1.0/gawk:
5000000	1	198997911	rs530588549	T	C,G	.	.	dbSNP_153;TSA=SNV;E_Freq;E_1000G;E_TOPMed;MA=C;MAF=0.0002;MAC=1;AA=T;AFR=0,0;AMR=0.0014,0;EAS=0,0;EUR=0,0;SAS=0,0

2.70user 0.17system 0:03.01elapsed 95%CPU (0avgtext+0avgdata 1396maxresident)k
0inputs+0outputs (0major+451minor)pagefaults 0swaps

frawk -b cranelift:
5000000	1	198997911	rs530588549	T	C,G	.	.	dbSNP_153;TSA=SNV;E_Freq;E_1000G;E_TOPMed;MA=C;MAF=0.0002;MAC=1;AA=T;AFR=0,0;AMR=0.0014,0;EAS=0,0;EUR=0,0;SAS=0,0

4.42user 0.16system 0:04.60elapsed 99%CPU (0avgtext+0avgdata 16936maxresident)k
0inputs+0outputs (0major+1444minor)pagefaults 0swaps


❯ for AWK in "${MAWK}" "${GAWK}" "${FRAWK}" ; do echo "${AWK}:"; LANG=C time ${AWK}  -F '\t' 'BEGIN { OFS="\t"; } { count += 1; $1 = "9"; out = $0 } END { print count, out }' /dev/shm/head.5000000.txt ; echo ; done

mawk:
5000000	9	198997911	rs530588549	T	C,G	.	.	dbSNP_153;TSA=SNV;E_Freq;E_1000G;E_TOPMed;MA=C;MAF=0.0002;MAC=1;AA=T;AFR=0,0;AMR=0.0014,0;EAS=0,0;EUR=0,0;SAS=0,0

3.02user 0.29system 0:03.32elapsed 99%CPU (0avgtext+0avgdata 1032maxresident)k
0inputs+0outputs (0major+339minor)pagefaults 0swaps

/software/gawk/gawk-5.1.0/gawk:
5000000	9	198997911	rs530588549	T	C,G	.	.	dbSNP_153;TSA=SNV;E_Freq;E_1000G;E_TOPMed;MA=C;MAF=0.0002;MAC=1;AA=T;AFR=0,0;AMR=0.0014,0;EAS=0,0;EUR=0,0;SAS=0,0

2.73user 0.28system 0:03.07elapsed 97%CPU (0avgtext+0avgdata 1400maxresident)k
0inputs+0outputs (0major+453minor)pagefaults 0swaps

frawk -b cranelift:
5000000	9	198997911	rs530588549	T	C,G	.	.	dbSNP_153;TSA=SNV;E_Freq;E_1000G;E_TOPMed;MA=C;MAF=0.0002;MAC=1;AA=T;AFR=0,0;AMR=0.0014,0;EAS=0,0;EUR=0,0;SAS=0,0

4.66user 0.20system 0:04.88elapsed 99%CPU (0avgtext+0avgdata 16904maxresident)k
0inputs+0outputs (0major+1441minor)pagefaults 0swaps

@ghuls
Copy link
Contributor Author

ghuls commented Jan 21, 2021

It looks like gawk has an option to do the same than LANG=C gawk ..., gawk -b ... or gawk --characters-as-bytes.

@ezrosent
Copy link
Owner

okay; I think that's pretty interesting, thanks again for providing your benchmarks and measurements. That's a pretty strong signal that assigning into $1 is at least a decent portion of the total slowdown, though I suspect there's still something here that we're missing (perhaps string concatenation -- that's something that jemalloc will help more with). I'll be able to dig into this more in the next couple of days. I've got a few ideas for optimizations ranging from simple tweaks, to multi-week endeavors. I'll post periodic updates on my progress here. I'll also make sure I can produce ~comparable benchmark results to the ones in this thread before getting too far in.

@ghuls
Copy link
Contributor Author

ghuls commented Jan 22, 2021

It might be worth to use hyperfine for benchmarking whole frawk (and other versions of AWK):
https://github.com/sharkdp/hyperfine

reassign_column1 () {
    local AWK="${@}";

    ${AWK} -F '\t' 'BEGIN { OFS="\t"; } { count += 1; $1 = $1; out = $0 } END { print count, out }' /dev/shm/head.5000000.txt
}

export -f reassign_column1

# It is better to specify more runs.
hyperfine --min-runs 3 -L AWK "${FRAWK}","${MAWK}","${GAWK_locale}","${GAWK}","${GOAWK}" 'reassign_column1 {AWK}'

Hyperfine output:

Benchmark #1: reassign_column1 frawk
  Time (mean ± σ):      4.797 s ±  0.105 s    [User: 4.565 s, System: 0.220 s]
  Range (min … max):    4.691 s …  4.901 s    3 runs

Benchmark #2: reassign_column1 mawk
  Time (mean ± σ):      3.551 s ±  0.032 s    [User: 3.271 s, System: 0.270 s]
  Range (min … max):    3.516 s …  3.577 s    3 runs

Benchmark #3: reassign_column1 /software/gawk/gawk-5.1.0/gawk
  Time (mean ± σ):     13.124 s ±  0.735 s    [User: 12.826 s, System: 0.232 s]
  Range (min … max):   12.676 s … 13.972 s    3 runs

  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet PC without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.

Benchmark #4: reassign_column1 /software/gawk/gawk-5.1.0/gawk -b
  Time (mean ± σ):      3.088 s ±  0.058 s    [User: 2.824 s, System: 0.229 s]
  Range (min … max):    3.035 s …  3.150 s    3 runs

Benchmark #5: reassign_column1 /software/goawk/goawk
  Time (mean ± σ):     14.670 s ±  0.246 s    [User: 16.541 s, System: 0.779 s]
  Range (min … max):   14.412 s … 14.901 s    3 runs

Summary
  'reassign_column1 /software/gawk/gawk-5.1.0/gawk -b' ran
    1.15 ± 0.02 times faster than 'reassign_column1 mawk'
    1.55 ± 0.04 times faster than 'reassign_column1 frawk'
    4.25 ± 0.25 times faster than 'reassign_column1 /software/gawk/gawk-5.1.0/gawk'
    4.75 ± 0.12 times faster than 'reassign_column1 /software/goawk/goawk'

Criterion could be used to benchmark functions:
https://bheisler.github.io/criterion.rs/book/getting_started.html

@ezrosent
Copy link
Owner

I spent some time working on optimizing frawk for this case today. My efforts are in the speed_up_col_mutation branch. I've definitely improved things, but whether frawk is "better" on that branch depends on a number of factors. For all these benchmarks I'm using gawk 5.1.0, mawk 1.3.4 (compiled from source with -march=native), and goawk v1.6.1 (compiled with go get about a week ago).

I'll note that while gawk is noticeably faster for me with the -b option, that mawk still seems to be faster on my machine. I was able to reproduce similar margins compared with the frawk on master as the initial numbers posted in this thread, however.

Let me know if you think I've missed something here with the benchmark workloads or setup.

References to bench.awk refer to a file containing this program:

BEGIN { FS= "\t"; OFS="\t"; }
{
	if ($1 !~ /^#/) {
		if ($1 == "MT") {
			$1 = "chrM";
		} else {
			$1 = "chr" $1;
		}
	} else if ( $1 ~ /^##contig=<ID=/ ) {
		sub(/^##contig=<ID=/, "##contig=<ID=chr");
		sub(/^##contig=<ID=chrMT/, "##contig=<ID=chrM");
	} else if ( $0 == "##INFO=<ID=MAF,Number=1,Type=Float,Description=\"Minor Allele Frequency\">") {
		print "##INFO=<ID=AF,Number=1,Type=Float,Description=\"Allele Frequency\">";
	}

	print $0;
}

Optimizing the benchmark

One thing I noticed for the benchmark is that it could be optimized substantially (for frawk, but also for gawk and mawk). When you assign into $1, but there are several other columns, printing $0 requires it to be rebuilt in terms of all the other columns:

$0 = $1 OFS $2 OFS $3 OFS ...

In our case, OFS == FS, and FS has a known size, so this rebuilding is unnecessary. We could instead do the following (which shows up in the benchmark output as bench.3.awk).

# Note: the n=$1 doesn't seem to make much of a difference vs. referencing $1
# directly, this is just the program I ran.
BEGIN { FS= "\t"; OFS="\t"; }
{
	    n=$1
            if (n !~ /^#/) {
                if (n == "MT") {
                    nn = "chrM";
                } else {
                    nn = "chr" n;
                }
		print nn, substr($0, length(nn) - 1);
		next;
            } else if ( n ~ /^##contig=<ID=/ ) {
                sub(/^##contig=<ID=/, "##contig=<ID=chr");
                sub(/^##contig=<ID=chrMT/, "##contig=<ID=chrM");
            } else if ( $0 == "##INFO=<ID=MAF,Number=1,Type=Float,Description=\"Minor Allele Frequency\">") {
                print "##INFO=<ID=AF,Number=1,Type=Float,Description=\"Allele Frequency\">";
            }

            print $0;
}

This script should produce the exact same output as the initial one (and it does seem to), but it executes noticeably faster for gawk and frawk, and slightly faster for mawk. Execution is about the same for goawk.

The effects of gzip compression

I see some utilities <100% CPU, suggesting to me that the total runtime is bound by zcat's decompression. I'll note that frawk is still a few seconds slower here in terms of wall time, which is probably an issue with input IO management and lingering slowness for some string ops. I'm looking into that. These times are with input piped in by zcat ./1000GENOMES-phase_3.vcf.gz

+ time ./goawk -fbench.awk
150.36user 8.20system 2:19.13elapsed 113%CPU (0avgtext+0avgdata 11348maxresident)k
0inputs+0outputs (0major+32336minor)pagefaults 0swaps
+ time ./mawk -fbench.awk
57.30user 3.24system 1:05.73elapsed 92%CPU (0avgtext+0avgdata 2680maxresident)k
0inputs+0outputs (0major+344minor)pagefaults 0swaps
+ time ./gawk -b -fbench.awk
67.78user 3.13system 1:13.06elapsed 97%CPU (0avgtext+0avgdata 3840maxresident)k
0inputs+0outputs (0major+1005minor)pagefaults 0swaps
+ time ./frawk -bcranelift -fbench.awk
78.73user 4.76system 1:18.82elapsed 105%CPU (0avgtext+0avgdata 31280maxresident)k
0inputs+0outputs (0major+9076minor)pagefaults 0swaps
+ time ./frawk -bllvm -fbench.awk
76.28user 4.55system 1:15.41elapsed 107%CPU (0avgtext+0avgdata 48796maxresident)k
0inputs+0outputs (0major+7425minor)pagefaults 0swaps

+ time ./goawk -fbench.3.awk
150.19user 8.47system 2:19.18elapsed 113%CPU (0avgtext+0avgdata 11396maxresident)k
0inputs+0outputs (0major+28059minor)pagefaults 0swaps
+ time ./mawk -fbench.3.awk
57.80user 3.39system 1:04.84elapsed 94%CPU (0avgtext+0avgdata 2796maxresident)k
0inputs+0outputs (0major+218minor)pagefaults 0swaps
+ time ./gawk -b -fbench.3.awk
56.54user 3.57system 1:05.70elapsed 91%CPU (0avgtext+0avgdata 3724maxresident)k
0inputs+0outputs (0major+1266minor)pagefaults 0swaps
+ time ./frawk -bcranelift -fbench.3.awk
38.69user 5.21system 1:04.87elapsed 67%CPU (0avgtext+0avgdata 31120maxresident)k
0inputs+0outputs (0major+3398minor)pagefaults 0swaps
+ time ./frawk -bllvm -fbench.3.awk
34.31user 5.13system 1:04.52elapsed 61%CPU (0avgtext+0avgdata 49188maxresident)k
0inputs+0outputs (0major+4452minor)pagefaults 0swaps

We can see that for the first benchmark, frawk is marginally slower than gawk and mawk, where the latter two are using less than 100% CPU. For the optimized benchmark, they all use about the same amount of time, but frawk is using much less CPU. The rest of its time is spent idle waiting for compression. The time for bench.awk is pretty close to the time taken for for the optimized bench.3.awk. While I think there is still opportunity here to optimize frawk, I think this benchmark suggests that decompression is the major bottleneck here. It appears that mawk is already faster than gzip decompression, for example.

Running the benchmark with zstd compression

To benchmark the CPU efficiency of the different awks, I re-compressed the data with zstd compression. Zstd at level 7 produces a smaller file than gzip on this data-set while decompressing ~3-4x faster.

zcat 1000GENOMES-phase_3.vcf.gz | zstd -7 -o 1000GENOMES-phase_3.vcf.zst

I understand if you don't have control over which compression codec is being used to generate the inputs here (e.g. you get the data from someone else, or it is generated by a utility that you cannot modify), but I don't think that awk optimizations will get you performance beyond gawk -b or mawk. As I mentioned above, at least on on my workstation they're occasionally producing output faster (albeit while writing to /dev/null) than zcat can decompress the input. Here are results for the benchmark using zstdcat 1000GENOMES-phase_3.vcf.zst as input.

+ time /home/eli/go/bin/goawk -fbench.awk
147.95user 7.59system 2:19.51elapsed 111%CPU (0avgtext+0avgdata 11344maxresident)k
0inputs+0outputs (0major+14499minor)pagefaults 0swaps
+ time ./mawk -fbench.awk
58.08user 3.85system 1:01.95elapsed 99%CPU (0avgtext+0avgdata 2652maxresident)k
0inputs+0outputs (0major+422minor)pagefaults 0swaps
+ time ./gawk -b -fbench.awk
67.77user 3.59system 1:11.37elapsed 99%CPU (0avgtext+0avgdata 3776maxresident)k
0inputs+0outputs (0major+287minor)pagefaults 0swaps
+ time ./frawk -bcranelift -fbench.awk
75.61user 5.08system 1:15.87elapsed 106%CPU (0avgtext+0avgdata 30988maxresident)k
0inputs+0outputs (0major+8195minor)pagefaults 0swaps
+ time ./frawk -bllvm -fbench.awk
71.78user 4.81system 1:11.52elapsed 107%CPU (0avgtext+0avgdata 49036maxresident)k
0inputs+0outputs (0major+4833minor)pagefaults 0swaps


+ time /home/eli/go/bin/goawk -fbench.3.awk
147.27user 8.12system 2:19.07elapsed 111%CPU (0avgtext+0avgdata 11352maxresident)k
0inputs+0outputs (0major+20516minor)pagefaults 0swaps
+ time ./mawk -fbench.3.awk
58.83user 3.83system 1:02.67elapsed 99%CPU (0avgtext+0avgdata 2740maxresident)k
0inputs+0outputs (0major+718minor)pagefaults 0swaps
+ time ./gawk -b -fbench.3.awk
54.86user 3.56system 0:58.43elapsed 99%CPU (0avgtext+0avgdata 3844maxresident)k
0inputs+0outputs (0major+1576minor)pagefaults 0swaps
+ time ./frawk -bcranelift -fbench.3.awk
37.94user 4.41system 0:38.66elapsed 109%CPU (0avgtext+0avgdata 31636maxresident)k
0inputs+0outputs (0major+4010minor)pagefaults 0swaps
+ time ./frawk -bllvm -fbench.3.awk
34.33user 4.46system 0:36.73elapsed 105%CPU (0avgtext+0avgdata 49192maxresident)k
0inputs+0outputs (0major+4721minor)pagefaults 0swaps

My read on this is that frawk is definitely still slower on the original script than mawk, though the margin is now much slimmer than it was before implementing these optimizations. As for the alternative benchmark, frawk is ahead by a substantial margin.

Scripts for running benchmarks

I only include a single iteration as they take several minutes to run, but the times here are pretty stable. I'll use multiple iterations (or a more sophisticated approach like hyperfine) before updating any benchmark docs.

gzip

Output listed above omits the redundant + zcat ./1000GENOMES-phase_3.vcf.gz, set +x, and adds newlines to separate for loop iterations.

for bench in bench.awk bench.3.awk; do
	set -x
	zcat ./1000GENOMES-phase_3.vcf.gz | time ./goawk -f$bench  > /dev/null
	zcat ./1000GENOMES-phase_3.vcf.gz | time ./mawk -f$bench  > /dev/null
	zcat ./1000GENOMES-phase_3.vcf.gz | time ./gawk -b -f$bench  > /dev/null
	zcat ./1000GENOMES-phase_3.vcf.gz | time ./frawk -bcranelift -f$bench  > /dev/null
	zcat ./1000GENOMES-phase_3.vcf.gz | time ./frawk -bllvm -f$bench  > /dev/null
	set +x
done

zstd

Output listed above omits the redundant + zstdcat ./1000GENOMES-phase_3.vcf.gz, set +x, and adds newlines to separate for loop iterations. lines.

#!/bin/bash

for bench in bench.awk bench.3.awk; do
	set -x
	zstdcat ./1000GENOMES-phase_3.vcf.zst | time ~/go/bin/goawk -f$bench  > /dev/null
	zstdcat ./1000GENOMES-phase_3.vcf.zst | time ./mawk -f$bench  > /dev/null
	zstdcat ./1000GENOMES-phase_3.vcf.zst | time ./gawk -b -f$bench  > /dev/null
	zstdcat ./1000GENOMES-phase_3.vcf.zst | time ./frawk -bcranelift -f$bench  > /dev/null
	zstdcat ./1000GENOMES-phase_3.vcf.zst | time ./frawk -bllvm -f$bench  > /dev/null
	set +x
done

@ezrosent
Copy link
Owner

I added a few more changes and now frawk does a bit better on the first benchmark. I've included gzip and zstd measurements. Times for bench.3.awk are about the same. I still have some work to do to validate some of the changes in this branch but I should be able to merge it this week.

gzip

Where you can see that frawk is now io-bound, and is using noticeably less than 100% CPU (despite frawk naturally using >100% CPU on output-heavy scripts, as writes happen in a separate thread).

+ time ./goawk -fbench.awk
150.25user 8.42system 2:19.47elapsed 113%CPU (0avgtext+0avgdata 11712maxresident)k
0inputs+0outputs (0major+20739minor)pagefaults 0swaps
+ time ./mawk -fbench.awk
57.48user 3.36system 1:06.81elapsed 91%CPU (0avgtext+0avgdata 2832maxresident)k
0inputs+0outputs (0major+417minor)pagefaults 0swaps
+ time ./gawk -b -fbench.awk
66.65user 3.20system 1:10.56elapsed 99%CPU (0avgtext+0avgdata 3832maxresident)k
0inputs+0outputs (0major+697minor)pagefaults 0swaps
+ time ./frawk -bcranelift -fbench.awk
63.40user 4.59system 1:12.57elapsed 93%CPU (0avgtext+0avgdata 31612maxresident)k
0inputs+0outputs (0major+3585minor)pagefaults 0swaps
+ time ./frawk -bllvm -fbench.awk
54.89user 4.94system 1:07.73elapsed 88%CPU (0avgtext+0avgdata 48940maxresident)k
0inputs+0outputs (0major+6186minor)pagefaults 0swaps

zstd

For the compute-bound configuration, frawk is now a bit faster than mawk.

+ time /home/eli/go/bin/goawk -fbench.awk
147.23user 8.02system 2:19.36elapsed 111%CPU (0avgtext+0avgdata 11388maxresident)k
0inputs+0outputs (0major+19853minor)pagefaults 0swaps
+ time ./mawk -fbench.awk
59.70user 3.79system 1:03.51elapsed 99%CPU (0avgtext+0avgdata 2716maxresident)k
0inputs+0outputs (0major+820minor)pagefaults 0swaps
+ time ./gawk -b -fbench.awk
68.42user 3.49system 1:11.93elapsed 99%CPU (0avgtext+0avgdata 3984maxresident)k
0inputs+0outputs (0major+1409minor)pagefaults 0swaps
+ time ./frawk -bcranelift -fbench.awk
54.53user 4.80system 0:54.81elapsed 108%CPU (0avgtext+0avgdata 30844maxresident)k
0inputs+0outputs (0major+3747minor)pagefaults 0swaps
+ time ./frawk -bllvm -fbench.awk
61.32user 4.64system 0:59.44elapsed 110%CPU (0avgtext+0avgdata 49508maxresident)k
0inputs+0outputs (0major+6304minor)pagefaults 0swaps

@ghuls
Copy link
Contributor Author

ghuls commented Jan 25, 2021

I used your speed_up_col_mutation branch to benchmark my previous benchmark (first 5 milion lines of the zcatted VCF file written to /dev/shm first). Now frawk is faster than gawk and mawk. Great work!

I also recompiled mawk, to make sure it was with -march=native but on this server (with 24MB cache) it stayed slower than gawk. As this server is running CentOS7m glibc is not very recent. So I had the idea to run mawk and gawk also with jemalloc (like frawk). It improved the results of mawk a lot.

It would be nice to have the jemalloc comparisons for mawk and gawk also in your benchmark. I assume the difference with a more recent glibc will be not so great as in mine.

The benchmarking should be done without zcat as it is the bottleneck if the AWK script does not a lot of work.

Programs and settings:

# Compile jemalloc:
git clone https://github.com/jemalloc/jemalloc jemalloc-git
cd jemalloc-git
./autogen.sh --prefix=/software/jemalloc/jemalloc_install EXTRA_CFLAGS='-march=native' EXTRA_CXXFLAGS='-march=native'
make
make install


# frawk: cargo +nightly build --release --no-default-features --features use_jemalloc,allow_avx2,unstable
FRAWK='frawk'
MAWK='/software/mawk/mawk-1.3.4-20200120/mawk'
MAWK_jemalloc='/software/jemalloc/jemalloc_install/bin/jemalloc.sh /software/mawk/mawk-1.3.4-20200120/mawk'
GAWK='/software/gawk/gawk-5.1.0/gawk -b'
GAWK_jemalloc='/software/jemalloc/jemalloc_install/bin/jemalloc.sh /software/gawk/gawk-5.1.0/gawk -b'
GAWK_locale='/software/gawk/gawk-5.1.0/gawk'
GAWK_locale_jemalloc='/software/jemalloc/jemalloc_install/bin/jemalloc.sh /software/gawk/gawk-5.1.0/gawk'
GOAWK='/software/goawk/goawk'
zcat 1000GENOMES-phase_3.vcf.gz | head -n 5000000 > /dev/shm/head.5000000.txt

reassign_column1 () {
    local AWK="${@}";

    ${AWK} -F '\t' 'BEGIN { OFS="\t"; } { count += 1; $1 = $1; out = $0 } END { print count, out }' /dev/shm/head.5000000.txt
}

export -f reassign_column1


reassign_column8 () {
    local AWK="${@}";

    ${AWK} -F '\t' 'BEGIN { OFS="\t"; } { count += 1; $8 = $8; out = $0 } END { print count, out }' /dev/shm/head.5000000.txt
}

export -f reassign_column8


reassign_column1and8_double () {
    local AWK="${@}";

    ${AWK} -F '\t' 'BEGIN { OFS="\t"; } { count += 1; $1 = $1 $1; $8 = $8 $8; out = $0 } END { print count, out }' /dev/shm/head.5000000.txt
}

export -f reassign_column1and8_double
❯ hyperfine --min-runs 3 -L AWK "${FRAWK}","${MAWK}","${MAWK_jemalloc}","${GAWK}","${GAWK_jemalloc}","${GAWK_locale}","${GAWK_locale_jemalloc}","${GOAWK}" 'reassign_column1 {AWK}'                     [95/515]
Benchmark #1: reassign_column1 frawk
  Time (mean ± σ):      2.088 s ±  0.098 s    [User: 1.827 s, System: 0.202 s]
  Range (min … max):    2.019 s …  2.200 s    3 runs

Benchmark #2: reassign_column1 /software/mawk/mawk-1.3.4-20200120/mawk
  Time (mean ± σ):      3.367 s ±  0.019 s    [User: 3.106 s, System: 0.247 s]
  Range (min … max):    3.347 s …  3.385 s    3 runs

Benchmark #3: reassign_column1 /software/jemalloc/jemalloc_install/bin/jemalloc.sh /software/mawk/mawk-1.3.4-20200120/mawk
  Time (mean ± σ):      2.824 s ±  0.140 s    [User: 2.524 s, System: 0.281 s]
  Range (min … max):    2.705 s …  2.978 s    3 runs

Benchmark #4: reassign_column1 /software/gawk/gawk-5.1.0/gawk -b
  Time (mean ± σ):      3.009 s ±  0.078 s    [User: 2.754 s, System: 0.235 s]
  Range (min … max):    2.934 s …  3.089 s    3 runs

Benchmark #5: reassign_column1 /software/jemalloc/jemalloc_install/bin/jemalloc.sh /software/gawk/gawk-5.1.0/gawk -b
  Time (mean ± σ):      2.785 s ±  0.025 s    [User: 2.546 s, System: 0.231 s]
  Range (min … max):    2.756 s …  2.802 s    3 runs

Benchmark #6: reassign_column1 /software/gawk/gawk-5.1.0/gawk
  Time (mean ± σ):     12.267 s ±  0.029 s    [User: 12.007 s, System: 0.230 s]
  Range (min … max):   12.243 s … 12.299 s    3 runs

Benchmark #7: reassign_column1 /software/jemalloc/jemalloc_install/bin/jemalloc.sh /software/gawk/gawk-5.1.0/gawk
  Time (mean ± σ):     12.503 s ±  0.540 s    [User: 12.226 s, System: 0.247 s]
  Range (min … max):   12.123 s … 13.120 s    3 runs

Benchmark #8: reassign_column1 /software/goawk/goawk
  Time (mean ± σ):     14.543 s ±  0.094 s    [User: 16.404 s, System: 0.758 s]
  Range (min … max):   14.481 s … 14.651 s    3 runs

Summary
  'reassign_column1 frawk' ran
    1.33 ± 0.06 times faster than 'reassign_column1 /software/jemalloc/jemalloc_install/bin/jemalloc.sh /software/gawk/gawk-5.1.0/gawk -b'
    1.35 ± 0.09 times faster than 'reassign_column1 /software/jemalloc/jemalloc_install/bin/jemalloc.sh /software/mawk/mawk-1.3.4-20200120
/mawk'
    1.44 ± 0.08 times faster than 'reassign_column1 /software/gawk/gawk-5.1.0/gawk -b'
    1.61 ± 0.08 times faster than 'reassign_column1 /software/mawk/mawk-1.3.4-20200120/mawk'
    5.87 ± 0.28 times faster than 'reassign_column1 /software/gawk/gawk-5.1.0/gawk'
    5.99 ± 0.38 times faster than 'reassign_column1 /software/jemalloc/jemalloc_install/bin/jemalloc.sh /software/gawk/gawk-5.1.0/gawk'
    6.96 ± 0.33 times faster than 'reassign_column1 /software/goawk/goawk'
❯ hyperfine --min-runs 3 -L AWK "${FRAWK}","${MAWK}","${MAWK_jemalloc}","${GAWK}","${GAWK_jemalloc}","${GAWK_locale}","${GAWK_locale_jemalloc}","${GOAWK}" 'reassign_column8 {AWK}'
Benchmark #1: reassign_column8 frawk
  Time (mean ± σ):      2.041 s ±  0.012 s    [User: 1.868 s, System: 0.169 s]
  Range (min … max):    2.029 s …  2.053 s    3 runs

Benchmark #2: reassign_column8 /software/mawk/mawk-1.3.4-20200120/mawk
  Time (mean ± σ):      3.166 s ±  0.036 s    [User: 2.908 s, System: 0.250 s]
  Range (min … max):    3.142 s …  3.208 s    3 runs

Benchmark #3: reassign_column8 /software/jemalloc/jemalloc_install/bin/jemalloc.sh /software/mawk/mawk-1.3.4-20200120/mawk
  Time (mean ± σ):      2.606 s ±  0.010 s    [User: 2.305 s, System: 0.285 s]
  Range (min … max):    2.595 s …  2.616 s    3 runs

Benchmark #4: reassign_column8 /software/gawk/gawk-5.1.0/gawk -b
  Time (mean ± σ):      2.935 s ±  0.034 s    [User: 2.706 s, System: 0.221 s]
  Range (min … max):    2.909 s …  2.974 s    3 runs

Benchmark #5: reassign_column8 /software/jemalloc/jemalloc_install/bin/jemalloc.sh /software/gawk/gawk-5.1.0/gawk -b
  Time (mean ± σ):      2.919 s ±  0.184 s    [User: 2.663 s, System: 0.248 s]
  Range (min … max):    2.773 s …  3.126 s    3 runs

Benchmark #6: reassign_column8 /software/gawk/gawk-5.1.0/gawk
  Time (mean ± σ):     12.476 s ±  0.211 s    [User: 12.236 s, System: 0.210 s]
  Range (min … max):   12.339 s … 12.719 s    3 runs

Benchmark #7: reassign_column8 /software/jemalloc/jemalloc_install/bin/jemalloc.sh /software/gawk/gawk-5.1.0/gawk
  Time (mean ± σ):     12.678 s ±  0.317 s    [User: 12.400 s, System: 0.248 s]
  Range (min … max):   12.367 s … 13.002 s    3 runs

Benchmark #8: reassign_column8 /software/goawk/goawk
  Time (mean ± σ):     15.023 s ±  0.075 s    [User: 16.992 s, System: 0.806 s]
  Range (min … max):   14.939 s … 15.083 s    3 runs

Summary
  'reassign_column8 frawk' ran
    1.28 ± 0.01 times faster than 'reassign_column8 /software/jemalloc/jemalloc_install/bin/jemalloc.sh /software/mawk/mawk-1.3.4-20200120
/mawk'
    1.43 ± 0.09 times faster than 'reassign_column8 /software/jemalloc/jemalloc_install/bin/jemalloc.sh /software/gawk/gawk-5.1.0/gawk -b'
    1.44 ± 0.02 times faster than 'reassign_column8 /software/gawk/gawk-5.1.0/gawk -b'
    1.55 ± 0.02 times faster than 'reassign_column8 /software/mawk/mawk-1.3.4-20200120/mawk'
    6.11 ± 0.11 times faster than 'reassign_column8 /software/gawk/gawk-5.1.0/gawk'
    6.21 ± 0.16 times faster than 'reassign_column8 /software/jemalloc/jemalloc_install/bin/jemalloc.sh /software/gawk/gawk-5.1.0/gawk'
    7.36 ± 0.06 times faster than 'reassign_column8 /software/goawk/goawk'
❯ hyperfine --min-runs 3 -L AWK "${FRAWK}","${MAWK}","${MAWK_jemalloc}","${GAWK}","${GAWK_jemalloc}","${GAWK_locale}","${GAWK_locale_jemalloc}","${GOAWK}" 'reassign_column1and8_double {AWK}'
Benchmark #1: reassign_column1and8_double frawk
  Time (mean ± σ):      3.364 s ±  0.032 s    [User: 3.160 s, System: 0.195 s]
  Range (min … max):    3.328 s …  3.389 s    3 runs

Benchmark #2: reassign_column1and8_double /software/mawk/mawk-1.3.4-20200120/mawk
  Time (mean ± σ):      4.913 s ±  0.031 s    [User: 4.682 s, System: 0.218 s]
  Range (min … max):    4.877 s …  4.936 s    3 runs

Benchmark #3: reassign_column1and8_double /software/jemalloc/jemalloc_install/bin/jemalloc.sh /software/mawk/mawk-1.3.4-20200120/mawk
  Time (mean ± σ):      3.650 s ±  0.103 s    [User: 3.370 s, System: 0.258 s]
  Range (min … max):    3.575 s …  3.767 s    3 runs

Benchmark #4: reassign_column1and8_double /software/gawk/gawk-5.1.0/gawk -b
  Time (mean ± σ):      3.904 s ±  0.005 s    [User: 3.650 s, System: 0.243 s]
  Range (min … max):    3.901 s …  3.910 s    3 runs

Benchmark #5: reassign_column1and8_double /software/jemalloc/jemalloc_install/bin/jemalloc.sh /software/gawk/gawk-5.1.0/gawk -b
  Time (mean ± σ):      3.523 s ±  0.044 s    [User: 3.277 s, System: 0.223 s]
  Range (min … max):    3.486 s …  3.571 s    3 runs

Benchmark #6: reassign_column1and8_double /software/gawk/gawk-5.1.0/gawk
  Time (mean ± σ):     13.673 s ±  0.317 s    [User: 13.411 s, System: 0.229 s]
  Range (min … max):   13.476 s … 14.038 s    3 runs

  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet PC without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.

Benchmark #7: reassign_column1and8_double /software/jemalloc/jemalloc_install/bin/jemalloc.sh /software/gawk/gawk-5.1.0/gawk
  Time (mean ± σ):     13.384 s ±  0.260 s    [User: 13.088 s, System: 0.239 s]
  Range (min … max):   13.114 s … 13.632 s    3 runs

Benchmark #8: reassign_column1and8_double /software/goawk/goawk
  Time (mean ± σ):     20.781 s ±  0.299 s    [User: 23.888 s, System: 1.046 s]
  Range (min … max):   20.439 s … 20.994 s    3 runs

Summary
  'reassign_column1and8_double frawk' ran
    1.05 ± 0.02 times faster than 'reassign_column1and8_double /software/jemalloc/jemalloc_install/bin/jemalloc.sh /software/gawk/gawk-5.1.0/gawk -b'
    1.09 ± 0.03 times faster than 'reassign_column1and8_double /software/jemalloc/jemalloc_install/bin/jemalloc.sh /software/mawk/mawk-1.3.4-20200120/mawk'
    1.16 ± 0.01 times faster than 'reassign_column1and8_double /software/gawk/gawk-5.1.0/gawk -b'
    1.46 ± 0.02 times faster than 'reassign_column1and8_double /software/mawk/mawk-1.3.4-20200120/mawk'
    3.98 ± 0.09 times faster than 'reassign_column1and8_double /software/jemalloc/jemalloc_install/bin/jemalloc.sh /software/gawk/gawk-5.1.0/gawk'
    4.07 ± 0.10 times faster than 'reassign_column1and8_double /software/gawk/gawk-5.1.0/gawk'
    6.18 ± 0.11 times faster than 'reassign_column1and8_double /software/goawk/goawk'

@ezrosent
Copy link
Owner

Just a quick update. The changes in this branch exposed a preexisting bug in frawk that I spent a good amount of time on tracking down. I believe it is now fixed, and I should be able to have this branch merged soon.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants