Skip to content

JuliaSIMD/VectorizedStatistics.jl

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

VectorizedStatistics

Dev Build Status codecov.io

Fast, LoopVectorization.jl-based summary statistics.

Implemented by reduction, recursively (singlethreaded only)

  • vminimum
  • vmaximum
  • vextrema

Implemented directly by compile-time loop generation or manually-coded loops (optionally multithreaded)

  • vmean
  • vsum
  • vvar
  • vstd
  • vcov
  • vcor

Implemented via quicksort/quickselect (some easy steps vectorized), with multidimensional reductions handled by compile-time loop generation

  • vsort!
  • vmedian!
  • vquantile!
  • vpercentile!

See also

  • NaNStatistics.jl for equivalently-vectorized functions that additionally ignore NaNs

Examples and benchmarks

As of Julia v1.8.3, VectorizedStatistics v0.5.0

vminimum/vmaximum (implemented by recursive vreduce)
julia> using Statistics, VectorizedStatistics, BenchmarkTools

julia> A = rand(10_000);

julia> minimum(A) == vminimum(A)
true

julia> @benchmark minimum($A)
BenchmarkTools.Trial: 10000 samples with 5 evaluations.
 Range (min  max):  6.400 μs   17.850 μs  ┊ GC (min  max): 0.00%  0.00%
 Time  (median):     6.692 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   6.677 μs ± 426.730 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▃▁▅▇▂    ▇█▅                                                ▂
  █████▆▆▇▇█████▆▆▄▅▄▃▅▄▅▃▃▁▄▁▅▅▄▁▄▃▄▄▃▁▄▅▄▄▄▄▃▁▄▁▄▃▄▄▄▃▄▄▄▄▃ █
  6.4 μs       Histogram: log(frequency) by time      8.13 μs <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark vminimum($A)

BenchmarkTools.Trial: 10000 samples with 190 evaluations.
 Range (min  max):  532.237 ns  760.084 ns  ┊ GC (min  max): 0.00%  0.00%
 Time  (median):     555.921 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   551.762 ns ±  14.327 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▅▃   ▆▅           ▅█▅▂                                        ▂
  ████▇███▆▅▅▆▆▆▆▇▆▅████▇▇▇▇▆▆▄▄▇▆▅▅▅▅▆▆▅▄▅▆▄▄▅▅▄▅▃▄▄▄▃▁▁▃▃▃▃▄▃ █
  532 ns        Histogram: log(frequency) by time        608 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> A = rand(11, 12, 13, 14);

julia> minimum(A, dims=(1,3,4)) == vminimum(A, dims=(1,3,4))
true

julia> @benchmark minimum($A, dims=(1,3,4))
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min  max):  45.083 μs  445.208 μs  ┊ GC (min  max): 0.00%  0.00%
 Time  (median):     47.166 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   47.126 μs ±   5.362 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▄▄▄▅▂    ▅█▅▂                                                ▂
  ██████▆▇▇█████▇▇▇▆▅▅▅▅▅▅▅▅▄▄▅▅▄▅▆▅▅▅▄▅▄▅▃▄▁▁▄▅▃▄▄▃▃▃▃▃▄▄▄▃▁▃ █
  45.1 μs       Histogram: log(frequency) by time      57.2 μs <

 Memory estimate: 816 bytes, allocs estimate: 18.

julia> @benchmark vminimum($A, dims=(1,3,4))
BenchmarkTools.Trial: 10000 samples with 7 evaluations.
 Range (min  max):  4.673 μs  569.113 μs  ┊ GC (min  max):  0.00%  98.82%
 Time  (median):     5.833 μs               ┊ GC (median):     0.00%
 Time  (mean ± σ):   6.639 μs ±  19.905 μs  ┊ GC (mean ± σ):  11.21% ±  3.70%

             ▁▂▄▇██▅▂
  ▆▁▁▁▁▁▁▁▁▂▄████████▇▅▄▂▂▂▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▂
  4.67 μs         Histogram: frequency by time        9.04 μs <

 Memory estimate: 18.89 KiB, allocs estimate: 7.
vmean, vstd, vvar, etc. (implemented by direct loop generation)
julia> A = rand(11, 12, 13, 14);

julia> mean(A, dims=(1,3,4))  vmean(A, dims=(1,3,4))
true

julia> @benchmark mean($A, dims=(1,3,4))
BenchmarkTools.Trial: 10000 samples with 5 evaluations.
 Range (min  max):  6.350 μs   13.800 μs  ┊ GC (min  max): 0.00%  0.00%
 Time  (median):     6.417 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   6.461 μs ± 224.303 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▃▆█▇█▆▇▅▆▄▅▃▄▂▃▁▂▁▂▂▂▃▂▃▁▂▁▂▁▂ ▁ ▁                          ▃
  ████████████████████████████████▇█▆█▇▇▅▇▆▆▆▄▅▃▆▅▅▁▆▁▄▃▁▁▅▃▅ █
  6.35 μs      Histogram: log(frequency) by time      7.08 μs <

 Memory estimate: 976 bytes, allocs estimate: 14.

julia> @benchmark vmean($A, dims=(1,3,4))
BenchmarkTools.Trial: 10000 samples with 7 evaluations.
 Range (min  max):  5.012 μs   7.696 μs  ┊ GC (min  max): 0.00%  0.00%
 Time  (median):     5.137 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   5.147 μs ± 75.912 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

                 ▁▁▂▁█
  ▂▂▁▂▂▂▂▂▃▃▃▄█████████▇▆▅▄▅▃▃▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▁▂▂▂ ▃
  5.01 μs        Histogram: frequency by time        5.41 μs <

 Memory estimate: 272 bytes, allocs estimate: 4.

julia> A = rand(10_000);

julia> @benchmark mean($A)
BenchmarkTools.Trial: 10000 samples with 10 evaluations.
 Range (min  max):  1.733 μs   5.954 μs  ┊ GC (min  max): 0.00%  0.00%
 Time  (median):     1.750 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   1.754 μs ± 93.796 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

            ▅    █     █    ▅    ▃    ▂     ▁    ▂    ▁      ▂
  ▃▁▁▁▁▆▁▁▁▁█▁▁▁▁█▁▁▁▁▁█▁▁▁▁█▁▁▁▁█▁▁▁▁█▁▁▁▁▁█▁▁▁▁█▁▁▁▁█▁▁▁▁█ █
  1.73 μs      Histogram: log(frequency) by time     1.78 μs <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark vmean($A)
BenchmarkTools.Trial: 10000 samples with 169 evaluations.
 Range (min  max):  636.834 ns  887.331 ns  ┊ GC (min  max): 0.00%  0.00%
 Time  (median):     638.562 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   639.624 ns ±   9.350 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

    ▄▂█▂
  ▂▃████▄▅▃▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▁▂▂▂▁▂▂▂▂▂▂▂▂▂▁▂▂▁▂▁▂▂▂▁▂ ▂
  637 ns           Histogram: frequency by time          662 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark std($A)
BenchmarkTools.Trial: 10000 samples with 7 evaluations.
 Range (min  max):  4.179 μs   24.470 μs  ┊ GC (min  max): 0.00%  0.00%
 Time  (median):     4.202 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   4.219 μs ± 275.224 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▂█▇▆▂▃▂▂                                                    ▂
  ████████▇▁▄▄▄▃▄▄▃▃▃▃▁▄▄▅▃▃▁▃▁▃▃▃▃▃▃▁▁▃▃▁▃▁▃▁▁▃▄▃▃▁▁▁▁▄▁▃▁▁▄ █
  4.18 μs      Histogram: log(frequency) by time      4.73 μs <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark vstd($A)
BenchmarkTools.Trial: 10000 samples with 10 evaluations.
 Range (min  max):  1.421 μs   4.858 μs  ┊ GC (min  max): 0.00%  0.00%
 Time  (median):     1.475 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   1.466 μs ± 94.269 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

    ▄ ▇ ▇ ▄  ▃ ▂ ▁          ▅  █ █ ▅ ▄ ▃  ▂ ▁                ▂
  ▆▁█▁█▁█▁█▁▁█▁█▁█▁█▁▁▅▁▆▁█▁█▁▁█▁█▁█▁█▁█▁▁█▁█▁█▁█▁▁█▁█▁▆▁▆▁▃ █
  1.42 μs      Histogram: log(frequency) by time     1.53 μs <

 Memory estimate: 0 bytes, allocs estimate: 0.
Sorting-based functions
julia> A = rand(10_000);

julia> sort(A) == vsort!(A)
true

julia> median(A) == vmedian!(A)
true

TODO

  • Median and percentile could be made more efficient with better SIMD sorting
  • Other various summary statistics (mad, aad, etc.?)
  • multithreaded vminimum, vmaximum, vextrema