v1.9.5

@klauspost released this 04 May 07:49
· 127 commits to master since this release
de70cc1
  • Non-assembly code is up to 40% faster.
  • AVX512 can use multiple goroutines for lower latency and higher individual throughput.
  • AVX512 code is 5-9% faster.
  • All code is faster with user-defined goroutines and high concurrency: up to 8x faster due to fewer cache evictions.
  • CPUID detects AMD CPUs with hyperthreading (multiple threads per core).
  • CPUID detects AMD per CCX L3 cache size.
  • Use L1 cache size to set minimum split size.
  • Tests/benchmarks can disable specific assembly types.
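
The idea behind a minimum split size can be illustrated with a small standard-library sketch. This is not the library's implementation: `assumedL1DataCache`, `splitRanges` and `xorParallel` are hypothetical names, the 32 KiB cache size is an assumed constant rather than a detected value, and using the L1 size directly as the minimum split is an assumption made for illustration. The sketch only shows why small inputs stay on one goroutine while large inputs are divided among several.

```go
package main

import "sync"

// assumedL1DataCache is a hypothetical L1 data cache size in bytes.
// A real program would detect it from the CPU; 32 KiB is only an assumption.
const assumedL1DataCache = 32 << 10

// splitRanges divides n bytes into at most maxWorkers contiguous [start, end)
// ranges. Every range except possibly the last holds at least minSplit bytes,
// so inputs smaller than 2*minSplit are never split.
func splitRanges(n, maxWorkers, minSplit int) [][2]int {
	if n <= 0 {
		return nil
	}
	if maxWorkers < 1 {
		maxWorkers = 1
	}
	if minSplit < 1 {
		minSplit = 1
	}
	workers := n / minSplit
	if workers < 1 {
		workers = 1
	}
	if workers > maxWorkers {
		workers = maxWorkers
	}
	per := (n + workers - 1) / workers // ceiling division
	ranges := make([][2]int, 0, workers)
	for start := 0; start < n; start += per {
		end := start + per
		if end > n {
			end = n
		}
		ranges = append(ranges, [2]int{start, end})
	}
	return ranges
}

// xorParallel XORs src into dst, one goroutine per range. XOR stands in
// for the real per-byte work; the ranges do not overlap, so no locking
// is needed.
func xorParallel(dst, src []byte, maxWorkers int) {
	n := len(dst)
	if len(src) < n {
		n = len(src)
	}
	var wg sync.WaitGroup
	for _, r := range splitRanges(n, maxWorkers, assumedL1DataCache) {
		wg.Add(1)
		go func(lo, hi int) {
			defer wg.Done()
			for i := lo; i < hi; i++ {
				dst[i] ^= src[i]
			}
		}(r[0], r[1])
	}
	wg.Wait()
}
```

Calling `xorParallel(dst, src, 8)` on a 1000-byte buffer runs on a single goroutine in this sketch, while a buffer of several megabytes is divided into eight ranges. Keeping each range at or above the cache size avoids paying goroutine overhead for work that already fits in cache.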