SIMDe 0.8.2
Summary
- Start of RISCV64 optimized implementation using the RVV1.0 vector extension! Thank you @eric900115 @howjmay @zengdage
- 62 of the ARM NEON intrinsics added in SIMDe 0.8.0 (from the FCVTZS/FCVTMS/FCVTPS/FCVTNS families) had to be removed because they did not exactly match the specs and real hardware. This brings us down from 100% coverage of the NEON functions to 99.07%.
For the entire project: 126 files changed, 5522 insertions(+), 2772 deletions(-)
For just the simde folder: 89 files changed, 4330 insertions(+), 2199 deletions(-)
Details
Implementation of Arm intrinsics
NEON
- arm neon: disable some FCVTZS/FCVTMS/FCVTPS/FCVTNS family intrinsics 339ffe4 @mr-c
- arm neon sm3: check constant range 3d34fcd @mr-c
- arm 32 bits: native def fixes; workarounds for gcc 22900e6 @Cuda-Chen
- x86 implementations: allow _m128 access from SSE 114c3cd @mr-c
WASM intrinsics
x86 intrinsics
SVML
XOP
Arch support
arm / arm64
- arm platform: cleanup feature detection. 08c21f3 @mr-c
- arm: enable more intrinsic function for armv7 416091e @zengdage
RISCV64
- Initial Support for the RISC-V Vector Extension (RVV1.0) in ARM NEON (#1130) b4e805a @eric900115
- arm: fix some neon2rvv intrinsic function error 2a548e5 @zengdage
- arm: Add neon2rvv support in vand series intrinsics dac67f3 @howjmay
- arm: improve performance in vabd_xxx for risc-v b63ba04 @zengdage
- arm: improve performance in vhadd_xxx for risc-v a68fa90 @zengdage
Compiler Specific
Clang
- detect clang versions 18 & 19 ed4a5cd @mr-c
- arm neon clang: skip vrnd native before clang v18 e647f10 @mr-c
- apple clang arm64: ignore SHA2 be48ef8 @mr-c
Emscripten
MSVC
- x86 test msvc: really disable warning 4799,4730 487507d @mr-c
- sse2 MSVC _mm_pause implementation for x86 8d95f83 @mr-c
- SSE is good enough for native m128i and m128d types & functions 9982b27 @mr-c
Testing with Docker/Podman & CI
Cirrus CI
GitHub Actions
- test Mac arm64 0080b28 @mr-c
- macos: report log if there is a configuration failure. df3e930 @mr-c
- build(deps): bump actions/checkout from 3 to 4 (#1149) 9605608 @dependabot[bot]
- build(deps): bump codecov/codecov-action from 3 to 4 25382c1 @dependabot[bot]
- codecov: use token 2c45dd4 @mr-c
- Add gcc arm 32bit armv8-a test in CI 72bde75 @Cuda-Chen
- build for AMD Bulldozer version 2 9746537 @mr-c
Packit CI
Semaphore CI
Misc
- update list of fully implemented instruction sets (#1152) b568fcd @mr-c
- typo fixes from codespell 8639fef @mr-c
- README.md - move CLMUL to partial, list more of the CI.yml architectures 285b50d @Torinde
- Update README.md - link to VPCLMULQDQ; mention MSA (#1157) 517da84 @Torinde
- Update README.md (#1156) b88a66d @mr-c
- README: two more related projects 7429dff @mr-c
New Contributors
- @eric900115 made their first contribution in #1130
- @Cuda-Chen made their first contribution in #1116
- @Torinde made their first contribution in #1157
- @zengdage made their first contribution in #1172
- @howjmay made their first contribution in #1174
Full Changelog: v0.8.0...v0.8.2