Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New kernels for arctan #633

Merged
merged 7 commits into from
Oct 3, 2023
Merged

New kernels for arctan #633

merged 7 commits into from
Oct 3, 2023

Conversation

Ka-zam
Copy link
Contributor

@Ka-zam Ka-zam commented Sep 21, 2023

New arctan kernels with improved accuracy and 2.5x speedup.

Prior:

RUN_VOLK_TESTS: volk_32f_atan_32f(131071,1987)
a_avx2_fma completed in 184.81 ms
a_avx completed in 200.609 ms
a_sse4_1 completed in 364.564 ms
u_avx2_fma completed in 182.698 ms
u_avx completed in 182.903 ms
u_sse4_1 completed in 370.45 ms
generic completed in 2222.25 ms
Best aligned arch: u_avx2_fma
Best unaligned arch: u_avx2_fma

With this PR:

RUN_VOLK_TESTS: volk_32f_atan_32f(131071,1987)
a_avx2_fma completed in 76.8184 ms
a_avx2 completed in 77.7383 ms
a_sse4_1 completed in 153.017 ms
u_avx2_fma completed in 78.2237 ms
u_avx2 completed in 78.3921 ms
u_sse4_1 completed in 152.178 ms
generic completed in 2218.09 ms
Best aligned arch: a_avx2_fma
Best unaligned arch: u_avx2_fma

- Update cpu_features to v0.9.0
- Added BUILD_EXECUTABLE for cpu_features to build list_cpu_features for
  github workflow tests
- Added ENABLE_INSTALL to install cpu_feature for old behaviour
- Renamed CpuFeature -> CpuFeatures to match new name

Signed-off-by: Ashley Brighthope <ashley.b@reddegrees.com>
The else path can not be reached since the root CMake file aborts
if CpuFeatures_FOUND if false

Signed-off-by: Ashley Brighthope <ashley.b@reddegrees.com>
Signed-off-by: Ashley Brighthope <ashley.b@reddegrees.com>
@Ka-zam Ka-zam force-pushed the arctan branch 2 times, most recently from 2fd219c to fa90d4c Compare September 21, 2023 16:12
This avoids issues in the static build since otherwise cmake will add cpu_features to the export set.

Signed-off-by: Andrej Rode <mail@andrejro.de>
@jdemel
Copy link
Contributor

jdemel commented Sep 25, 2023

Thanks for your contribution. One quick fix, the DCO check failed. Luckily it contains a tailored description on how to fix it in your case.

The other CI failures are unrelated. Let's see if they go through on the second try. Finally, the code looks good.

Copy link
Contributor

@jdemel jdemel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your PR! I added a few comments.

cpu_features Outdated Show resolved Hide resolved
include/volk/volk_avx2_intrinsics.h Outdated Show resolved Hide resolved
include/volk/volk_avx2_intrinsics.h Outdated Show resolved Hide resolved
include/volk/volk_avx_intrinsics.h Outdated Show resolved Hide resolved
include/volk/volk_sse_intrinsics.h Outdated Show resolved Hide resolved
kernels/volk/volk_32f_atan_32f.h Outdated Show resolved Hide resolved
kernels/volk/volk_32f_atan_32f.h Outdated Show resolved Hide resolved
kernels/volk/volk_32f_atan_32f.h Outdated Show resolved Hide resolved
kernels/volk/volk_32f_atan_32f.h Outdated Show resolved Hide resolved
kernels/volk/volk_32f_atan_32f.h Outdated Show resolved Hide resolved
cpu_features Outdated Show resolved Hide resolved
Signed-off-by: Magnus Lundmark <magnuslundmark@gmail.com>
Signed-off-by: Magnus Lundmark <magnuslundmark@gmail.com>
Signed-off-by: Magnus Lundmark <magnuslundmark@gmail.com>
@Ka-zam
Copy link
Contributor Author

Ka-zam commented Sep 28, 2023

I really think we're done now! Realized I wanted to rename some functions and also remove unused polynomials.

@Ka-zam Ka-zam requested a review from jdemel September 28, 2023 16:14
@Ka-zam
Copy link
Contributor Author

Ka-zam commented Oct 2, 2023

@jdemel Can we merge? I have a few more PRs that depend on this.

Copy link
Contributor

@jdemel jdemel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

I was worried about the cpu_features changes but they are part of the PRs that are already merged. Thanks for rebasing your work on it.

@jdemel jdemel merged commit a26a1b8 into gnuradio:main Oct 3, 2023
32 checks passed
Alesha72003 pushed a commit to Alesha72003/volk that referenced this pull request May 15, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants