Flush denormal float values in audio graph processing #377

Merged: orottier merged 3 commits into main from feature/no-denormals on Oct 20, 2023

orottier
Owner

Fixes #194

@orottier requested a review from b-ma on October 20, 2023, 09:19
@orottier
Owner Author

/bench

@github-actions

Benchmark result:


bench_ctor
  Instructions:             4795471 (+4.920863%)
  L1 Accesses:              7222862 (+5.532242%)
  L2 Accesses:                54322 (-0.009204%)
  RAM Accesses:               61618 (+0.009738%)
  Estimated Cycles:         9651102 (+4.085555%)

bench_sine
  Instructions:            72035689 (+1.281078%)
  L1 Accesses:            105374191 (+1.483211%)
  L2 Accesses:               269512 (+0.391865%)
  RAM Accesses:               62494 (+0.014403%)
  Estimated Cycles:       108909041 (+1.439647%)

bench_sine_gain
  Instructions:            77632415 (+1.804653%)
  L1 Accesses:            113919276 (+2.076567%)
  L2 Accesses:               279762 (+1.319359%)
  RAM Accesses:               62598 (+0.017576%)
  Estimated Cycles:       117509016 (+2.028329%)

bench_sine_gain_delay
  Instructions:           152634344 (+1.384911%)
  L1 Accesses:            215212721 (+1.645290%)
  L2 Accesses:               514532 (+3.810381%)
  RAM Accesses:               64194 (+0.014022%)
  Estimated Cycles:       220032171 (+1.653149%)

bench_buffer_src
  Instructions:            18593353 (+5.151530%)
  L1 Accesses:             27291305 (+5.983539%)
  L2 Accesses:                86261 (+0.100959%)
  RAM Accesses:              100765 (+0.004962%)
  Estimated Cycles:        31249385 (+5.188511%)

bench_buffer_src_delay
  Instructions:            92269712 (+1.786648%)
  L1 Accesses:            126727894 (+2.195204%)
  L2 Accesses:               167390 (-0.072234%)
  RAM Accesses:              100902 (+0.003964%)
  Estimated Cycles:       131096414 (+2.120132%)

bench_buffer_src_iir
  Instructions:            43021603 (+2.757972%)
  L1 Accesses:             62937434 (+3.182580%)
  L2 Accesses:                87678 (+1.019667%)
  RAM Accesses:              100853 (+0.004958%)
  Estimated Cycles:        66905679 (+2.995470%)

bench_buffer_src_biquad
  Instructions:            39985904 (+5.375069%)
  L1 Accesses:             56903975 (+6.464470%)
  L2 Accesses:               132545 (+1.458206%)
  RAM Accesses:              100983 (+0.005942%)
  Estimated Cycles:        61101105 (+6.011702%)

bench_stereo_positional
  Instructions:            50518528 (+10.29256%)
  L1 Accesses:             76735585 (+11.76986%)
  L2 Accesses:               299341 (-10.52081%)
  RAM Accesses:              101052 (+0.000990%)
  Estimated Cycles:        81769110 (+10.70157%)

bench_stereo_panning_automation
  Instructions:            33727523 (+4.252929%)
  L1 Accesses:             50737537 (+4.786690%)
  L2 Accesses:               139418 (+2.301862%)
  RAM Accesses:              100878 (+0.008922%)
  Estimated Cycles:        54965357 (+4.434037%)

bench_analyser_node
  Instructions:            41000478 (+2.897871%)
  L1 Accesses:             57793892 (+3.476568%)
  L2 Accesses:               180274 (+0.221264%)
  RAM Accesses:              101422 (+0.004930%)
  Estimated Cycles:        62245032 (+3.223659%)

bench_hrtf_panners
  Instructions:          1790148506 (+0.381517%)
  L1 Accesses:           2571273702 (+0.440479%)
  L2 Accesses:             26765303 (+0.483366%)
  RAM Accesses:              175432 (-0.027923%)
  Estimated Cycles:      2711240337 (+0.441529%)


by setting it on the Graph processing level instead of AudioNode
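
The commit note above moves the denormal handling from each AudioNode to the graph processing level. The sketch below illustrates why that could reduce overhead, presumably the reason the second benchmark run shows a smaller regression. Every name in it (`with_flush_to_zero`, `render_per_node`, `render_graph_level`, the counter) is hypothetical and not taken from this crate; the "mode switch" is only simulated with a counter, whereas a real implementation would save, set, and restore a CPU floating-point control register.

```rust
use std::cell::Cell;

thread_local! {
    // Counts how often the simulated flush-to-zero mode switch ran on this thread.
    static MODE_SWITCHES: Cell<u32> = Cell::new(0);
}

/// Hypothetical stand-in for "enable flush-to-zero, run `f`, restore the mode".
/// This sketch only counts invocations; it does not touch any CPU state.
fn with_flush_to_zero<T>(f: impl FnOnce() -> T) -> T {
    MODE_SWITCHES.with(|c| c.set(c.get() + 1));
    f()
}

/// Toy "node": applies a gain to the buffer in place.
fn process_node(buf: &mut [f32], gain: f32) {
    for s in buf.iter_mut() {
        *s *= gain;
    }
}

/// Node-level variant: one mode switch per node, per render call.
fn render_per_node(gains: &[f32], buf: &mut [f32]) {
    for &g in gains {
        with_flush_to_zero(|| process_node(buf, g));
    }
}

/// Graph-level variant: a single mode switch wraps the whole render call.
fn render_graph_level(gains: &[f32], buf: &mut [f32]) {
    with_flush_to_zero(|| {
        for &g in gains {
            process_node(buf, g);
        }
    });
}

/// Read the simulated switch counter.
fn switches() -> u32 {
    MODE_SWITCHES.with(|c| c.get())
}

/// Reset the simulated switch counter to zero.
fn reset_switches() {
    MODE_SWITCHES.with(|c| c.set(0));
}
```

With three toy nodes, the node-level variant performs three simulated switches per render call and the graph-level variant performs one, while both produce the same samples.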
@orottier
Owner Author

/bench

@github-actions

Benchmark result:


bench_ctor
  Instructions:             4638020 (+1.476856%)
  L1 Accesses:              6971637 (+1.862730%)
  L2 Accesses:                54335 (+0.007362%)
  RAM Accesses:               61620 (+0.012984%)
  Estimated Cycles:         9400012 (+1.378181%)

bench_sine
  Instructions:            71191954 (+0.094904%)
  L1 Accesses:            103943412 (+0.104832%)
  L2 Accesses:               286511 (+6.956577%)
  RAM Accesses:               62500 (+0.020804%)
  Estimated Cycles:       107563467 (+0.188599%)

bench_sine_gain
  Instructions:            76323650 (+0.082542%)
  L1 Accesses:            111708318 (+0.090542%)
  L2 Accesses:               296890 (+7.321534%)
  RAM Accesses:               62600 (-0.183369%)
  Estimated Cycles:       115383768 (+0.172154%)

bench_sine_gain_delay
  Instructions:           150577074 (+0.044848%)
  L1 Accesses:            211761349 (+0.046129%)
  L2 Accesses:               523127 (+6.051797%)
  RAM Accesses:               64323 (+0.015549%)
  Estimated Cycles:       216628289 (+0.114265%)

bench_buffer_src
  Instructions:            17749881 (+0.402506%)
  L1 Accesses:             25876955 (+0.509950%)
  L2 Accesses:                87106 (+1.230723%)
  RAM Accesses:              100771 (+0.125193%)
  Estimated Cycles:        29839470 (+0.474755%)

bench_buffer_src_delay
  Instructions:            90669100 (+0.074489%)
  L1 Accesses:            124049294 (+0.099005%)
  L2 Accesses:               171810 (+2.863609%)
  RAM Accesses:              100788 (+0.010915%)
  Estimated Cycles:       128435924 (+0.114580%)

bench_buffer_src_iir
  Instructions:            41930873 (+0.161201%)
  L1 Accesses:             61118185 (+0.207437%)
  L2 Accesses:                87598 (+1.096390%)
  RAM Accesses:              100738 (+0.011914%)
  Estimated Cycles:        65082005 (+0.202753%)

bench_buffer_src_biquad
  Instructions:            38009922 (+0.168223%)
  L1 Accesses:             53564104 (+0.216911%)
  L2 Accesses:               137670 (+5.116478%)
  RAM Accesses:              100862 (-0.116855%)
  Estimated Cycles:        57782624 (+0.252111%)

bench_stereo_positional
  Instructions:            45867401 (+0.147363%)
  L1 Accesses:             68793976 (+0.203981%)
  L2 Accesses:               317257 (-3.806131%)
  RAM Accesses:              100936 (+0.004954%)
  Estimated Cycles:        73913021 (+0.104897%)

bench_stereo_panning_automation
  Instructions:            32419017 (+0.220015%)
  L1 Accesses:             48541750 (+0.262203%)
  L2 Accesses:               141646 (+3.970287%)
  RAM Accesses:              100887 (+0.132999%)
  Estimated Cycles:        52781025 (+0.301545%)

bench_analyser_node
  Instructions:            39909714 (+0.160192%)
  L1 Accesses:             55974760 (+0.218652%)
  L2 Accesses:               180056 (+0.282374%)
  RAM Accesses:              101301 (-0.116349%)
  Estimated Cycles:        60420575 (+0.199880%)

bench_hrtf_panners
  Instructions:          1783410637 (+0.003749%)
  L1 Accesses:           2560057839 (+0.000536%)
  L2 Accesses:             26701193 (+0.424253%)
  RAM Accesses:              175467 (+0.031925%)
  Estimated Cycles:      2699705149 (+0.021477%)


@orottier
Owner Author

Okay, with the final version I think the performance regression is acceptable.

@b-ma
Collaborator

b-ma commented Oct 20, 2023

Cool, looks nice!
I think there are still some places in the code (e.g. AudioParam) where we check this manually; I will try to have a look soon.

(for info, I will be completely off next week)
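
For context on what "checking this manually" can look like, here is a minimal sketch of a software-side flush. It is an assumption for illustration, not the code in AudioParam or anywhere else in this crate: `flush_denormals` and `decay` are hypothetical names, and the only library call used is the standard `f32::is_subnormal`.

```rust
/// Replace every subnormal (denormal) sample with zero, in place.
/// Hypothetical helper for illustration; not part of this crate's API.
pub fn flush_denormals(samples: &mut [f32]) {
    for s in samples.iter_mut() {
        if s.is_subnormal() {
            *s = 0.0;
        }
    }
}

/// Toy feedback loop y[n] = 0.5 * y[n-1], starting at 1.0.
/// The state is flushed after every step, so once it decays below the
/// smallest normal f32 it snaps to exactly zero instead of lingering
/// in the subnormal range.
pub fn decay(steps: usize) -> f32 {
    let mut state = [1.0_f32];
    for _ in 0..steps {
        state[0] *= 0.5;
        flush_denormals(&mut state);
    }
    state[0]
}
```

After 126 halvings the state equals `f32::MIN_POSITIVE` (still a normal number); one more step would make it subnormal, so the flush replaces it with zero. A per-sample check like this costs a branch on every value, which is one reason a mode set once per graph render may be preferable.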

@orottier merged commit 8f17ce9 into main on Oct 20, 2023
3 checks passed
@orottier deleted the feature/no-denormals branch on October 20, 2023, 16:42
Successfully merging this pull request may close these issues.

How to deal with subnormal floats?