Use autodiff in screening routines #1588
Force-pushed from 039dfe0 to c46c04d.
Performance comparisons, using:

- CPU, without derivatives (n_cell=128, loops=1)
- CUDA, with derivatives (CUDA_LTO=TRUE, n_cell=64, loops=100)
- CUDA, without derivatives (CUDA_LTO=TRUE, n_cell=64, loops=100)
Force-pushed from cd77340 to 0f83ed6.
Needs #1593
Only unit_test/test_screening_templated compiles at the moment.
They should only be needed in `RHS::jac()`.
The extra precision of std::log1p only matters at very small gamma, and will be completely overshadowed by the imprecision of fast_atan().
Force-pushed from 7f43311 to 3758386.
All the Jacobian diffs in the test suite are at roundoff level, so this seems to be working well.
This PR gives the same results for test_screening to within roundoff, and the performance is about the same or slightly better. It also adds a page to the docs about how to use the autodiff library.

One notable change is that the templated networks don't calculate the derivative terms when screening is called from RHS::rhs() (this is also how the pynucastro networks behave). Previously, they would be calculated unnecessarily if integrator.jacobian was set to 1.