rocFFT 1.0.12 for ROCm 4.3.0

saadrahim released this 30 Jul 22:53
b93c40c

Changed

  • Re-split device code into single-precision, double-precision, and
    miscellaneous kernels.

Fixed

  • Fixed potential crashes in double-precision planar->planar transpose.

Added

  • Added new kernel generator for select lengths. New kernels have
    improved performance.
  • Added public rocfft_execution_info_set_load_callback and
    rocfft_execution_info_set_store_callback API functions to allow
    executing extra logic when loading/storing data from/to global
    memory during a transform.
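
The load/store callback idea can be illustrated with a small host-side model. This is only a sketch under stated assumptions, not rocFFT code: the real callbacks are device functions registered through the two API functions named above, whereas everything here runs on the CPU. All names in it (load_cb_t, store_cb_t, scale_params, scale_on_load, conj_on_store, dft_with_callbacks, run_example) are hypothetical and exist only for illustration. A naive DFT stands in for the transform, reading each input element through a load callback and writing each output element through a store callback.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

using cplx = std::complex<float>;

// Hypothetical host-side callback types (assumptions, not rocFFT types).
// A load callback returns the element to feed into the transform;
// a store callback decides what is written to the output buffer.
using load_cb_t  = cplx (*)(const cplx* data, std::size_t offset, void* cbdata);
using store_cb_t = void (*)(cplx* data, std::size_t offset, cplx element, void* cbdata);

// User data passed through the opaque cbdata pointer.
struct scale_params { float factor; };

// Extra logic on load: scale every input element.
cplx scale_on_load(const cplx* data, std::size_t offset, void* cbdata) {
    const auto* p = static_cast<const scale_params*>(cbdata);
    return data[offset] * p->factor;
}

// Extra logic on store: write the complex conjugate of each result.
void conj_on_store(cplx* data, std::size_t offset, cplx element, void* /*cbdata*/) {
    data[offset] = std::conj(element);
}

// Naive O(n^2) forward DFT standing in for the real transform.
std::vector<cplx> dft_with_callbacks(const std::vector<cplx>& in,
                                     load_cb_t load, void* load_data,
                                     store_cb_t store, void* store_data) {
    const std::size_t n = in.size();
    const float two_pi = 6.28318530717958647692f;

    // Load phase: each input element passes through the load callback once.
    std::vector<cplx> staged(n);
    for (std::size_t j = 0; j < n; ++j) {
        staged[j] = load(in.data(), j, load_data);
    }

    // Transform phase, with each result written via the store callback.
    std::vector<cplx> out(n);
    for (std::size_t k = 0; k < n; ++k) {
        cplx acc(0.0f, 0.0f);
        for (std::size_t j = 0; j < n; ++j) {
            const float angle = -two_pi * static_cast<float>(k * j) / static_cast<float>(n);
            acc += staged[j] * std::polar(1.0f, angle);
        }
        store(out.data(), k, acc, store_data);
    }
    return out;
}

// Example: unit impulse at index 1, scaled by 2 on load, conjugated on store.
std::vector<cplx> run_example() {
    std::vector<cplx> in = {cplx(0.0f, 0.0f), cplx(1.0f, 0.0f),
                            cplx(0.0f, 0.0f), cplx(0.0f, 0.0f)};
    scale_params params{2.0f};
    return dft_with_callbacks(in, scale_on_load, &params, conj_on_store, nullptr);
}
```

Doing the scaling and conjugation inside the callbacks avoids separate passes over the data before and after the transform, which is presumably the kind of extra logic the new API functions are meant to enable.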

Removed

  • Removed R2C pair schemes and kernels.

Optimizations

  • Optimized 2D/3D R2C transforms of length 100 and 1D Z2Z transforms
    of length 2500.
  • Reduced the number of kernels for 2D/3D sizes where the higher
    dimension is 64, 128, or 256.

Fixed

  • Fixed potential crashes in 3D transforms with unusual strides, for
    SBCC-optimized sizes.