10 Feb 10:34

felipeZ

Version 0.4.0 (pre-release)

Offload Eigen3 matrix-matrix multiplication to an NVIDIA GPU using cuBLAS.

Changed

Split the memory management (CudaMatrix) from the cuBLAS invocation (CudaPipeline)
Moved all allocations into the smart pointers inside CudaMatrix
Removed unused headers

26 Sep 09:24

felipeZ

Introduced smart pointers (pre-release)

[0.3.0] 26/09/2019

Added

Smart pointers to manage CUDA resources
New CudaMatrix class
Check the available GPU memory before computing
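
The smart-pointer and memory-check items above can be illustrated with a minimal sketch. It is not the library's CudaMatrix: `DeviceBuffer`, `device_alloc`, `device_free`, `fits_in_memory` and the `live_allocations` counter are hypothetical names, and the allocator is a host-side stand-in so the example builds without the CUDA toolkit. In a real build the two stand-ins would presumably wrap `cudaMalloc` and `cudaFree`, and the free-byte count could come from `cudaMemGetInfo`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <memory>

// Counts outstanding buffers so the release behaviour can be observed.
int live_allocations = 0;

// Hypothetical stand-in for cudaMalloc (host memory, for illustration only).
double* device_alloc(std::size_t count) {
  ++live_allocations;
  return static_cast<double*>(std::malloc(count * sizeof(double)));
}

// Hypothetical stand-in for cudaFree.
void device_free(double* ptr) {
  --live_allocations;
  std::free(ptr);
}

// Deleter that std::unique_ptr calls when the owning object is destroyed.
struct DeviceDeleter {
  void operator()(double* ptr) const noexcept { device_free(ptr); }
};

using DevicePtr = std::unique_ptr<double, DeviceDeleter>;

// Owns one buffer; the memory is released automatically at end of scope.
class DeviceBuffer {
 public:
  explicit DeviceBuffer(std::size_t count)
      : size_(count), data_(device_alloc(count)) {}
  double* data() const { return data_.get(); }
  std::size_t size() const { return size_; }

 private:
  std::size_t size_;
  DevicePtr data_;
};

// True when `required_bytes` fits into `free_bytes`. The caller supplies the
// free amount; on a GPU it could be queried with cudaMemGetInfo.
bool fits_in_memory(std::size_t required_bytes, std::size_t free_bytes) {
  return required_bytes <= free_bytes;
}
```

Because the deleter is part of the pointer type, no explicit free call is needed on any exit path, including early returns and exceptions.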

Removed

Template class; the implementation is now only available for double
Triple tensor product
Shapes struct

29 Aug 12:16

felipeZ

Tensor matrix multiplication (pre-release)

[0.2.0] 27/08/2019

Added

Tensor matrix multiplication using the cuBLAS gemmBatched routine
Asynchronous calls for memory copies
Properly free memory after the tensor operation is done
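
As a rough CPU reference for what a batched product computes, the sketch below multiplies every matrix in a stack by one shared matrix. The release note does not say how the tensor is laid out or which operand is batched, so this reading is an assumption, and `Matrix`, `multiply` and `tensor_times_matrix` are hypothetical names rather than the library's API. Storage is column-major, the layout cuBLAS uses.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Dense column-major matrix: element (r, c) lives at data[c * rows + r].
struct Matrix {
  std::size_t rows;
  std::size_t cols;
  std::vector<double> data;

  Matrix(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c, 0.0) {}

  double& at(std::size_t r, std::size_t c) { return data[c * rows + r]; }
  double at(std::size_t r, std::size_t c) const { return data[c * rows + r]; }
};

// Plain C = A * B for a single pair of matrices.
Matrix multiply(const Matrix& a, const Matrix& b) {
  assert(a.cols == b.rows);
  Matrix c(a.rows, b.cols);
  for (std::size_t j = 0; j < b.cols; ++j)
    for (std::size_t k = 0; k < a.cols; ++k)
      for (std::size_t i = 0; i < a.rows; ++i)
        c.at(i, j) += a.at(i, k) * b.at(k, j);
  return c;
}

// Assumed semantics: result[n] = tensor[n] * matrix for every slice n.
// A batched GPU routine would perform these independent products in one call.
std::vector<Matrix> tensor_times_matrix(const std::vector<Matrix>& tensor,
                                        const Matrix& matrix) {
  std::vector<Matrix> result;
  result.reserve(tensor.size());
  for (const Matrix& slice : tensor) result.push_back(multiply(slice, matrix));
  return result;
}
```

A reference like this is mainly useful for checking GPU results on small inputs, since each slice product is independent of the others.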
