updated changelog
felipeZ committed Aug 29, 2019
1 parent 7496e47 commit b12fdda
Showing 1 changed file with 9 additions and 3 deletions.
12 changes: 9 additions & 3 deletions CHANGELOG.md
@@ -1,7 +1,13 @@
# Change log

# 0.1.0
# [0.2.0] 27/08/2019
### Added
- Tensor matrix multiplication using [gemmbatched](https://docs.nvidia.com/cuda/cublas/index.html#cublas-lt-t-gt-gemmbatched)
- [Async calls](https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__MEMORY.html#group__CUDART__MEMORY_1g85073372f776b4c4d5f89f7124b7bf79) to memory copies.
- Properly free memory after the tensor operation is done.

# [0.1.0]

### New
* Use a template function to perform matrix-matrix multiplication using [cublas](https://docs.nvidia.com/cuda/cublas/index.html).
* Use either *pinned* (**default**) or *pageable* memory, see [cuda optimizations](https://devblogs.nvidia.com/how-optimize-data-transfers-cuda-cc/).
- Use a template function to perform matrix-matrix multiplication using [cublas](https://docs.nvidia.com/cuda/cublas/index.html).
- Use either *pinned* (**default**) or *pageable* memory, see [cuda optimizations](https://devblogs.nvidia.com/how-optimize-data-transfers-cuda-cc/).
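
For readers unfamiliar with the batched GEMM mentioned in the 0.2.0 entry, the following is a minimal CPU sketch of what such a call computes for each matrix in the batch: `C[i] = alpha * A[i] * B[i] + beta * C[i]`, with column-major storage as cuBLAS uses. The function name `gemm_batched_reference` and its signature are hypothetical and are not taken from this repository; the real cuBLAS routine works on arrays of device pointers and also accepts transposition flags and leading dimensions, which this sketch leaves out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference (not this project's code) for a batched GEMM:
//   C[i] = alpha * A[i] * B[i] + beta * C[i]   for every matrix i in the batch.
// All matrices are stored column-major: A[i] is m x k, B[i] is k x n, C[i] is m x n.
template <typename T>
void gemm_batched_reference(std::size_t m, std::size_t n, std::size_t k,
                            T alpha,
                            const std::vector<std::vector<T>>& A,
                            const std::vector<std::vector<T>>& B,
                            T beta,
                            std::vector<std::vector<T>>& C) {
  assert(A.size() == B.size() && B.size() == C.size());
  for (std::size_t batch = 0; batch < A.size(); ++batch) {
    for (std::size_t col = 0; col < n; ++col) {
      for (std::size_t row = 0; row < m; ++row) {
        // Dot product of row `row` of A[batch] with column `col` of B[batch].
        T acc = T(0);
        for (std::size_t p = 0; p < k; ++p) {
          acc += A[batch][row + p * m] * B[batch][p + col * k];
        }
        const std::size_t idx = row + col * m;
        C[batch][idx] = alpha * acc + beta * C[batch][idx];
      }
    }
  }
}
```

Calling it with `alpha = 1` and `beta = 0` on a batch of equally sized matrices yields the plain products `A[i] * B[i]`, which makes it a convenient reference for checking GPU results on small inputs.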
