Releases: cyclops-community/ctf
Cyclops v1.5.5
Release includes new functionality:
- search for best tree contraction ordering
- symmetric eigensolve interface to ScaLAPACK
- new python-level benchmarks for TTM/TTTP/MTTKRP
but mostly numerous bug fixes:
- extensive corrections to CCSR (hypersparse formats)
- important corrections to mapping logic for expensive large-scale tensor contractions
- adjustments to performance models, fix to read in CTF_MODEL_FILE
- use of mallinfo to determine the actual amount of memory in use overall (turned off with -DNOMALLINFO)
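To illustrate what searching for a tree contraction ordering means, here is a minimal sketch for the special case of a chain of matrices. It uses the textbook matrix-chain dynamic program with a simple multiply count as the cost. This is not CTF's search, which handles general tensor contractions and relies on its own performance models. The function name best_chain_order and the dims convention are hypothetical, chosen only for this example.

```python
from functools import lru_cache


def best_chain_order(dims):
    """Find the cheapest parenthesization of a matrix chain.

    Matrix i has shape dims[i] x dims[i + 1] (illustrative convention).
    Returns (scalar multiplications, tree written as nested parentheses).
    """
    n = len(dims) - 1

    @lru_cache(maxsize=None)
    def solve(i, j):
        # Best cost and tree for the product of matrices i..j inclusive.
        if i == j:
            return 0, str(i)
        best = None
        for k in range(i, j):
            left_cost, left_tree = solve(i, k)
            right_cost, right_tree = solve(k + 1, j)
            # Cost of multiplying the two partial products together.
            cost = left_cost + right_cost + dims[i] * dims[k + 1] * dims[j + 1]
            if best is None or cost < best[0]:
                best = (cost, "(" + left_tree + " " + right_tree + ")")
        return best

    return solve(0, n - 1)


if __name__ == "__main__":
    # Three matrices: 10x30, 30x5, 5x60.
    print(best_chain_order((10, 30, 5, 60)))
```

Unlike a linear ordering, which always multiplies the running result by the next operand, the search above may pair up any two adjacent partial products, so the result is a binary tree.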
Cyclops v1.5.4
Release includes bug fixes and a number of major functionality extensions:
- CCSR layout (CSR with compression of row counts to nonzero rows) is now supported for sparse=sparse*dense contractions
- TTTP (tensor times tensor product) is now supported, which enables efficient handling of contractions like V["ijk"] = U["ijk"]*A["jr"]*B["kr"], to arbitrary order with operands like A and B for any selection of modes in U
- Cholesky and triangular solve are now directly interfaced to ScaLAPACK, also available on python layer
- reshape functionality is now available at C++ level
- sparse tensor I/O is now available
- a randomized SVD routine has been added to C++ and Python level
- option to threshold singular values has been added to regular SVD
- SVD can now be called explicitly on tensors, with modes selected for the left and right singular vectors via strings
- a light-weight contraction execution time model has been added
- contraction of a sequence of tensors is now done in the best linear ordering for up to 8 tensors (this will be improved in the near future to the best tree ordering)
- depending on the predicted sparsity of the output, sparse intermediates are now automatically defined and used within a sequence of contractions, including for sparse*dense contractions where the sparse tensor has fewer nonzeros than rows in the corresponding matrix unfolding
- various bug fixes have been made to the python interface; in particular, sparsity is now maintained and scaling is handled more properly
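The TTTP operation mentioned above, V["ijk"] = U["ijk"]*A["jr"]*B["kr"], can be sketched in plain Python to show its semantics. This is an illustration only, not the CTF implementation or its API: the function name tttp and the dictionary-of-nonzeros storage for U are assumptions made for the example. The sketch shows why the operation suits sparse U: work is done only at the stored entries of U, and the dense product of A and B is never formed.

```python
def tttp(u_nonzeros, a, b):
    """Compute V[i,j,k] = U[i,j,k] * sum_r A[j][r] * B[k][r].

    u_nonzeros maps (i, j, k) to the stored value of U (hypothetical layout).
    a and b are dense factor matrices stored as lists of rows.
    """
    rank = len(a[0])
    v = {}
    for (i, j, k), u_val in u_nonzeros.items():
        # Inner product over the shared index r of row j of A and row k of B.
        inner = sum(a[j][r] * b[k][r] for r in range(rank))
        v[(i, j, k)] = u_val * inner
    return v


if __name__ == "__main__":
    u = {(0, 0, 1): 2.0, (1, 1, 0): 3.0}
    a = [[1.0, 2.0], [3.0, 4.0]]
    b = [[5.0, 6.0], [7.0, 8.0]]
    print(tttp(u, a, b))
```

Here mode i of U has no factor matrix, matching the note that operands may be given for any selection of modes.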
Cyclops v1.5.3
Release focuses on bug fixes and minor functionality extensions:
- better integration and support of ScaLAPACK routines
- bug fixes to build systems
- bug fixes to int64_t overflows for very large processor counts
- bug fixes to Hadamard products for sparse tensors
Cyclops v1.5.2
This release introduces new functionality and interface changes. The changes are consequential to performance, especially for sparse tensors and non-standard element types.
- data is now stored, accepted, and returned as an array of objects (i.e. created by new rather than malloc); read_local remains as before (memory should be released by free), but new routines have been introduced: get_local_data and get_local_pairs, which return data that should be released with delete
- as a consequence of (1), more general object types are now supported for sequential execution
- via (2) it is now possible to leverage block sparsity, by defining a sparse tensor on MPI_COMM_SELF whose elements are dense tensors on MPI_COMM_WORLD (or any desired communicator), which yields bulk-synchronous execution of each block contraction and a memory-efficient block-sparse layout (see examples/block_sparse.cxx)
- QR and SVD are now available in C++ and Python, supporting real/complex in single/double precision
- the build system has been improved to fix bugs and improve robustness
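The block-sparse idea described above (a sparse container whose elements are themselves dense tensors) can be illustrated with a small stand-alone sketch. This does not use CTF or MPI. It assumes a hypothetical layout in which a dictionary maps block coordinates to small dense blocks, and all function names are made up for the example; examples/block_sparse.cxx remains the reference for how this is done in the library.

```python
def matmul(x, y):
    """Dense product of two blocks stored as lists of rows."""
    return [[sum(xv * yv for xv, yv in zip(row, col)) for col in zip(*y)]
            for row in x]


def add(x, y):
    """Elementwise sum of two blocks of equal shape."""
    return [[xv + yv for xv, yv in zip(rx, ry)] for rx, ry in zip(x, y)]


def block_sparse_matmul(a_blocks, b_blocks):
    """C[I,J] = sum_K A[I,K] * B[K,J], visiting only stored blocks."""
    c_blocks = {}
    for (i, k), a_blk in a_blocks.items():
        for (k2, j), b_blk in b_blocks.items():
            if k != k2:
                continue  # block indices do not match, nothing to contract
            prod = matmul(a_blk, b_blk)
            if (i, j) in c_blocks:
                c_blocks[(i, j)] = add(c_blocks[(i, j)], prod)
            else:
                c_blocks[(i, j)] = prod
    return c_blocks


if __name__ == "__main__":
    a = {(0, 0): [[1, 2], [3, 4]], (1, 1): [[1, 0], [0, 1]]}
    b = {(0, 1): [[1, 0], [0, 1]], (1, 1): [[5, 6], [7, 8]]}
    print(block_sparse_matmul(a, b))
```

Each call to matmul corresponds to one dense block contraction. Blocks that are absent from the dictionaries cost neither memory nor work.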
Cyclops v1.5.1
Bug-fixing and functionality expansion release for Cyclops v1.5. Major fixes include the following:
- slice and permute corrected at C++ level for overwriting existing data with zeros as appropriate
- __setitem__ and __getitem__ now work correctly, consistently with numpy, for a number of cases and permit strided slices
- type conversions automatically handled in python
- test suite for python has been significantly extended
- print function corrected for writing data to file as opposed to stdout
- symmetry handled correctly at C++ and python level for slicing, although resulting slice is always declared nonsymmetric
- dot function for python corrected
- universal functions in python implemented and working
Cyclops v1.5.0
The v1.5.0 release includes the following developments:
- Python support
  - via Cython, also requires numpy
  - ctf.tensor interface follows numpy.ndarray
  - see updated github page and docs for functionality supported
- ScaLAPACK interoperability
  - extension and fixes for conversion functions between ScaLAPACK matrices and cyclops matrices
  - simple interface to ScaLAPACK SVD for cyclops matrices
  - automatic ScaLAPACK build via --build-scalapack flag
- HPTT support
  - tensor transposition performance improved drastically when building with the high performance tensor transpose (HPTT) library developed by Paul Springer
  - tensor transposition semantics changed a bit for contractions
  - HPTT can be automatically built with configure flag --build-hptt
- Hadamard products and batched BLAS
  - pure Hadamard products like c["ij"]+=a["ij"]*b["ij"]; recognized and done by simple loops
  - batched BLAS used when possible, including transposition to appropriate ordering of tensor modes
  - fast batched BLAS routines supported via MKL library
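For reference, the pure Hadamard product c["ij"]+=a["ij"]*b["ij"] amounts to an elementwise multiply-accumulate with no summed index. The following plain-Python loop is a sketch of those semantics on nested lists, not the library's kernel, and the function name hadamard_accumulate is hypothetical.

```python
def hadamard_accumulate(c, a, b):
    """In-place c[i][j] += a[i][j] * b[i][j] for equally shaped matrices."""
    for i in range(len(c)):
        for j in range(len(c[i])):
            # No index is summed over, so a simple loop nest suffices.
            c[i][j] += a[i][j] * b[i][j]
    return c


if __name__ == "__main__":
    c = [[1, 1], [1, 1]]
    a = [[1, 2], [3, 4]]
    b = [[5, 6], [7, 8]]
    print(hadamard_accumulate(c, a, b))
```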
Version 1.4.2
various bug fixes, support for input/output from block-cyclic (ScaLAPACK descriptor) layouts, reorganization of examples and addition of new ones (spectral element, algebraic multigrid, correct FFT)
CTF v1.4.1: major bug fixes to memory accounting in sparse contractions and construction of tensors with predefined processor grid mapping
Substantially extended sparse contraction capabilities
It is now possible to contract two sparse tensors, into a potentially sparse output. The release also includes bug fixes, improvements to start-up time at larger parallel scale, capability for writing tensors to disk via MPI_I/O, and improvements for multitype contraction/summation support. Semantics of Transform with sparse output changed for summations to only modify existing sparse elements rather than create new nonzeros.
Strong and weak scalability plots of this version for the sparse MP3 code in examples/sparse_mp3.cxx are below
spmp3_ss_edison_jul2016.pdf
spmp3_ws_edison_jul2016.pdf