Release Marian v1.9.0 · marian-nmt/marian-dev

An option to print cached variables from CMake
Add support for compiling on Mac (and clang)
An option for resetting stalled validation metrics
Add CMake options to disable compilation for specific GPU SM types
An option to print word-level translation scores
An option to turn off automatic detokenization from SentencePiece
Separate quantization types for 8-bit FBGEMM for AVX2 and AVX512
Sequence-level unlikelihood training
Allow file name templated valid-translation-output files
Support for lexical shortlists in marian-server
Support for 8-bit matrix multiplication with FBGEMM
CMakeLists.txt now looks for SSE 4.2
Purging of finished hypotheses during beam-search. A lot faster for large batches.
Faster option look-up, up to 20-30% faster translation
Added --cite and --authors flags
Added optional support for ccache
Switch to change abort to exception, only to be used in library mode
Support for 16-bit packed models with FBGEMM
Multiple separated parameter types in ExpressionGraph, currently inference-only
Safe handling of SIGTERM signal
Automatic vectorization of elementwise operations on CPU for tensor dims that
are divisible by 4 (AVX) and 8 (AVX2)
Replacing std::shared_ptr with custom IntrusivePtr for small objects like
Tensors, Hypotheses and Expressions.
Fp16 inference working for translation
Gradient-checkpointing
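
To illustrate the IntrusivePtr entry above: an intrusive pointer keeps the reference count inside the object, so the handle is a single raw pointer and no separate control block is allocated. The following is a minimal sketch of that general technique only, not Marian's actual IntrusivePtr; the names `RefCounted`, `IntrusiveRef`, `Hypothesis` and `sharedCountDemo` are hypothetical and chosen for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical base class: the reference count is stored in the object itself.
// The counter is a plain integer, so this sketch is not thread-safe.
class RefCounted {
public:
  void retain() { ++refs_; }
  // Drops one reference and returns true when no references remain.
  bool release() { return --refs_ == 0; }
  std::size_t refCount() const { return refs_; }

private:
  std::size_t refs_ = 0;
};

// Hypothetical handle type: holds only a raw pointer; T must derive from RefCounted.
template <class T>
class IntrusiveRef {
public:
  IntrusiveRef() = default;
  explicit IntrusiveRef(T* p) : ptr_(p) { if (ptr_) ptr_->retain(); }
  IntrusiveRef(const IntrusiveRef& other) : ptr_(other.ptr_) { if (ptr_) ptr_->retain(); }
  IntrusiveRef(IntrusiveRef&& other) noexcept : ptr_(other.ptr_) { other.ptr_ = nullptr; }
  // Copy-and-swap covers both copy and move assignment.
  IntrusiveRef& operator=(IntrusiveRef other) noexcept {
    std::swap(ptr_, other.ptr_);
    return *this;
  }
  // The last handle to go away deletes the object.
  ~IntrusiveRef() { if (ptr_ && ptr_->release()) delete ptr_; }

  T* operator->() const { assert(ptr_ != nullptr); return ptr_; }
  T* get() const { return ptr_; }

private:
  T* ptr_ = nullptr;
};

// Small example payload, standing in for something like a hypothesis.
struct Hypothesis : RefCounted {
  explicit Hypothesis(float s) : score(s) {}
  float score;
};

// The handle is exactly one pointer wide.
static_assert(sizeof(IntrusiveRef<Hypothesis>) == sizeof(Hypothesis*),
              "handle is pointer-sized");

// Usage example: two handles share one object, so the embedded count is 2.
inline std::size_t sharedCountDemo() {
  IntrusiveRef<Hypothesis> a(new Hypothesis(-1.5f));
  IntrusiveRef<Hypothesis> b = a;  // copying bumps the embedded counter
  return a->refCount();
}
```

Because the counter in this sketch is a plain integer, sharing handles across threads would need an atomic counter or external synchronization.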

Replace value for INVALID_PATH_SCORE with std::numeric_limits<float>::lowest()
to avoid overflow with long sequences
Break up potential circular references for GraphGroup*
Fix empty source batch entries with batch purging
Clear RNN cache in transformer model, add correct hash functions to nodes
Gather-operation for all index sizes
Fix word weighting with max length cropping
Fixed compilation on CPUs without support for AVX
FastOpt now reads "n" and "y" values as strings, not as boolean values
Fixed multiple reduction kernels on GPU
Fixed guided-alignment training with cross-entropy
Replace IntrusivePtr with std::unique_ptr in FastOpt, fixes random segfaults
due to thread-non-safety of reference counting.
Make sure that items are 256-byte aligned during saving
Make explicit matmul functions respect setting of cublasMathMode
Fix memory mapping for mixed parameter models
Removed naked pointer and potential memory-leak from file_stream.{cpp,h}
Fix compilation for GCC >= 7, which failed due to an exception thrown in a destructor
Sort parameters by lexicographical order during allocation to ensure consistent
memory-layout during allocation, loading, saving.
Output empty line when input is empty line. Previous behavior might result in
hallucinated outputs.
Compilation with CUDA 10.1
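
Two of the fixes above concern memory layout: items are 256-byte aligned during saving, and parameters are sorted lexicographically so that allocation, loading and saving agree on the layout. The sketch below shows one way such a layout could be computed, under the assumption that each item simply starts at the next 256-byte boundary. It is not Marian's code; `kAlignment`, `alignUp` and `layoutOffsets` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Assumed alignment boundary, taken from the "256-byte aligned" entry above.
constexpr std::size_t kAlignment = 256;

// Round n up to the next multiple of kAlignment.
constexpr std::size_t alignUp(std::size_t n) {
  return (n + kAlignment - 1) / kAlignment * kAlignment;
}

static_assert(alignUp(0) == 0, "zero stays zero");
static_assert(alignUp(1) == 256, "rounds up to the boundary");
static_assert(alignUp(256) == 256, "already aligned values are unchanged");

// Given parameter names and their sizes in bytes, return (name, offset) pairs
// in lexicographic name order, with every offset a multiple of kAlignment.
inline std::vector<std::pair<std::string, std::size_t>> layoutOffsets(
    const std::map<std::string, std::size_t>& sizes) {
  std::vector<std::pair<std::string, std::size_t>> out;
  std::size_t offset = 0;
  // std::map iterates its keys in sorted order, so the resulting layout does
  // not depend on the order in which parameters were registered.
  for (const auto& entry : sizes) {
    assert(offset % kAlignment == 0);
    out.emplace_back(entry.first, offset);
    offset = alignUp(offset + entry.second);
  }
  return out;
}
```

For example, sizes of 300, 4 and 1000 bytes for the hypothetical names `encoder_w`, `decoder_b` and `wemb` give the order `decoder_b`, `encoder_w`, `wemb` with offsets 0, 256 and 768, whatever order the entries were inserted in.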
