
v0.9.0: BERT Inference Time Cut by Half and 90% Scaling Efficiency for Distributed Training


News

Models and Scripts in v0.9

BERT

INT8 Quantization for BERT Sentence Classification and Question Answering
(#1080)! Also check out the blog post.

Enhancements to the pretraining script (#1121, #1099), a faster tokenizer for
BERT (#921, #1024), and multi-GPU support for SQuAD fine-tuning (#1079).

Make BERT a HybridBlock (#877).
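
Since BERT is now a HybridBlock, the model can be hybridized and, after one
forward pass, exported for use from other MXNet language bindings. Below is a
minimal sketch, assuming the v0.9 model zoo name bert_12_768_12 and the
(token ids, segment ids, valid length) call convention used in the fine-tuning
scripts:

```python
import mxnet as mx
import gluonnlp as nlp

# Load the pre-trained 12-layer BERT base model from the v0.9 model zoo.
model, vocab = nlp.model.get_model(
    'bert_12_768_12',
    dataset_name='book_corpus_wiki_en_uncased',
    pretrained=True,
    use_pooler=True, use_decoder=False, use_classifier=False)
model.hybridize(static_alloc=True)

# Tokenize one sentence with the BERT tokenizer and pad/truncate to 128 tokens.
tokenizer = nlp.data.BERTTokenizer(vocab, lower=True)
transform = nlp.data.BERTSentenceTransform(tokenizer, max_seq_length=128, pair=False)
token_ids, valid_length, segment_ids = transform(['GluonNLP makes BERT easy to use.'])

# Forward pass: returns the per-token encodings and the pooled [CLS] encoding.
seq_encoding, cls_encoding = model(
    mx.nd.array([token_ids]),
    mx.nd.array([segment_ids]),
    mx.nd.array([valid_length]))

# After one forward pass, the hybridized graph can be exported (symbol + params)
# for consumption from other MXNet language bindings.
model.export('bert_12_768_12')
```

The same hybridize-and-export pattern applies to the other models that became
HybridBlocks in this release, e.g. GPT2 (see below).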

XLNet

The XLNet model introduced by Yang, Zhilin, et al. in
"XLNet: Generalized Autoregressive Pretraining for Language Understanding"
is now available in GluonNLP. The model was converted from the original repository (#866).

GluonNLP further provides scripts for finetuning XLNet on the GLUE (#995) and
SQuAD datasets (#1130) that reproduce the authors' results. Check out the usage.

DistilBERT

The DistilBERT model introduced by Sanh, Victor, et al. in
"DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter"
is now available (#922).

Transformer

Add a separate Transformer inference script to make inference easy and to make
it convenient to analyze the performance of Transformer inference (#852).

Korean BERT

Pre-trained Korean BERT is available as part of GluonNLP (#1057).

RoBERTa

GluonNLP now provides scripts for finetuning RoBERTa (#931).

GPT2

GPT2 is now a HybridBlock, so the model can be exported for running from other
MXNet language bindings (#1010).

New Features

  • Add NamedTuple + Dict batchify (#959); see the sketch after this list
  • Add even_size option to split sampler (#1028)
  • Add length normalized metrics for machine translation tasks (#1095)
  • Add raw attention scores to the AttentionCell #951 (#964)
  • Add round_to feature to BERT & XLNet finetuning scripts (#1133)
  • Add stratified train_valid_split similar to sklearn.model_selection.train_test_split (#933); see the sketch after this list
  • Add SuperGlue dataset API (#858)
  • Add Multi Model Server deployment code example for developers (#1140)
  • Allow custom dropout, number of layers/units for BERT (#950)
  • Avoid race condition when downloading vocab (#1078)
  • Deprecate specifying Vocab padding, bos and eos_token as positional arguments (#945)
  • Fast multitensor adam optimizer (#1111)
  • Faster grad_global_norm for clipping (#1115)
  • Hybridizable AWDRNN/StandardRNN (#911)
  • Padding seq length to multiple of 8 in BERT model (#909)
  • Scripts for producing the figures that explain the bucketing strategy (#908)
  • Split up Seq2SeqDecoder in Seq2SeqDecoder and Seq2SeqOneStepDecoder (#976)
  • Switch CI to Python 3.5 and declare Python 3.5 support (#1009)
  • Try to use the new None feature in MXNet + Drop support for MXNet 1.5 (#967)
  • Use fused gelu operator (#1082)
  • Use softmax with length, and interleaved matmul for BERT (#1136)
  • Documentation of Model Conversion Scripts at https://gluon-nlp.mxnet.io/v0.9.x/model_zoo/conversion_tools/index.html (#922)
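
The new NamedTuple and Dict batchify helpers (#959) collate samples stored as
namedtuples or dicts by applying a per-field batchify function. A minimal
sketch of the Dict variant, assuming it takes a mapping from field name to the
batchify function for that field and returns a batch keyed the same way (the
NamedTuple variant is analogous, additionally taking the namedtuple class):

```python
from gluonnlp.data import batchify as bf

# Toy samples stored as dicts; 'token_ids' has variable length across samples.
samples = [{'token_ids': [2, 7, 9, 4], 'label': 0},
           {'token_ids': [5, 3], 'label': 1}]

# Pad the variable-length field, stack the scalar field.
batchify_fn = bf.Dict({'token_ids': bf.Pad(pad_val=0), 'label': bf.Stack()})
batch = batchify_fn(samples)
print(batch['token_ids'])  # padded (2, 4) matrix of token ids
print(batch['label'])      # stacked (2,) vector of labels
```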
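
And a sketch of the stratified split (#933), assuming the stratify argument
takes the per-sample labels used to keep the class proportions similar in the
train and validation splits:

```python
import gluonnlp as nlp

# Toy labelled dataset of (data, label) pairs with an imbalanced label distribution.
dataset = [(i, 0 if i % 10 else 1) for i in range(100)]
labels = [label for _, label in dataset]

# Split off 20% for validation while roughly preserving the label ratio.
train, valid = nlp.data.train_valid_split(dataset, valid_ratio=0.2, stratify=labels)
print(len(train), len(valid))  # sizes of the two splits
```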

Bug Fixes and code cleanup

  • Add version checker to all scripts (#930)
  • Add version checker to all tutorials (#934)
  • Add 'packaging' to requirements (#1143)
  • Adjust code owner (#923)
  • Avoid using dict for attention cell parameter creation (#1050)
  • Bump version in preparation for 0.9 release (#987)
  • Change SimVerb3500 URL to aclweb hosted version (#979)
  • Correct propagation of error codes in GluonNLP-py3-master-gpu-doc (#971)
  • Corrected np.random.randint upper limit in data.stream.py (#935)
  • Declare Python version requirement in setup.py (#927)
  • Declare more optional dependencies (#958)
  • Declare pytest seed marker in pytest.ini (#940)
  • Disable HybridBeamSearch (#1021)
  • Drop LAMB optimizer from GluonNLP in favor of MXNet version (#1116)
  • Drop unused compatibility helpers and fix doc (#928)
  • Fix #905 (#906)
  • Fix a SQuAD 2.0 evaluation bug (#907)
  • Fix argument analogy-max-vocab-size (#904)
  • Fix broken multi-head attention cell (#878)
  • Fix bugs in BERT export script (#944)
  • Fix chnsenticorp dataset download link (#873)
  • Fix file sampler for BERT (#977)
  • Fix index.rst and gpu flag in machine translation (#952)
  • Fix log in finetune_squad.py (#1001)
  • Fix parameter sharing of WeightDropParameter (#1083)
  • Fix scripts/question_answering/data_pipeline.py requiring optional package (#1013)
  • Fix the weight tie and weight sharing for AWDRNN (#1087)
  • Fix training command in Language Modeling index.rst (#1100)
  • Fix version check in train_gnmt.py and train_transformer.py (#1003)
  • Fix standard rnn weight sharing error (#1122)
  • Glue data preprocessing pipeline and bert & xlnet scripts (#1031)
  • Improve Vocab.repr if reserved_tokens or unknown_token is None (#989)
  • Improve readability (#975)
  • Improve test robustness (#960)
  • Improve the readability of the training script by replacing magic numbers with named constants (#1006)
  • Make EmbeddingCenterContextBatchify returned dtype robust to empty sentences (#954)
  • Modify the log average loss (#1103)
  • Move ICSL script out of BERT folder (#1131)
  • Move NER script out of bert folder (#1090)
  • Move ParallelBigRNN into nlp.model namespace (#1118)
  • Move get_rnn_cell out of seq2seq_encoder_decoder (#1073)
  • Mxnet version check (#1063)
  • Refactor BERT with new data preprocessing (#1124)
  • Remove NLTKMosesTokenizer in favor of SacreMosesTokenizer (#942)
  • Remove extra dropout in BERT/RoBERTa (#1022)
  • Remove outdated comment (#943)
  • Remove padding warning (#916)
  • Replace unicode comma with ascii comma (#1056)
  • Split up inheritance structure of TransformerEncoder and BERTEncoder (#988)
  • Support int32 for sampled blocks (#1106)
  • Switch batch jobs to use G4dn.2x instance (#1041)
  • TransformerXL LayerNorm eps and XLNet pretrained model config (#1005)
  • Unify BERT horovod and kvstore pre-training script (#889)
  • Update README.rst (#884)
  • Update data_api.rst (#893)
  • Update embedding script (#1046)
  • Update fp16_utils.py (#1037)
  • Update index.rst (#876)
  • Update index.rst (#891)
  • Update navbar install (#983)
  • Update numba dependency in setup.py (#941)
  • Update outdated contributor list (#963)
  • Update prepare_clean_env.sh (#998)

Documentation

  • Add comment to BERT notebook (#1026)
  • Add missing docs for nlp.utils (#936)
  • Add more documentation to XLNet scripts (#985)
  • Add section for "Clone the master branch for development" (#1075)
  • Add to toc tree depth to enable multiple level menu (#1108)
  • Cite source of pretrained parameters for bert_12_768_12 (#915)
  • Doc fix for vocab.subwords (#885)
  • Enhance vocab not found err msg (#917)
  • Fix command line examples for text classification (#874)
  • Fix math formula in docs (#920)
  • More detailed doc for CorpusBPTTBatchify (#888)
  • Release checklist (#890)
  • Remove non-existent arguments for BERT and Transformer (#946)
  • Remove py3 usage from the doc (#1077)
  • Update installation guide with selectors (#966)
  • Update mxnet version in installation doc (#1072)
  • Update pre-trained model link (#1117)
  • Update Installation instructions for source (#1146)

Continuous Integration

  • Disable SimVerb test for 14 days (#953)
  • Disable horovod test temporarily (#1030)
  • Disable known bad mxnet nightly version (#997)
  • Enable integration tests on CPU (#957)
  • Enable testing warnings with pytest and update deprecated API invocations (#980)
  • Enable timestamp in CI (#925)
  • Enable type checks and inference with pytype (#1018)
  • Fix CI (#875)
  • Preserve stderr and stdout streams in doc CI stage for Cloudwatch (#882)
  • Remove skip_master feature (#1017)
  • Switch source of MXNet nightly build (#1058)
  • Test MXNet 1.6 pre-release as part of CI pipeline (#1023)
  • Update MXNet master version tested on CI (#1113)
  • Update numba (#1096)
  • Use Cuda 10.0 MXNet build (#991)