v0.4.0
The 0.4 release adds support for pretrained models to the library via `keras_nlp.models`. You can read an introduction to the new API in our Getting Started Guide.
If you encounter any problems or have questions, please open an issue!
Breaking Changes
- Renamed `keras_nlp.layers.MLMHead` to `keras_nlp.layers.MaskedLMHead`.
- Renamed `keras_nlp.layers.MLMMaskGenerator` to `keras_nlp.layers.MaskedLMMaskGenerator`.
- Renamed `keras_nlp.layers.UnicodeCharacterTokenizer` to `keras_nlp.layers.UnicodeCodepointTokenizer`.
- Switched the default of `lowercase` in `keras_nlp.tokenizers.WordPieceTokenizer` from `True` to `False`.
- Renamed the token id output of `MaskedLMMaskGenerator` from `"tokens"` to `"token_ids"`.
Summary
- Added the `keras_nlp.models` API.
  - Added support for BERT, DistilBERT, RoBERTa, and XLM-RoBERTa models and pretrained checkpoints.
  - See our Getting Started Guide for more details.
- Added new metrics: `keras_nlp.metrics.Bleu` and `keras_nlp.metrics.EditDistance`.
- Added new vocabulary training utilities: `keras_nlp.tokenizers.compute_word_piece_vocabulary` and `keras_nlp.tokenizers.compute_sentence_piece_proto`.
- Added new preprocessing layers: `keras_nlp.layers.RandomSwap` and `keras_nlp.layers.RandomDeletion`.
What's Changed
- Add Edit Distance Metric by @abheesht17 in #231
- Minor fix to simplify and test handling of max_length prompts by @jbischof in #258
- Remove split regex args for WordPieceTokenizer by @mattdangerw in #255
- Add instructions on installing the latest changes by @mattdangerw in #261
- Add warning when k > vocab_size in top_k_search by @jbischof in #260
- Fix keras library imports and usage by @jbischof in #262
- Add BLEU Score by @abheesht17 in #222
- Configure GKE-based accelerator testing by @chenmoneygithub in #265
- Added WordPieceTokenizer training function by @jessechancy in #256
- Add requirements.txt for cloud build by @chenmoneygithub in #267
- Global Seed Bug Fix by @jessechancy in #269
- Update accelerator testing to use the new GCP project by @chenmoneygithub in #272
- Fixed typo: "recieved" by @ehrencrona in #273
- Reuse dense pooled output for fine tuning by @mattdangerw in #251
- Simplify BERT modeling, use keras embeddings by @mattdangerw in #253
- Rename UnicodeCharacterTokenizer -> UnicodeCodepointTokenizer by @mattdangerw in #254
- Add README for accelerator testing config folder by @chenmoneygithub in #276
- Random Deletion Layer by @aflah02 in #214
- Made trainer more efficient. Loading full files instead of using TextLineDataset. by @jessechancy in #280
- Use KerasNLP for BERT preprocessing for GLUE by @mattdangerw in #252
- Minor fixes to the Random Deletion Layer by @aflah02 in #286
- Fixes for WordPieceTrainer by @aflah02 in #293
- Update default to strip_accents=False by @jessechancy in #289
- Move Bert to models folder by @jbischof in #288
- Make Decoding Functions Graph-compatible (with XLA Support!) by @abheesht17 in #271
- SentencePieceTrainer by @aflah02 in #281
- Rename `models.Bert()` to `models.BertCustom()` by @jbischof in #310
- Add a test for variable sequence length inputs by @mattdangerw in #313
- Support checkpoint loading for `BertBase` by @jbischof in #299
- RoBERTa pretrained model forward pass by @jessechancy in #304
- Register objects as serializable by @mattdangerw in #292
- Style merging for Bert and Roberta by @jbischof in #315
- Streamline and speed up tests by @jbischof in #324
- Add Support for CJK Char Splitting for WordPiece Tokenizer by @abheesht17 in #318
- Clean up model input names for consistency by @mattdangerw in #327
- Return a single tensor from roberta by @mattdangerw in #328
- BERT, RoBERTa: Add `model.compile` UTs by @abheesht17 in #330
- Continue rename of bert model inputs by @mattdangerw in #329
- Text Generation Utilities: Add Support for Ragged Inputs by @abheesht17 in #300
- `bert_base_zh`, `bert_base_multi_cased`: Add BERT Base Variants by @abheesht17 in #319
- WordPiece vocabularies trainer on Wikipedia dataset by @jessechancy in #316
- Use the exported ragged ops for RandomDeletion by @mattdangerw in #332
- Random Swap Layer by @aflah02 in #224
- Fixes for Random Deletion Layer by @aflah02 in #339
- Move cloudbuild to a hidden directory by @mattdangerw in #345
- Fix the build by @mattdangerw in #349
- Migrating from Datasets to TFDS for GLUE Example by @aflah02 in #340
- Move network_tests into keras_nlp/ by @mattdangerw in #344
- Stop hardcoding 2.9 by @mattdangerw in #351
- Add BERT Large by @abheesht17 in #331
- Add normalize_first arg to Transformer Layers by @abheesht17 in #350
- Add Small BERT Variants by @abheesht17 in #338
- Beam Search: Add Ragged and XLA Support by @abheesht17 in #341
- Fix download paths for bert weights by @mattdangerw in #356
- Add a BertPreprocessor class by @mattdangerw in #343
- Text Generation Functions: Add Benchmark Script by @abheesht17 in #342
- Improve readability for encoder/decoder blocks by @mattdangerw in #353
- Add GPT-2 Model and its Variants by @abheesht17 in #354
- Clean up BERT, RoBERTa doc-strings by @abheesht17 in #359
- Create unique string id for each BERT backbone by @jbischof in #361
- Use model.fit() for BERT Example by @abheesht17 in #360
- Minor Fixes in BertPreprocessor Layer by @abheesht17 in #373
- Clone user passed initializers called multiple times by @mattdangerw in #371
- Update BERT model file structure by @mattdangerw in #376
- Move gpt model code into a directory by @mattdangerw in #379
- Move roberta model code into a directory by @mattdangerw in #380
- Reorg test directories by @mattdangerw in #384
- Add XLM-RoBERTa by @abheesht17 in #372
- Add DistilBERT by @abheesht17 in #382
- Stop running CI on Windows by @mattdangerw in #386
- Fix Bert serialization by @mattdangerw in #385
- Improve MacOS support and pin tensorflow version during testing by @mattdangerw in #383
- Unify BERT model API in one class by @jbischof in #387
- Add `from_preset` constructor to `BertPreprocessor` by @jbischof in #390
- More robustly test BERT preprocessing by @mattdangerw in #394
- Move `name` and `trainable` to `kwargs` by @jbischof in #399
- Add `backbone` as `property` for task models by @jbischof in #398
- Set default name of `Bert` instance to `"backbone"` by @jbischof in #397
- Fix gpt2 serialization by @mattdangerw in #391
- Fix distilbert serialization by @mattdangerw in #392
- Fix roberta and xlm-roberta serialization by @mattdangerw in #393
- Register the BertPreprocessor as serializable by @mattdangerw in #401
- BPE tokenizer by @chenmoneygithub in #389
- Change GPT-2's Format to Mirror BERT's by @abheesht17 in #418
- Fix bert preprocessing docstring so it is runnable by @mattdangerw in #421
- Change RoBERTa and XLM-RoBERTa's Format to Mirror BERT's by @abheesht17 in #417
- Update distilbert to mirror recent bert changes by @mattdangerw in #406
- Change gpt2 to GPT2 by @sampathweb in #425
- Fix byte pair detokenization of 2d arrays by @mattdangerw in #423
- Never pass Raggeds to user function when generating text by @mattdangerw in #424
- Add XLMRobertaClassifier by @abheesht17 in #422
- Add RobertaPreprocessor Layer by @abheesht17 in #419
- Update Style Guide for naming of Models and Layers by @sampathweb in #434
- Support String Output for BytePairTokenizer by @abheesht17 in #438
- Improve our continuous testing for model presets by @mattdangerw in #357
- Remove remote files from BPE docstring by @jbischof in #440
- Add DistilBertClassifier by @abheesht17 in #437
- Remove lingering reference to BertCustom by @mattdangerw in #441
- Add XLM-RoBERTa Tokenizer (SPM) by @abheesht17 in #428
- Add a disclaimer for use of model checkpoints by @mattdangerw in #430
- Add a disclaimer to our README by @mattdangerw in #431
- Fix our BERT GLUE example so it runs again by @mattdangerw in #444
- Add backbone presets to task classes by @jbischof in #448
- Split the Bert tokenizer to a separate class by @mattdangerw in #449
- Conditionally import tf text by @mattdangerw in #452
- Copy our model disclaimer to the distilbert classifier by @mattdangerw in #453
- Fix regex string for BPE by @chenmoneygithub in #458
- Fix docstrings and add note to style guide by @jbischof in #464
- Allow formatting our docstrings inline by @mattdangerw in #450
- Update self.assertEquals with self.assertEqual by @MaximSmolskiy in #466
- Document our release process by @mattdangerw in #473
- Add RobertaTokenizer by @abheesht17 in #468
- Add DistilBertTokenizer by @abheesht17 in #469
- Modify XLMRobertaTokenizer to Match BERT by @abheesht17 in #471
- Add GPT2Tokenizer by @abheesht17 in #470
- Minor fix to git commands by @mattdangerw in #475
- Version bump to 0.4.0 by @mattdangerw in #476
- Clarify comment on BERT preset testing by @mattdangerw in #477
- Fix the nightly build by @mattdangerw in #484
- Bump tf and tf-text to 2.11 by @mattdangerw in #490
- Consolidate preset testing by @mattdangerw in #480
- Allow BertPreprocessor to map labeled datasets by @mattdangerw in #478
- Glue eval script by @chenmoneygithub in #445
- Update Requirements and Python version in setup.py by @sampathweb in #495
- First task-level preset with `BertClassifier` by @jbischof in #494
- Add a helper model to automatically apply preprocessing by @mattdangerw in #346
- Add GPT2 Presets by @abheesht17 in #472
- fix incorrect flag by @chenmoneygithub in #496
- Add instructions on how to update deps of GPU testing by @chenmoneygithub in #499
- Add XLM-RoBERTa Presets by @abheesht17 in #482
- Add DistilBERT Presets by @abheesht17 in #479
- Add RoBERTa Presets by @abheesht17 in #506
- Fix nightly builds by @mattdangerw in #522
- Remove typo by @jbischof in #515
- Fix Model Doc-string Examples by @abheesht17 in #516
- Mark format.sh executable again by @mattdangerw in #518
- Make BertClassifier operate directly on raw string inputs by @mattdangerw in #485
- Fix nightlies take two by @mattdangerw in #525
- Use `tf.ones` for docstring example input by @jbischof in #524
- Fix the index order of GLUE script and other bugs by @chenmoneygithub in #517
- Standardize on "backbone" naming for BERT by @jbischof in #536
- Add dropout to BertClassifier by @chenmoneygithub in #540
- Preprocess string lists as a batch of single segments by @mattdangerw in #504
- Update model and file names for DistilBert by @ADITYADAS1999 in #541
- Rename filenames in models/ to match classnames by @mattdangerw in #548
- Split tokenizers into their own file by @mattdangerw in #549
- Add a note about our new file naming conventions by @mattdangerw in #553
- Add distribution support for GLUE script by @chenmoneygithub in #544
- Make our preset ids more consistent by @mattdangerw in #552
- Rename DistilBert -> DistilBertBackbone by @mattdangerw in #551
- Rename Roberta -> RobertaBackbone (and same for XLM*) by @mattdangerw in #550
- Fix link in glue_benchmark README by @mattdangerw in #557
- Remove qualification for `PRESET_NAMES` by @jbischof in #554
- Use `black[jupyter]` to format notebooks by @jbischof in #556
- Temporarily drop GPT2 from our init.py by @mattdangerw in #560
- Replicate #536 changing GPT2 -> GPT2Backbone by @ADITYADAS1999 in #558
- Change Backbone Names by @abheesht17 in #559
- Raise a friendly error message for unbatched input by @mattdangerw in #545
- Fix Typo in Backbone Doc-strings by @abheesht17 in #565
- Fix LinearDecayWithWarmup crash in BERT model. by @reedwm in #564
- Update XLMRobertaPreprocessor to mirror recent changes by @mattdangerw in #568
- Update RobertaPreprocessor to mirror recent changes by @mattdangerw in #567
- Fix minor typo in BertPreprocessor layer by @mattdangerw in #569
- Update DistilBertPreprocessor to mirror recent changes by @mattdangerw in #566
- Point presets to url containing "v1/" by @chenmoneygithub in #577
- Stop testing h5 saved model format, start testing keras_v3 by @mattdangerw in #521
- fix the gpu testing by @chenmoneygithub in #581
- Make DistilBertClassifier operate directly on raw string inputs by @mattdangerw in #578
- Make RobertaClassifier operate directly on raw string inputs by @chenmoneygithub in #579
- Export XLM-Roberta Classifier by @mattdangerw in #580
- Make XLMRobertaClassifier operate directly on raw string inputs by @mattdangerw in #583
- Fix some misc typos for distilbert by @mattdangerw in #584
- Report GLUE score and hyperparameter settings by @chenmoneygithub in #585
- Remove support for Python 3.7 to align with keras-nightly by @sampathweb in #590
- Add DeBERTa v3 Model by @abheesht17 in #435
- Add DeBERTa Tokenizer and Preprocessor Classes by @abheesht17 in #589
- Add Dropout to *Classifier Doc-strings by @abheesht17 in #595
- Rename MLM -> MaskedLM for all library symbols by @mattdangerw in #598
- File-level Doc-string Changes for Classifiers and Presets by @abheesht17 in #604
- Add DebertaClassifier and DeBERTa Presets by @abheesht17 in #594
- Rename Deberta -> DebertaV3 by @mattdangerw in #605
- Version bump to 0.4.0.dev0 by @mattdangerw in #609
- Add mixed precision support for glue script by @chenmoneygithub in #608
- Remove deberta from 0.4 release by @chenmoneygithub in #607
- Update README for v0.4 by @jbischof in #588
- Remove the dev prefix for final release by @mattdangerw in #613
- Rename preset IDs for consistency by @mattdangerw in #612
New Contributors
- @jbischof made their first contribution in #258
- @ehrencrona made their first contribution in #273
- @sampathweb made their first contribution in #425
- @MaximSmolskiy made their first contribution in #466
- @ADITYADAS1999 made their first contribution in #541
- @reedwm made their first contribution in #564
Full Changelog: v0.3.0...v0.4.0