v0.4.0
The 0.4 release adds support for pretrained models to the library via `keras_nlp.models`. You can read an introduction to the new API in our Getting Started Guide.
If you encounter any problems or have questions, please open an issue!
Breaking Changes
- Renamed `keras_nlp.layers.MLMHead` to `keras_nlp.layers.MaskedLMHead`.
- Renamed `keras_nlp.layers.MLMMaskGenerator` to `keras_nlp.layers.MaskedLMMaskGenerator`.
- Renamed `keras_nlp.layers.UnicodeCharacterTokenizer` to `keras_nlp.layers.UnicodeCodepointTokenizer`.
- Switched the default of `lowercase` in `keras_nlp.tokenizers.WordPieceTokenizer` from `True` to `False`.
- Renamed the token id output of `MaskedLMMaskGenerator` from `"tokens"` to `"token_ids"`.
Summary
- Added the `keras_nlp.models` API.
  - Added support for BERT, DistilBERT, RoBERTa, and XLM-RoBERTa models and pretrained checkpoints.
  - See our Getting Started Guide for more details.
- Added new metrics: `keras_nlp.metrics.Bleu` and `keras_nlp.metrics.EditDistance`.
- Added new vocabulary training utilities: `keras_nlp.tokenizers.compute_word_piece_vocabulary` and `keras_nlp.tokenizers.compute_sentence_piece_proto`.
- Added new preprocessing layers: `keras_nlp.layers.RandomSwap` and `keras_nlp.layers.RandomDeletion`.
What's Changed
- Add Edit Distance Metric by @abheesht17 in #231
- Minor fix to simplify and test handling of max_length prompts by @jbischof in #258
- Remove split regex args for WordPieceTokenizer by @mattdangerw in #255
- Add instructions on installing the latest changes by @mattdangerw in #261
- Add warning when k > vocab_size in top_k_search by @jbischof in #260
- Fix keras library imports and usage by @jbischof in #262
- Add BLEU Score by @abheesht17 in #222
- Configure GKE-based accelerator testing by @chenmoneygithub in #265
- Added WordPieceTokenizer training function by @jessechancy in #256
- Add requirements.txt for cloud build by @chenmoneygithub in #267
- Global Seed Bug Fix by @jessechancy in #269
- Update accelerator testing to use the new GCP project by @chenmoneygithub in #272
- Fixed typo: "recieved" by @ehrencrona in #273
- Reuse dense pooled output for fine tuning by @mattdangerw in #251
- Simplify BERT modeling, use keras embeddings by @mattdangerw in #253
- Rename UnicodeCharacterTokenizer -> UnicodeCodepointTokenizer by @mattdangerw in #254
- Add README for accelerator testing config folder by @chenmoneygithub in #276
- Random Deletion Layer by @aflah02 in #214
- Made trainer more efficient. Loading full files instead of using TextLineDataset. by @jessechancy in #280
- Use KerasNLP for BERT preprocessing for GLUE by @mattdangerw in #252
- Minor fixes to the Random Deletion Layer by @aflah02 in #286
- Fixes for WordPieceTrainer by @aflah02 in #293
- Update default to strip_accents=False by @jessechancy in #289
- Move Bert to models folder by @jbischof in #288
- Make Decoding Functions Graph-compatible (with XLA Support!) by @abheesht17 in #271
- SentencePieceTrainer by @aflah02 in #281
- Rename `models.Bert()` to `models.BertCustom()` by @jbischof in #310
- Add a test for variable sequence length inputs by @mattdangerw in #313
- Support checkpoint loading for `BertBase` by @jbischof in #299
- RoBERTa pretrained model forward pass by @jessechancy in #304
- Register objects as serializable by @mattdangerw in #292
- Style merging for Bert and Roberta by @jbischof in #315
- Streamline and speed up tests by @jbischof in #324
- Add Support for CJK Char Splitting for WordPiece Tokenizer by @abheesht17 in #318
- Clean up model input names for consistency by @mattdangerw in #327
- Return a single tensor from roberta by @mattdangerw in #328
- BERT, RoBERTa: Add `model.compile` UTs by @abheesht17 in #330
- Continue rename of bert model inputs by @mattdangerw in #329
- Text Generation Utilities: Add Support for Ragged Inputs by @abheesht17 in #300
- `bert_base_zh`, `bert_base_multi_cased`: Add BERT Base Variants by @abheesht17 in #319
- WordPiece vocabularies trainer on Wikipedia dataset by @jessechancy in #316
- Use the exported ragged ops for RandomDeletion by @mattdangerw in #332
- Random Swap Layer by @aflah02 in #224
- Fixes for Random Deletion Layer by @aflah02 in #339
- Move cloudbuild to a hidden directory by @mattdangerw in #345
- Fix the build by @mattdangerw in #349
- Migrating from Datasets to TFDS for GLUE Example by @aflah02 in #340
- Move network_tests into keras_nlp/ by @mattdangerw in #344
- Stop hardcoding 2.9 by @mattdangerw in #351
- Add BERT Large by @abheesht17 in #331
- Add normalize_first arg to Transformer Layers by @abheesht17 in #350
- Add Small BERT Variants by @abheesht17 in #338
- Beam Search: Add Ragged and XLA Support by @abheesht17 in #341
- Fix download paths for bert weights by @mattdangerw in #356
- Add a BertPreprocessor class by @mattdangerw in #343
- Text Generation Functions: Add Benchmark Script by @abheesht17 in #342
- Improve readability for encoder/decoder blocks by @mattdangerw in #353
- Add GPT-2 Model and its Variants by @abheesht17 in #354
- Clean up BERT, RoBERTa doc-strings by @abheesht17 in #359
- Create unique string id for each BERT backbone by @jbischof in #361
- Use model.fit() for BERT Example by @abheesht17 in #360
- Minor Fixes in BertPreprocessor Layer by @abheesht17 in #373
- Clone user passed initializers called multiple times by @mattdangerw in #371
- Update BERT model file structure by @mattdangerw in #376
- Move gpt model code into a directory by @mattdangerw in #379
- Move roberta model code into a directory by @mattdangerw in #380
- Reorg test directories by @mattdangerw in #384
- Add XLM-RoBERTa by @abheesht17 in #372
- Add DistilBERT by @abheesht17 in #382
- Stop running CI on Windows by @mattdangerw in #386
- Fix Bert serialization by @mattdangerw in #385
- Improve MacOS support and pin tensorflow version during testing by @mattdangerw in #383
- Unify BERT model API in one class by @jbischof in #387
- Add `from_preset` constructor to `BertPreprocessor` by @jbischof in #390
- More robustly test BERT preprocessing by @mattdangerw in #394
- Move `name` and `trainable` to `kwargs` by @jbischof in #399
- Add `backbone` as `property` for task models by @jbischof in #398
- Set default name of `Bert` instance to `"backbone"` by @jbischof in #397
- Fix gpt2 serialization by @mattdangerw in #391
- Fix distilbert serialization by @mattdangerw in #392
- Fix roberta and xlm-roberta serialization by @mattdangerw in #393
- Register the BertPreprocessor as serializable by @mattdangerw in #401
- BPE tokenizer by @chenmoneygithub in #389
- Change GPT-2's Format to Mirror BERT's by @abheesht17 in #418
- Fix bert preprocessing docstring so it is runnable by @mattdangerw in #421
- Change RoBERTa and XLM-RoBERTa's Format to Mirror BERT's by @abheesht17 in #417
- Update distilbert to mirror recent bert changes by @mattdangerw in #406
- Change gpt2 to GPT2 by @sampathweb in #425
- Fix byte pair detokenization of 2d arrays by @mattdangerw in #423
- Never pass Raggeds to user function when generating text by @mattdangerw in #424
- Add XLMRobertaClassifier by @abheesht17 in #422
- Add RobertaPreprocessor Layer by @abheesht17 in #419
- Update Style Guide for naming of Models and Layers by @sampathweb in #434
- Support String Output for BytePairTokenizer by @abheesht17 in #438
- Improve our continuous testing for model presets by @mattdangerw in #357
- Remove remote files from BPE docstring by @jbischof in #440
- Add DistilBertClassifier by @abheesht17 in #437
- Remove lingering reference to BertCustom by @mattdangerw in #441
- Add XLM-RoBERTa Tokenizer (SPM) by @abheesht17 in #428
- Add a disclaimer for use of model checkpoints by @mattdangerw in #430
- Add a disclaimer to our README by @mattdangerw in #431
- Fix our BERT GLUE example so it runs again by @mattdangerw in #444
- Add backbone presets to task classes by @jbischof in #448
- Split the Bert tokenizer to a separate class by @mattdangerw in #449
- Conditionally import tf text by @mattdangerw in #452
- Copy our model disclaimer to the distilbert classifier by @mattdangerw in #453
- Fix regex string for BPE by @chenmoneygithub in #458
- Fix docstrings and add note to style guide by @jbischof in #464
- Allow formatting our docstrings inline by @mattdangerw in #450
- Update self.assertEquals with self.assertEqual by @MaximSmolskiy in #466
- Document our release process by @mattdangerw in #473
- Add RobertaTokenizer by @abheesht17 in #468
- Add DistilBertTokenizer by @abheesht17 in #469
- Modify XLMRobertaTokenizer to Match BERT by @abheesht17 in #471
- Add GPT2Tokenizer by @abheesht17 in #470
- Minor fix to git commands by @mattdangerw in #475
- Version bump to 0.4.0 by @mattdangerw in #476
- Clarify comment on BERT preset testing by @mattdangerw in #477
- Fix the nightly build by @mattdangerw in #484
- Bump tf and tf-text to 2.11 by @mattdangerw in #490
- Consolidate preset testing by @mattdangerw in #480
- Allow BertPreprocessor to map labeled datasets by @mattdangerw in #478
- Glue eval script by @chenmoneygithub in #445
- Update Requirements and Python version in setup.py by @sampathweb in #495
- First task-level preset with `BertClassifier` by @jbischof in #494
- Add a helper model to automatically apply preprocessing by @mattdangerw in #346
- Add GPT2 Presets by @abheesht17 in #472
- fix incorrect flag by @chenmoneygithub in #496
- Add instructions on how to update deps of GPU testing by @chenmoneygithub in #499
- Add XLM-RoBERTa Presets by @abheesht17 in #482
- Add DistilBERT Presets by @abheesht17 in #479
- Add RoBERTa Presets by @abheesht17 in #506
- Fix nightly builds by @mattdangerw in #522
- Remove typo by @jbischof in #515
- Fix Model Doc-string Examples by @abheesht17 in #516
- Mark format.sh executable again by @mattdangerw in #518
- Make BertClassifier operate directly on raw string inputs by @mattdangerw in #485
- Fix nightlies take two by @mattdangerw in #525
- Use `tf.ones` for docstring example input by @jbischof in #524
- Fix the index order of GLUE script and other bugs by @chenmoneygithub in #517
- Standardize on "backbone" naming for BERT by @jbischof in #536
- Add dropout to BertClassifier by @chenmoneygithub in #540
- Preprocess string lists as a batch of single segments by @mattdangerw in #504
- Update model and file names for DistilBert by @ADITYADAS1999 in #541
- Rename filenames in models/ to match classnames by @mattdangerw in #548
- Split tokenizers into their own file by @mattdangerw in #549
- Add a note about our new file naming conventions by @mattdangerw in #553
- Add distribution support for GLUE script by @chenmoneygithub in #544
- Make our preset ids more consistent by @mattdangerw in #552
- Rename DistilBert -> DistilBertBackbone by @mattdangerw in #551
- Rename Roberta -> RobertaBackbone (and same for XLM*) by @mattdangerw in #550
- Fix link in glue_benchmark README by @mattdangerw in #557
- Remove qualification for `PRESET_NAMES` by @jbischof in #554
- Use `black[jupyter]` to format notebooks by @jbischof in #556
- Temporarily drop GPT2 from our init.py by @mattdangerw in #560
- Replicate #536 changing GPT2 -> GPT2Backbone by @ADITYADAS1999 in #558
- Change Backbone Names by @abheesht17 in #559
- Raise a friendly error message for unbatched input by @mattdangerw in #545
- Fix Typo in Backbone Doc-strings by @abheesht17 in #565
- Fix LinearDecayWithWarmup crash in BERT model. by @reedwm in #564
- Update XLMRobertaPreprocessor to mirror recent changes by @mattdangerw in #568
- Update RobertaPreprocessor to mirror recent changes by @mattdangerw in #567
- Fix minor typo in BertPreprocessor layer by @mattdangerw in #569
- Update DistilBertPreprocessor to mirror recent changes by @mattdangerw in #566
- Point presets to url containing "v1/" by @chenmoneygithub in #577
- Stop testing h5 saved model format, start testing keras_v3 by @mattdangerw in #521
- fix the gpu testing by @chenmoneygithub in #581
- Make DistilBertClassifier operate directly on raw string inputs by @mattdangerw in #578
- Make RobertaClassifier operate directly on raw string inputs by @chenmoneygithub in #579
- Export XLM-Roberta Classifier by @mattdangerw in #580
- Make XLMRobertaClassifier operate directly on raw string inputs by @mattdangerw in #583
- Fix some misc typos for distilbert by @mattdangerw in #584
- Report GLUE score and hyperparameter settings by @chenmoneygithub in #585
- Remove support for Python 3.7 to align with keras-nightly by @sampathweb in #590
- Add DeBERTa v3 Model by @abheesht17 in #435
- Add DeBERTa Tokenizer and Preprocessor Classes by @abheesht17 in #589
- Add Dropout to *Classifier Doc-strings by @abheesht17 in #595
- Rename MLM -> MaskedLM for all library symbols by @mattdangerw in #598
- File-level Doc-string Changes for Classifiers and Presets by @abheesht17 in #604
- Add DebertaClassifier and DeBERTa Presets by @abheesht17 in #594
- Rename Deberta -> DebertaV3 by @mattdangerw in #605
- Version bump to 0.4.0.dev0 by @mattdangerw in #609
- Add mixed precision support for glue script by @chenmoneygithub in #608
- Remove deberta from 0.4 release by @chenmoneygithub in #607
- Update README for v0.4 by @jbischof in #588
- Remove the dev prefix for final release by @mattdangerw in #613
- Rename preset IDs for consistency by @mattdangerw in #612
New Contributors
- @jbischof made their first contribution in #258
- @ehrencrona made their first contribution in #273
- @sampathweb made their first contribution in #425
- @MaximSmolskiy made their first contribution in #466
- @ADITYADAS1999 made their first contribution in #541
- @reedwm made their first contribution in #564
Full Changelog: v0.3.0...v0.4.0