Releases: foundation-model-stack/fms-hf-tuning
v0.4.0-rc.2
Summary of Changes
- Support for LoRA tuning of llama3 and granite (GPTBigCode) models (see the sketch below)
- Various dependency version adjustments
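LoRA target modules differ between these two architecture families: llama-style models expose separate attention projections such as q_proj and v_proj, while GPTBigCode-based granite models use a fused c_attn projection. Below is a minimal sketch using the Hugging Face peft library directly, not this repo's CLI; the model ID and hyperparameters are illustrative assumptions.

```python
# Illustrative LoRA configs for the two architecture families (not this repo's CLI).
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# llama-style models expose separate attention projections.
llama_lora = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)

# GPTBigCode-based granite models use a fused attention projection.
granite_lora = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["c_attn"],
    task_type="CAUSAL_LM",
)

# Placeholder checkpoint: substitute an actual GPTBigCode-based granite model.
model = AutoModelForCausalLM.from_pretrained("path/or/hub-id-of-granite-model")
model = get_peft_model(model, granite_lora)
model.print_trainable_parameters()
```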
What's Changed
- remove merge model for lora tuned adapters by @anhuong in #197
- Add test coverage by @tedhtchang in #171
- Install Acceleration Framework into Training Script by @fabianlim in #157
- deps: limit dependency ranges by @anhuong in #54
- Delete dependabot.yml by @tedhtchang in #207
- add dependabot.yml by @tedhtchang in #208
- Fix additional callbacks by @VassilisVassiliadis in #199
- Update trl by @alex-jw-brooks in #213
- deps: cap transformers at 4.40.2 by @anhuong in #218
Full Changelog: v0.3.0...v0.4.0-rc.2
v0.4.0-rc.1
What's Changed
- remove merge model for lora tuned adapters by @anhuong in #197
- Add test coverage by @tedhtchang in #171
- Install Acceleration Framework into Training Script by @fabianlim in #157
- deps: limit dependency ranges by @anhuong in #54
- Delete dependabot.yml by @tedhtchang in #207
- add dependabot.yml by @tedhtchang in #208
- Fix additional callbacks by @VassilisVassiliadis in #199
- Update trl by @alex-jw-brooks in #213
Full Changelog: v0.3.0...v0.4.0-rc.1
v0.3.0
Summary of Changes
- Switch to a multistage Dockerfile, which greatly reduces the size of the image
- Refactor image scripts to remove launch_training and call sft_trainer directly (see the exit-code sketch below).
  - Note that this changes the error codes returned from sft_trainer to user error code 1 and internal error code 203.
  - In addition, this affects logging, as parameter-parsing logging is moved into sft_trainer, where it is harder to view.
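Callers that previously inspected launch_training's exit status now see sft_trainer's codes directly. The sketch below shows how an outer process might interpret them; the exit-code values come from the notes above, while the module invocation path and the wrapper itself are assumptions for illustration.

```python
# Illustrative wrapper mapping sft_trainer exit codes to a readable status.
import subprocess
import sys

USER_ERROR = 1        # bad user input / configuration
INTERNAL_ERROR = 203  # unexpected internal failure

# Assumed invocation path; forward any CLI arguments to the trainer.
result = subprocess.run([sys.executable, "-m", "tuning.sft_trainer", *sys.argv[1:]])

if result.returncode == 0:
    print("training completed")
elif result.returncode == USER_ERROR:
    print("training failed: user error (check arguments and data)")
elif result.returncode == INTERNAL_ERROR:
    print("training failed: internal error (see logs)")
else:
    print(f"training exited with unexpected code {result.returncode}")
sys.exit(result.returncode)
```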
What's Changed
- Switch to multistage dockerfile by @tharapalanivel in #154
- refactor: remove launch_training and call sft_trainer directly by @anhuong in #164
- docs: consolidate configs, add kfto config by @anhuong in #170
- fix: bloom model can't run with flash-attn by @anhuong in #173
- Update README.md for Lora modules by @Ssukriti in #174
Full Changelog: v0.2.0...v0.3.0
v0.2.0
Summary of Changes
- Adds a new data_formatter_template field to format data while training from a JSON with custom fields, eliminating the need to preprocess and format data to alpaca style. Find details in the README (an illustrative example follows this list).
- Update the evaluation_strategy flag to eval_strategy
- Add evaluation data format scripts to use as reference
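To illustrate the idea, a formatter template maps fields from each JSON record into a single training text. The sketch below is a stand-in for demonstration only; the {{field}} placeholder syntax and the render helper are assumptions, not the library's actual formatter.

```python
# Illustrative rendering of a custom-field JSON record into alpaca-style text.
import re

record = {"question": "What is FSDP?", "answer": "Fully Sharded Data Parallel."}

# Hypothetical template using {{key}} placeholders that reference JSON fields.
template = "### Input:\n{{question}}\n\n### Response:\n{{answer}}"

def render(template: str, record: dict) -> str:
    # Replace each {{key}} placeholder with the matching value from the record.
    return re.sub(r"\{\{(\w+)\}\}", lambda m: str(record[m.group(1)]), template)

print(render(template, record))
```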
What's Changed
- fix: check if output dir exists by @anhuong in #160
- tests for fixing full fine tuning by @Ssukriti in #162
- Evaluation Data Format Scripts by @alex-jw-brooks in #115
- Refactor tests explicit params by @Ssukriti in #163
- update eval_strategy flag used in transformers by @anhuong in #168
- remove unused python39 from dockerfile by @jbusche in #167
- Add formatting function alpaca by @Ssukriti in #161
Pip package: pip install fms-hf-tuning==0.2.0
Full Changelog: v0.1.0...v0.2.0
v0.2.0-rc.1
Add formatting function alpaca by @Ssukriti in #161:
- utility functions to format datasets using template
- add tests and formatter as arg
- update tests to use template to avoid warnings
- update README and tests
- fix: formatter
- Update README.md
- fix imports
- fix pylint
- fix tests
- address review comments: function names
- formatting fix
- update error message
- restrict JSON fields templates
v0.1.0 - First release
Summary of Changes
- Supported and validated tuning technique: full fine-tuning using single-GPU and multi-GPU
- Multi-GPU training using the Hugging Face accelerate library, focused on FSDP
- Experimental tuning techniques:
- Single GPU Prompt tuning
- Single GPU LoRA tuning
- Scripts to allow local inference and evaluation of tuned models (a minimal inference sketch follows below)
- Build scripts for containerization of the library
- Initial trainer controller framework for controlling the trainer loop using user-defined rules and metrics
Pip package: pip install fms-hf-tuning==0.1.0
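For local inference of a LoRA-tuned checkpoint, the adapter is applied on top of the base model it was tuned from. This is a minimal sketch using the peft API directly; the paths, prompt format, and generation settings are illustrative, not this repo's scripts.

```python
# Illustrative local inference with a LoRA adapter; paths are placeholders.
import torch
from transformers import AutoTokenizer
from peft import AutoPeftModelForCausalLM

adapter_dir = "./output/checkpoint-final"  # placeholder: directory produced by tuning

# AutoPeftModelForCausalLM reads the adapter config and loads the base model it references.
model = AutoPeftModelForCausalLM.from_pretrained(adapter_dir, torch_dtype=torch.float16)
tokenizer = AutoTokenizer.from_pretrained(adapter_dir)  # assumes tokenizer saved alongside

prompt = "### Input:\nSummarize FSDP in one sentence.\n\n### Response:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```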
What's Changed
- Init by @raghukiran1224 in #1
- allows disable flash attn and torch dtype param by @Ssukriti in #2
- First refactor train by @Ssukriti in #3
- fix : the way args are passed by @Ssukriti in #10
- fix full param tuning by @lchu-ibm in #14
- fix import of aim_loader by @anhuong in #13
- fix: set model max length to either passed in or tokenizer value by @anhuong in #17
- fix: do not set model max length when loading model by @anhuong in #21
- add EOS token to dataset by @Ssukriti in #15
- Local inference by @alex-jw-brooks in #27
- feat: add validation dataset to train by @anhuong in #26
- feat: support str in target_modules for LoraConfig by @VassilisVassiliadis in #39
- Add formatting tools by @hickeyma in #31
- Enable code formatting by @hickeyma in #40
- Enable daily dependabot updates by @hickeyma in #41
- Add file logger callback & export train loss json file by @alex-jw-brooks in #22
- Merge models by @alex-jw-brooks in #32
- Local inference merged models by @alex-jw-brooks in #43
- feat: track validation loss in logs file by @anhuong in #51
- Add linting capability by @hickeyma in #52
- Add PR/Issue templates by @tedhtchang in #65
- Add sample unit tests by @tedhtchang in #61
- Initial commit for trainer image by @tharapalanivel in #69
- Adding copyright notices by @tharapalanivel in #77
- Enable pylint in the github workflow by @tedhtchang in #63
- Bump aim from 3.17.5 to 3.18.1 by @dependabot in #42
- Add Contributing file by @jbusche in #58
- docs: lora and getting modules list by @anhuong in #46
- Allow SFT_TRAINER_CONFIG_JSON_ENV_VAR to be encoded json string by @kellyaa in #82
- Document lint by @tedhtchang in #84
- Let Huggingface Properly Initialize Arguments, and Fix FSDP-LORA Checkpoint-Saves and Resumption by @fabianlim in #53
- Unit tests by @tharapalanivel in #83
- Update CONTRIBUTING.md by @Ssukriti in #86
- Update input args to max_seq_length and training_data_path by @anhuong in #94
- feat: move to accelerate launch for distributed training by @kmehant in #92
- Update README.md by @Ssukriti in #95
- Modify copyright notice by @tharapalanivel in #96
- Switches dependencies from txt file to toml file by @jbusche in #68
- fix: use attn_implementation="flash_attention_2" by @kmehant in #101
- fix: not passing PEFT argument should default to full parameter finetuning by @kmehant in #100
- feat: update launch training with accelerate for multi-gpu by @anhuong in #98
- Setting default values in training job config by @tharapalanivel in #104
- add refactored build utils into docker image by @anhuong in #108
- feat: combine train and eval loss into one file by @anhuong in #109
- docs: add note on ephemeral storage by @anhuong in #106
- Move accelerate launch args parsing by @tharapalanivel in #107
- Docs improvements by @Ssukriti in #111
- feat: add env var SET_NUM_PROCESSES_TO_NUM_GPUS by @anhuong in #110
- feat: Trainer controller framework by @seshapad in #45
- Copying logs file by @tharapalanivel in #113
- Fix copying over logs by @tharapalanivel in #114
- Add eval script by @alex-jw-brooks in #102
- Lint tests by @tharapalanivel in #112
- Move sklearn to optional, install optionals for linting by @alex-jw-brooks in #117
- Build Wheel Action by @jbusche in #105
- rstrip eos in evaluation by @alex-jw-brooks in #121
- Fix eos token suffix removal by @alex-jw-brooks in #125
- Make use of instruction field optional by @alex-jw-brooks in #123
- Deprecating the requirements.txt for dependencies management by @tedhtchang in #116
- Add unit tests for various edge cases by @alex-jw-brooks in #97
- fix typo in build gha by @jbusche in #138
- Install whl in Dockerfile by @tedhtchang in #126
- feat: add flash attn to inference and eval scripts by @anhuong in #132
- OS update in dockerfile by @jbusche in #127
- fix: ignore the build output and auto-generated files by @HarikrishnanBalagopal in #140
- Propose ADR for Training Acceleration by @fabianlim in #119
- feat: new format for the controller metrics and operations by @HarikrishnanBalagopal in #130
- adr: Format change to the trainer controller configuration by @seshapad in #128
- Generic tracker API and implementation of Aimstack tracker by @dushyantbehl in #89
- fix: Allow makefile to run test independent of fmt/lint by @dushyantbehl in #145
- feat: Trainer state as a trainer controller metric by @seshapad in #150
- Bump aim from 3.18.1 to 3.19.0 by @dependabot in #93
- fix: launch_training.py arguments with new tracker api by @dushyantbehl in #153
- feat: Exposed the evaluation metrics for rules within trainer controller by @seshapad in #146
- Comment out aim in dockerfile by @jbusche in #155
- fix: replace eval with a safer alternative by @HarikrishnanBalagopal in #147
- doc...
v0.1.0-rc.1
What's Changed
- fix: replace eval with a safer alternative by @HarikrishnanBalagopal in #147
- docs: ADR for moving from eval to simpleeval for evaluating trainer controller rules by @HarikrishnanBalagopal in #151
- Add exception catching / writing to termination log by @kellyaa in #149
- fix: merging of model for multi-gpu by @anhuong in #158
- add .complete file to output dir when done by @kellyaa in #159
Full Changelog: v0.0.2rc2...v0.1.0-rc.1
v0.0.2rc.2
What's Changed
- fix typo in build gha by @jbusche in #138
- Install whl in Dockerfile by @tedhtchang in #126
- feat: add flash attn to inference and eval scripts by @anhuong in #132
- OS update in dockerfile by @jbusche in #127
- fix: ignore the build output and auto-generated files by @HarikrishnanBalagopal in #140
- Propose ADR for Training Acceleration by @fabianlim in #119
- feat: new format for the controller metrics and operations by @HarikrishnanBalagopal in #130
- adr: Format change to the trainer controller configuration by @seshapad in #128
- Generic tracker API and implementation of Aimstack tracker by @dushyantbehl in #89
- fix: Allow makefile to run test independent of fmt/lint by @dushyantbehl in #145
- feat: Trainer state as a trainer controller metric by @seshapad in #150
- Bump aim from 3.18.1 to 3.19.0 by @dependabot in #93
- fix: launch_training.py arguments with new tracker api by @dushyantbehl in #153
- feat: Exposed the evaluation metrics for rules within trainer controller by @seshapad in #146
- Comment out aim in dockerfile by @jbusche in #155
New Contributors
- @HarikrishnanBalagopal made their first contribution in #140
- @dushyantbehl made their first contribution in #89
Full Changelog: v0.0.2rc1...v0.0.2rc2
v0.0.2rc1
What's Changed
- Init by @raghukiran1224 in #1
- allows disable flash attn and torch dtype param by @Ssukriti in #2
- First refactor train by @Ssukriti in #3
- fix : the way args are passed by @Ssukriti in #10
- fix full param tuning by @lchu-ibm in #14
- fix import of aim_loader by @anhuong in #13
- fix: set model max length to either passed in or tokenizer value by @anhuong in #17
- fix: do not set model max length when loading model by @anhuong in #21
- add EOS token to dataset by @Ssukriti in #15
- Local inference by @alex-jw-brooks in #27
- feat: add validation dataset to train by @anhuong in #26
- feat: support str in target_modules for LoraConfig by @VassilisVassiliadis in #39
- Add formatting tools by @hickeyma in #31
- Enable code formatting by @hickeyma in #40
- Enable daily dependabot updates by @hickeyma in #41
- Add file logger callback & export train loss json file by @alex-jw-brooks in #22
- Merge models by @alex-jw-brooks in #32
- Local inference merged models by @alex-jw-brooks in #43
- feat: track validation loss in logs file by @anhuong in #51
- Add linting capability by @hickeyma in #52
- Add PR/Issue templates by @tedhtchang in #65
- Add sample unit tests by @tedhtchang in #61
- Initial commit for trainer image by @tharapalanivel in #69
- Adding copyright notices by @tharapalanivel in #77
- Enable pylint in the github workflow by @tedhtchang in #63
- Bump aim from 3.17.5 to 3.18.1 by @dependabot in #42
- Add Contributing file by @jbusche in #58
- docs: lora and getting modules list by @anhuong in #46
- Allow SFT_TRAINER_CONFIG_JSON_ENV_VAR to be encoded json string by @kellyaa in #82
- Document lint by @tedhtchang in #84
- Let Huggingface Properly Initialize Arguments, and Fix FSDP-LORA Checkpoint-Saves and Resumption by @fabianlim in #53
- Unit tests by @tharapalanivel in #83
- Update CONTRIBUTING.md by @Ssukriti in #86
- Update input args to max_seq_length and training_data_path by @anhuong in #94
- feat: move to accelerate launch for distributed training by @kmehant in #92
- Update README.md by @Ssukriti in #95
- Modify copyright notice by @tharapalanivel in #96
- Switches dependencies from txt file to toml file by @jbusche in #68
- fix: use attn_implementation="flash_attention_2" by @kmehant in #101
- fix: not passing PEFT argument should default to full parameter finetuning by @kmehant in #100
- feat: update launch training with accelerate for multi-gpu by @anhuong in #98
- Setting default values in training job config by @tharapalanivel in #104
- add refactored build utils into docker image by @anhuong in #108
- feat: combine train and eval loss into one file by @anhuong in #109
- docs: add note on ephemeral storage by @anhuong in #106
- Move accelerate launch args parsing by @tharapalanivel in #107
- Docs improvements by @Ssukriti in #111
- feat: add env var SET_NUM_PROCESSES_TO_NUM_GPUS by @anhuong in #110
- feat: Trainer controller framework by @seshapad in #45
- Copying logs file by @tharapalanivel in #113
- Fix copying over logs by @tharapalanivel in #114
- Add eval script by @alex-jw-brooks in #102
- Lint tests by @tharapalanivel in #112
- Move sklearn to optional, install optionals for linting by @alex-jw-brooks in #117
- Build Wheel Action by @jbusche in #105
- rstrip eos in evaluation by @alex-jw-brooks in #121
- Fix eos token suffix removal by @alex-jw-brooks in #125
- Make use of instruction field optional by @alex-jw-brooks in #123
- Deprecating the requirements.txt for dependencies management by @tedhtchang in #116
- Add unit tests for various edge cases by @alex-jw-brooks in #97
New Contributors
- @raghukiran1224 made their first contribution in #1
- @Ssukriti made their first contribution in #2
- @lchu-ibm made their first contribution in #14
- @anhuong made their first contribution in #13
- @alex-jw-brooks made their first contribution in #27
- @VassilisVassiliadis made their first contribution in #39
- @hickeyma made their first contribution in #31
- @tedhtchang made their first contribution in #65
- @tharapalanivel made their first contribution in #69
- @dependabot made their first contribution in #42
- @jbusche made their first contribution in #58
- @kellyaa made their first contribution in #82
- @fabianlim made their first contribution in #53
- @kmehant made their first contribution in #92
- @seshapad made their first contribution in #45
Full Changelog: https://github.com/foundation-model-stack/fms-hf-tuning/commits/v.0.0.2rc1