NouamaneTazi committed May 13, 2024
1 parent 4280ac3 commit d3d991e
Showing 1 changed file with 14 additions and 0 deletions.
14 changes: 14 additions & 0 deletions README.md
@@ -63,6 +63,20 @@ torchrun --nproc_per_node=1 run_generate.py --ckpt-path checkpoints/10/ --tp 1 -
# We could set a larger TP for faster generation, and a larger PP in case of very large models.
```

### Custom examples
You can find more examples in the [`/examples`](/examples) directory:
<!-- Make a table of the examples we support -->
| Example | Description |
| --- | --- |
| `custom-dataloader` | Plug a custom dataloader into nanotron |
| `datatrove` | Use the datatrove library to load data |
| `doremi` | Use DoReMi to speed up training |
| `mamba` | Train an example Mamba model |
| `moe` | Train an example Mixture-of-Experts (MoE) model |
| `mup` | Use spectral µTransfer to scale up your model |

We're working on adding more examples. Feel free to open a PR to add your own example. 🚀


## Features
We currently support the following features:
- [x] 3D parallelism (DP+TP+PP)