Add lit-llama benchmarks (logits, autoregressive generation, lora fine tuning) #1730
Conversation
Some minor nits, otherwise thanks! Let's see how long CI will take now lol
```python
def train(self):
    logits = self.model(*self.example_inputs)
    logits.sum().backward()
    # meh this sucks
```
xd, this might be a good dataset: https://huggingface.co/datasets/OpenAssistant/oasst1
Even finetuning on two examples of questions you make up might not be bad as a sanity check.
Will fix this later, I think. Not needed for dynamo benchmarks.
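For reference, a rough sketch of the sanity check suggested above: overfit on two hand-written prompt/answer pairs and confirm the loss drops. Here `model` and `tokenizer` stand in for the lit-llama objects set up elsewhere in this benchmark, and `eos=True` is an assumption about the tokenizer API; none of this code is in the PR.

```python
import torch
import torch.nn.functional as F

pairs = [
    ("What is 2 + 2?", "4"),
    ("Name a primary color.", "Red"),
]

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
for _ in range(10):  # a handful of steps is enough to see the loss fall
    for prompt, answer in pairs:
        ids = tokenizer.encode(prompt + " " + answer, bos=True, eos=True)
        logits = model(ids[:-1].unsqueeze(0))           # next-token prediction
        loss = F.cross_entropy(logits[0], ids[1:].long())
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```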
```python
def eval(self):
    self.model.eval()
    with torch.no_grad():
        y = self.model(*self.example_inputs)
```
Do you mind printing the input prompt and the output? It will be nice to do vibe checks later.
Hmm, but I don't want to print it here, because then the detokenization would also count as part of the benchmark?
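One way to square the two (a sketch, not code from this PR): time only the generation call and decode afterwards, so the vibe-check printing never lands inside the measured region. The `tokenizer.decode` usage is an assumption about the lit-llama Tokenizer API.

```python
import time

def timed_generate(model, example_inputs, tokenizer):
    start = time.perf_counter()
    tokens = model(*example_inputs)        # timed: generation only
    elapsed = time.perf_counter() - start
    # Detokenization (and printing) stays outside the timed region.
    print(tokenizer.decode(tokens))
    return elapsed
```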
```python
self.model = GenerationWrapper(self.model)
tokenizer = Tokenizer(os.path.join(LIT_LLAMA_PATH, "checkpoints/lit-llama/tokenizer.model"))
# max_new_tokens matches lit-llama/generate.py
self.example_inputs = (tokenizer.encode("The meaning of life is", bos=True, eos=False, device=device), 50)
```
is 50 the max number of tokens to generate?
yes
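For readers unfamiliar with the parameter: `max_new_tokens` caps how many tokens get appended to the prompt, regardless of prompt length. An illustrative greedy loop, not the actual lit-llama `generate()` implementation:

```python
import torch

@torch.no_grad()
def greedy_generate(model, idx, max_new_tokens=50):
    # idx: 1-D tensor of prompt token ids; at most max_new_tokens are added.
    for _ in range(max_new_tokens):
        logits = model(idx.unsqueeze(0))[0, -1]            # last-position logits
        next_token = torch.argmax(logits, dim=-1, keepdim=True)
        idx = torch.cat([idx, next_token])
    return idx
```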
LGTM. How large is the checkpoint file, and are there any rules on access frequency? If we download it too frequently (every CI workflow and every nightly testing workflow), the server might ban our access.
@xuzhao9 this will be a common workflow for LLM work (SAM is similar today); it might make sense to cache these files in a GitHub artifact, or in an S3 bucket if GitHub has data size limits.
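A minimal sketch of the cache-on-miss pattern being proposed, assuming a stable checkpoint URL; `CHECKPOINT_URL` and `CACHE_DIR` are made-up placeholders, not anything this repo defines.

```python
import os
import urllib.request

CACHE_DIR = os.path.expanduser("~/.cache/torchbenchmark")          # placeholder
CHECKPOINT_URL = "https://example.com/lit-llama/7B/lit-llama.pth"  # placeholder

def cached_checkpoint_path() -> str:
    path = os.path.join(CACHE_DIR, os.path.basename(CHECKPOINT_URL))
    if not os.path.exists(path):
        # Only hit the upstream server on a cache miss, so CI runs
        # don't re-download the checkpoint on every workflow.
        os.makedirs(CACHE_DIR, exist_ok=True)
        urllib.request.urlretrieve(CHECKPOINT_URL, path)
    return path
```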
Do we have any precedent for hosting it in S3? I am happy to set it up if there is some example of doing it.
I am not sure whether that requires legal review.
So, it seems like we can only run this benchmark on the A100s anyway, so I'm going to disable the A10G configuration.
LGTM, see minor inline comments
```python
class Model(BenchmarkModel):
    task = NLP.LANGUAGE_MODELING
    DEFAULT_EVAL_BSIZE = 1
    DEFAULT_TRAIN_BSIZE = 32
```
Curious why the default train batch size is 32?
I think I should just delete this; it's meaningless. You can't train 7B without some sort of distribution, haha.
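To make the "some sort of distribution" concrete: a minimal sketch assuming PyTorch FSDP and a process group already initialized (e.g. via torchrun). `build_lit_llama` is a hypothetical constructor; none of this appears in the PR.

```python
import torch
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

# Without sharding, 7B parameters plus gradients plus AdamW state do not
# fit on a single GPU; FSDP shards all three across ranks. Assumes
# torch.distributed.init_process_group was already called (e.g. by torchrun).
model = build_lit_llama()          # hypothetical constructor
model = FSDP(model)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
```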
@ezyang has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
oh thank god, pr-test is finally passing