Llama2 handler #425

margaretqian · 2023-07-27T18:25:19Z

No description provided.

nik-mosaic · 2023-07-30T19:53:55Z

examples/inference-deployments/llama2/llama2_handler.py

+
+import deepspeed
+import torch
+from transformers import (AutoModelForCausalLM, AutoTokenizer,


I don't see AutoModel or AutoTokenizer used anywhere.

asfandyarq · 2023-07-31T18:08:10Z

examples/inference-deployments/llama2/llama2_handler.py

+
+    INPUT_KEY = 'input'
+    PARAMETERS_KEY = 'parameters'
+    MODEL_DTYPE = 'fp16'


asfandyarq · 2023-07-31T18:13:50Z

examples/inference-deployments/llama2/llama2_handler.py

+    def __init__(
+        self,
+        model_name_or_path: str,
+    ):


Could we add a max_output_tokens parameter here, or in the Go server? We'd then clip user max_length values in requests to this.

This is a slow model and we'll probably need to limit it globally to 128 tokens. Otherwise someone could send a super long request and tie up the entire server for >30 sec.
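
A minimal sketch of the clipping suggested above, not the handler's actual code. The helper name `clip_generation_params` and the constant `DEFAULT_MAX_OUTPUT_TOKENS` are hypothetical. It assumes the request's `parameters` dict carries an optional integer `max_length`, and uses the 128-token limit from the comment as the default cap.

```python
from typing import Any, Dict

# Hypothetical server-wide cap; 128 is the value floated in the review comment.
DEFAULT_MAX_OUTPUT_TOKENS = 128


def clip_generation_params(
    parameters: Dict[str, Any],
    max_output_tokens: int = DEFAULT_MAX_OUTPUT_TOKENS,
) -> Dict[str, Any]:
    """Return a copy of `parameters` with `max_length` clipped to the cap.

    Assumes `max_length`, if present, is an integer-like value supplied by
    the user. A missing value falls back to the cap itself.
    """
    clipped = dict(parameters)  # copy so the caller's dict is not mutated
    requested = int(clipped.get('max_length', max_output_tokens))
    # Keep the value within [1, max_output_tokens].
    clipped['max_length'] = max(1, min(requested, max_output_tokens))
    return clipped


if __name__ == '__main__':
    print(clip_generation_params({'max_length': 4096, 'temperature': 0.7}))
```

The handler could take `max_output_tokens` in `__init__` and call a helper like this on each request's parameters before generation. Doing the same check in the Go server instead would reject or clip oversized requests before they reach the model.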

margaretqian added 7 commits July 27, 2023 11:21

wip

dd4957d

llama2 yaml

f8fc3c7

rename

60b5c05

cleanup

8a56a56

loading works

eaa5099

loading works, debugging request

c134cb3

works!

61accf8

margaretqian requested review from asfandyarq, nik-mosaic and dskhudia July 29, 2023 07:28

margaretqian changed the title ~~[WIP] Llama2 handler~~ Llama2 handler Jul 29, 2023

margaretqian marked this pull request as ready for review July 29, 2023 07:29

nik-mosaic reviewed Jul 30, 2023

nik-mosaic approved these changes Jul 30, 2023

asfandyarq approved these changes Jul 31, 2023

margaretqian closed this Sep 21, 2023
