Make kv cache pos buffer name more specific #7635
Merged
Summary
Doing this now just in case a module later appears in TorchTune that registers a buffer with the same name. In this situation we prevent the .pte size from blowing up. `kv_cache_pos` is specific enough that it probably will never appear as the name of another buffer.
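For illustration, here is a minimal sketch of the kind of name-based matching this guards against. The module, helper, and matching logic below are assumptions for illustration only, not the actual ExecuTorch code:

```python
import torch


class CacheModule(torch.nn.Module):
    """Hypothetical module holding a kv-cache position buffer."""

    def __init__(self, max_seq_len: int):
        super().__init__()
        # A specific name like "kv_cache_pos" means a substring match on
        # the buffer name cannot accidentally collide with a buffer
        # registered by some other TorchTune module.
        self.register_buffer("kv_cache_pos", torch.arange(max_seq_len))


def is_kv_cache_pos(buffer_name: str) -> bool:
    # Illustrative export-time check: buffers matched here would be
    # treated as mutable cache state rather than constants serialized
    # into the .pte, which is what keeps the file size from blowing up.
    return "kv_cache_pos" in buffer_name
```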
Test plan
python -m examples.models.llama.export_llama --model llama3_2_vision --checkpoint /tmp/Llama-3.2-11B-Vision-Instruct/original/consolidated.pth --params examples/models/llama3_2_vision/text_decoder/params/demo_config.json --metadata '{"append_eos_to_prompt": 0, "get_bos_id":128000, "get_eos_ids":[128009, 128001], "get_n_bos": 0, "get_n_eos": 0}' --output_name="llama3_2_vision.pte" -d fp32 --verbose --max_seq_length 64 -kv
python -m examples.models.llama3_2_vision.runner.native --model llama3_2_vision --pte llama3_2_vision.pte --tokenizer /tmp/Llama-3.2-11B-Vision-Instruct/original/tokenizer.model --prompt "Who is the founder of Meta?" --params examples/models/llama3_2_vision/text_decoder/params/demo_config.json --max_len 64 --temperature 0 -kv
Note: I'll add a CI test soon 😅