
Refactor utils #1 #161

Merged 31 commits into main on Dec 30, 2024
Conversation

@Blaizzy (Owner) commented Dec 26, 2024

This PR is the first of many to clean up the code base and simplify it.

It will make it easier to add new features such as image/video feature caching, KV cache quantization, and more.

⚠️ This is a significant change; please feel free to open an issue if this PR breaks your workflow or model inference.

Closes #160
Closes #94
Closes #135
Closes #144

@Blaizzy merged commit 78920b0 into main on Dec 30, 2024
1 check passed
@jrp2014 commented Dec 30, 2024

Running my harness on your latest release:

# from mlx import version as mlx_version
from mlx_vlm import load, generate, version as vlm_version
from mlx_vlm.prompt_utils import apply_chat_template
from mlx_vlm.utils import load_config
import subprocess
import time
import psutil

# print("mlx version:", mlx_version())
print("mlx-vlm version:", vlm_version)

# List the locally cached models; drop the scan-cache table's two header
# lines and four trailing summary lines.
output = subprocess.check_output(
    ["/opt/homebrew/Caskroom/miniconda/base/envs/mlx/bin/huggingface-cli", "scan-cache"]
)
lines = output.decode("utf-8").split("\n")[2:-4]

for line in lines:
    print(80 * "v")
    model_path = line.split()[0]
    print("\033[1mRunning", model_path, "\033[0m")

    process = psutil.Process()
    mem_before = process.memory_info().rss

    try:
        # Load the model
        model, tokenizer = load(model_path)
        config = load_config(model_path)
    except Exception as e:
        print(f"Failed to load model at {model_path}: {e}")
        continue

    # Prepare input
    image = ["http://images.cocodataset.org/val2017/000000039769.jpg"]
    prompt = "Describe this image."

    # Apply chat template
    formatted_prompt = apply_chat_template(
        tokenizer, config, prompt, num_images=len(image)
    )

    # Generate output
    try:
        start_time = time.time()
        output = generate(model, tokenizer, image, formatted_prompt, max_tokens=500, verbose=True)
        end_time = time.time()
        print(output)
    except Exception as e:
        print(f"Failed to generate output for model at {model_path}: {e}")
        continue

    mem_after = process.memory_info().rss
    print(f"Output generated in {end_time - start_time:.2f}s")
    print(f"Memory used: {(mem_after - mem_before) / (1024 * 1024 * 1024):.2f} GB")

    print(80 * "^", end="\n\n")

I get:

python check_models.py
mlx-vlm version: <module 'mlx_vlm.version' from '/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/mlx_vlm/version.py'>
vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
Running HuggingFaceTB/SmolVLM-Instruct 
Fetching 12 files: 100%|█████████████████████████████████████████████████████████████| 12/12 [00:00<00:00, 26324.08it/s]
Failed to load model at HuggingFaceTB/SmolVLM-Instruct: Unsupported model type: idefics3_vision
vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
Running OpenGVLab/InternVL2_5-8B 
Fetching 21 files: 100%|█████████████████████████████████████████████████████████████| 21/21 [00:00<00:00, 13367.79it/s]
The repository for /Users/jrp/.cache/huggingface/hub/models--OpenGVLab--InternVL2_5-8B/snapshots/d64b85a1392275381ddbb7525db05e587303d59e contains custom code which must be executed to correctly load the model. You can inspect the repository content at https://hf.co//Users/jrp/.cache/huggingface/hub/models--OpenGVLab--InternVL2_5-8B/snapshots/d64b85a1392275381ddbb7525db05e587303d59e.
You can avoid this prompt in future by passing the argument `trust_remote_code=True`.

Do you wish to run the custom code? [y/N] ERROR:root:Model type internvl_chat not supported.
Failed to load model at OpenGVLab/InternVL2_5-8B: Model type internvl_chat not supported.
vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
Running cognitivecomputations/dolphin-2.9.2-qwen2-72b 
Fetching 40 files: 100%|██████████████████████████████████████████████████████████████| 40/40 [00:00<00:00, 5521.73it/s]
ERROR:root:Model type qwen2 not supported.
Failed to load model at cognitivecomputations/dolphin-2.9.2-qwen2-72b: Model type qwen2 not supported.
vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
Running distilbert/distilbert-base-uncased-finetuned-sst-2-english 
Fetching 10 files: 100%|██████████████████████████████████████████████████████████████| 10/10 [00:00<00:00, 5794.04it/s]
ERROR:root:Model type distilbert not supported.
Failed to load model at distilbert/distilbert-base-uncased-finetuned-sst-2-english: Model type distilbert not supported.
vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
Running google/siglip-so400m-patch14-384 
Fetching 6 files: 100%|████████████████████████████████████████████████████████████████| 6/6 [00:00<00:00, 13632.62it/s]
ERROR:root:Model type siglip not supported.
Failed to load model at google/siglip-so400m-patch14-384: Model type siglip not supported.
vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
Running meta-llama/Llama-3.2-11B-Vision-Instruct 
Fetching 15 files: 100%|█████████████████████████████████████████████████████████████| 15/15 [00:00<00:00, 16008.79it/s]
Fetching 15 files: 100%|█████████████████████████████████████████████████████████████| 15/15 [00:00<00:00, 22090.79it/s]
==========
Image: <|begin_of_text|><|start_header_id|>user<|end_header_id|>

Describe this image.<|image|><|eot_id|><|start_header_id|>assistant<|end_header_id|>

 

Prompt: ['http://images.cocodataset.org/val2017/000000039769.jpg']
Failed to generate output for model at meta-llama/Llama-3.2-11B-Vision-Instruct: TextEncodeInput must be Union[TextInputSequence, Tuple[InputSequence, InputSequence]]
vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
Running microsoft/Phi-3.5-mini-instruct 
Fetching 13 files: 100%|██████████████████████████████████████████████████████████████| 13/13 [00:00<00:00, 2792.62it/s]
ERROR:root:Model type phi3 not supported.
Failed to load model at microsoft/Phi-3.5-mini-instruct: Model type phi3 not supported.
vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
Running microsoft/Phi-3.5-vision-instruct 
Fetching 14 files: 100%|█████████████████████████████████████████████████████████████| 14/14 [00:00<00:00, 16293.08it/s]
The repository for /Users/jrp/.cache/huggingface/hub/models--microsoft--Phi-3.5-vision-instruct/snapshots/4a0d683eba9f1d0cbfb6151705d1ee73c25a80ca contains custom code which must be executed to correctly load the model. You can inspect the repository content at https://hf.co//Users/jrp/.cache/huggingface/hub/models--microsoft--Phi-3.5-vision-instruct/snapshots/4a0d683eba9f1d0cbfb6151705d1ee73c25a80ca.
You can avoid this prompt in future by passing the argument `trust_remote_code=True`.

Do you wish to run the custom code? [y/N] y
The repository for /Users/jrp/.cache/huggingface/hub/models--microsoft--Phi-3.5-vision-instruct/snapshots/4a0d683eba9f1d0cbfb6151705d1ee73c25a80ca contains custom code which must be executed to correctly load the model. You can inspect the repository content at https://hf.co//Users/jrp/.cache/huggingface/hub/models--microsoft--Phi-3.5-vision-instruct/snapshots/4a0d683eba9f1d0cbfb6151705d1ee73c25a80ca.
You can avoid this prompt in future by passing the argument `trust_remote_code=True`.

Do you wish to run the custom code? [y/N] y
/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/transformers/models/auto/image_processing_auto.py:524: FutureWarning: The image_processor_class argument is deprecated and will be removed in v4.42. Please use `slow_image_processor_class`, or `fast_image_processor_class` instead
  warnings.warn(
Fetching 14 files: 100%|█████████████████████████████████████████████████████████████| 14/14 [00:00<00:00, 11255.56it/s]
The repository for /Users/jrp/.cache/huggingface/hub/models--microsoft--Phi-3.5-vision-instruct/snapshots/4a0d683eba9f1d0cbfb6151705d1ee73c25a80ca contains custom code which must be executed to correctly load the model. You can inspect the repository content at https://hf.co//Users/jrp/.cache/huggingface/hub/models--microsoft--Phi-3.5-vision-instruct/snapshots/4a0d683eba9f1d0cbfb6151705d1ee73c25a80ca.
You can avoid this prompt in future by passing the argument `trust_remote_code=True`.

Do you wish to run the custom code? [y/N] y
==========
Image: <|user|>
<|image_1|>Describe this image.<|end|>
<|assistant|>
 

Prompt: ['http://images.cocodataset.org/val2017/000000039769.jpg']
Failed to generate output for model at microsoft/Phi-3.5-vision-instruct: TextEncodeInput must be Union[TextInputSequence, Tuple[InputSequence, InputSequence]]

I think you default trust_remote_code to True (which I think should be more cautious, but even so it doesn't seem to suppress the prompts).

But more fundamentally, the TextEncodeInput must be Union[TextInputSequence, Tuple[InputSequence, InputSequence]] error breaks models that previously worked.

@sachinraja13

Facing the same problem as above with 0.1.7.

@Blaizzy (Owner, Author) commented Dec 30, 2024

Hey @jrp2014 and @sachinraja13

Here are the changes that you need to make to your script:

  1. load_model and load_config now take kwargs, which need to include trust_remote_code, just like in transformers. This is because you may have other configuration options to set.
  2. The generate, stream_generate and generate_step arguments have changed slightly; see the minimal before/after just below and the full script.
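
For reference, here is the one call that actually changes, using the same variable names as the two scripts in this thread (a minimal sketch of the diff, not the full API):

# Old call, as in the failing harness above:
# output = generate(model, tokenizer, image, formatted_prompt, max_tokens=500, verbose=True)

# New call: the prompt now comes before the image list, and
# trust_remote_code is forwarded through load/load_config:
model, tokenizer = load(model_path, trust_remote_code=True)
config = load_config(model_path, trust_remote_code=True)
output = generate(model, tokenizer, formatted_prompt, image, max_tokens=500, verbose=True)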

Script ✅

# from mlx import version as mlx_version
from mlx_vlm import load, generate, __version__ as vlm_version
from mlx_vlm.prompt_utils import apply_chat_template
from mlx_vlm.utils import load_config
import mlx.core as mx
import subprocess
import time
import psutil

# print("mlx version:", mlx_version())
print("mlx-vlm version:", vlm_version)

for model_path in [
    "mlx-community/nanoLLaVA-1.5-4bit",
    "mlx-community/Phi-3.5-vision-instruct-4bit",
    "mlx-community/Qwen2-VL-2B-Instruct-4bit",
    "HuggingFaceTB/SmolVLM-Instruct",
    "mlx-community/Llama-3.2-11B-Vision-Instruct-4bit",
    "mlx-community/idefics2-8b-4bit"
]:
    print(80 * "v")
    print("\033[1mRunning", model_path, "\033[0m")

    process = psutil.Process()
    mem_before = process.memory_info().rss

    try:
        # Load the model
        trust_remote_code = True
        model, tokenizer = load(model_path, trust_remote_code=trust_remote_code)
        config = load_config(model_path, trust_remote_code=trust_remote_code)
    except Exception as e:
        print(f"Failed to load model at {model_path}: {e}")
        continue

    # Prepare input
    image = ["http://images.cocodataset.org/val2017/000000039769.jpg"]
    prompt = "Describe this image."

    # Apply chat template
    formatted_prompt = apply_chat_template(
        tokenizer, config, prompt, num_images=len(image)
    )

    # Generate output
    try:
        start_time = time.time()
        output = generate(model, tokenizer, formatted_prompt, image, verbose=True, max_tokens=500)
        end_time = time.time()
        print(output)
    except Exception as e:
        print(f"Failed to generate output for model at {model_path}: {e}")
        continue

    mem_after = process.memory_info().rss
    print(f"Output generated in {end_time - start_time:.2f}s")
    print(f"Memory used: {(mem_after - mem_before) / (1024 * 1024 * 1024):.2f} GB")

    print(80 * "^", end="\n\n")
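    # Release the model before loading the next one: drop the Python
    # references and clear MLX's cached Metal buffers.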
    del model, tokenizer
    mx.metal.clear_cache()

Output

mlx-vlm version: 0.1.7
vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
Running mlx-community/nanoLLaVA-1.5-4bit 
Fetching 11 files: 100%|██████████| 11/11 [00:50<00:00,  4.59s/it]
Fetching 11 files: 100%|██████████| 11/11 [00:00<00:00, 219701.64it/s]
==========
Image: ['http://images.cocodataset.org/val2017/000000039769.jpg'] 

Prompt: <|im_start|>system
Answer the questions.<|im_end|><|im_start|>user
<image>
Describe this image.<|im_end|><|im_start|>assistant

The image shows a close-up view of two cats lying down on a pink fabric surface. Both cats have a striped pattern on their fur, with the one on the left having a darker shade of brown, and the one on the right having a lighter shade of brown. They are positioned in such a way that the left cat is facing the camera, while the right cat is looking away. The cats are lying on their stomachs, and the fabric surface is slightly wrinkled. The image has a sepia tone, which gives it a vintage or antique look. There are no texts or other objects in the image. The style of the image is a straightforward, candid photograph, capturing a moment of relaxation for the cats.
==========
Prompt: 21 tokens, 75.857 tokens-per-sec
Generation: 146 tokens, 143.235 tokens-per-sec
Peak memory: 1.408 GB
The image shows a close-up view of two cats lying down on a pink fabric surface. Both cats have a striped pattern on their fur, with the one on the left having a darker shade of brown, and the one on the right having a lighter shade of brown. They are positioned in such a way that the left cat is facing the camera, while the right cat is looking away. The cats are lying on their stomachs, and the fabric surface is slightly wrinkled. The image has a sepia tone, which gives it a vintage or antique look. There are no texts or other objects in the image. The style of the image is a straightforward, candid photograph, capturing a moment of relaxation for the cats.
Output generated in 2.06s
Memory used: 0.78 GB
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
Running mlx-community/Phi-3.5-vision-instruct-4bit 
Fetching 12 files: 100%|██████████| 12/12 [00:00<00:00, 80530.64it/s]
/opt/homebrew/Caskroom/miniconda/base/envs/mlx_code/lib/python3.11/site-packages/transformers/models/auto/image_processing_auto.py:524: FutureWarning: The image_processor_class argument is deprecated and will be removed in v4.42. Please use `slow_image_processor_class`, or `fast_image_processor_class` instead
  warnings.warn(
Fetching 12 files: 100%|██████████| 12/12 [00:00<00:00, 235194.62it/s]
==========
Image: ['http://images.cocodataset.org/val2017/000000039769.jpg'] 

Prompt: <|user|>
<|image_1|>Describe this image.<|end|>
<|assistant|>

The image shows two cats lying on a pink couch. The cat on the left is a tabby with a mix of dark and light stripes, while the cat on the right is a solid black cat. Both cats have their eyes closed, suggesting they are asleep. The couch has a pink cushion, and there are two remote controls on the couch.<|end|>
==========
Prompt: 771 tokens, 614.139 tokens-per-sec
Generation: 83 tokens, 33.288 tokens-per-sec
Peak memory: 3.704 GB
The image shows two cats lying on a pink couch. The cat on the left is a tabby with a mix of dark and light stripes, while the cat on the right is a solid black cat. Both cats have their eyes closed, suggesting they are asleep. The couch has a pink cushion, and there are two remote controls on the couch.<|end|>
Output generated in 4.51s
Memory used: 2.17 GB
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
Running mlx-community/Qwen2-VL-2B-Instruct-4bit 
Fetching 11 files: 100%|██████████| 11/11 [00:00<00:00, 186037.68it/s]
Fetching 11 files: 100%|██████████| 11/11 [00:00<00:00, 254902.45it/s]
==========
Image: ['http://images.cocodataset.org/val2017/000000039769.jpg'] 

Prompt: <|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
Describe this image.<|vision_start|><|image_pad|><|vision_end|><|im_end|>
<|im_start|>assistant

The image shows two cats lying on a pink blanket. The cat on the left is striped with a mix of black, brown, and white, and it is lying on its side with its head resting on the blanket. The cat on the right is also striped, with a mix of black, brown, and white, and it is lying on its back with its head resting on the blanket as well. Both cats appear to be resting or sleeping, and there are two remote controls placed on the blanket next to them.
==========
Prompt: 416 tokens, 732.923 tokens-per-sec
Generation: 105 tokens, 169.277 tokens-per-sec
Peak memory: 3.704 GB
The image shows two cats lying on a pink blanket. The cat on the left is striped with a mix of black, brown, and white, and it is lying on its side with its head resting on the blanket. The cat on the right is also striped, with a mix of black, brown, and white, and it is lying on its back with its head resting on the blanket as well. Both cats appear to be resting or sleeping, and there are two remote controls placed on the blanket next to them.
Output generated in 1.94s
Memory used: 0.58 GB
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
Running HuggingFaceTB/SmolVLM-Instruct 
Fetching 12 files: 100%|██████████| 12/12 [00:00<00:00, 151601.35it/s]
Some kwargs in processor config are unused and will not have any effect: image_seq_len. 
Fetching 12 files: 100%|██████████| 12/12 [00:00<00:00, 181049.09it/s]
==========
Image: ['http://images.cocodataset.org/val2017/000000039769.jpg'] 

Prompt: <|im_start|>User:<image>Describe this image.<end_of_utterance>
Assistant:
 Two cats are sleeping on a pink blanket.
==========
Prompt: 1195 tokens, 702.406 tokens-per-sec
Generation: 10 tokens, 78.259 tokens-per-sec
Peak memory: 6.007 GB
 Two cats are sleeping on a pink blanket.
Output generated in 2.68s
Memory used: 4.22 GB
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
Running mlx-community/Llama-3.2-11B-Vision-Instruct-4bit 
Fetching 9 files: 100%|██████████| 9/9 [00:00<00:00, 145187.45it/s]
Fetching 9 files: 100%|██████████| 9/9 [00:00<00:00, 235929.60it/s]
==========
Image: ['http://images.cocodataset.org/val2017/000000039769.jpg'] 

Prompt: <|begin_of_text|><|start_header_id|>user<|end_header_id|>

Describe this image.<|image|><|eot_id|><|start_header_id|>assistant<|end_header_id|>


The image shows two cats lying on a pink blanket, with two remote controls placed nearby. The cats are positioned in a way that suggests they are watching something on a television, and the remote controls are likely used to control the TV.

* Two cats:
	+ One cat is smaller and has a fluffy tail
	+ The other cat is larger and has a more mottled coat
	+ Both cats are lying on their sides, with their heads turned towards the TV
* Two remote controls:
	+ One remote control is placed near the smaller cat
	+ The other remote control is placed near the larger cat
	+ Both remote controls have a similar design and are likely used to control the TV
* A pink blanket:
	+ The blanket is a bright pink color
	+ It appears to be made of a soft, plush material
	+ The blanket is spread out on a surface, possibly a couch or a bed

Overall, the image suggests that the cats are enjoying a relaxing afternoon, watching something on TV and using the remote controls to control the program.
==========
Prompt: 15 tokens, 2.941 tokens-per-sec
Generation: 221 tokens, 6.223 tokens-per-sec
Peak memory: 16.252 GB
The image shows two cats lying on a pink blanket, with two remote controls placed nearby. The cats are positioned in a way that suggests they are watching something on a television, and the remote controls are likely used to control the TV.

* Two cats:
	+ One cat is smaller and has a fluffy tail
	+ The other cat is larger and has a more mottled coat
	+ Both cats are lying on their sides, with their heads turned towards the TV
* Two remote controls:
	+ One remote control is placed near the smaller cat
	+ The other remote control is placed near the larger cat
	+ Both remote controls have a similar design and are likely used to control the TV
* A pink blanket:
	+ The blanket is a bright pink color
	+ It appears to be made of a soft, plush material
	+ The blanket is spread out on a surface, possibly a couch or a bed

Overall, the image suggests that the cats are enjoying a relaxing afternoon, watching something on TV and using the remote controls to control the program.
Output generated in 41.39s
Memory used: 3.97 GB
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
Running mlx-community/idefics2-8b-4bit 
Fetching 11 files: 100%|██████████| 11/11 [00:00<00:00, 229539.02it/s]
Fetching 11 files: 100%|██████████| 11/11 [00:00<00:00, 169622.59it/s]
==========
Image: ['http://images.cocodataset.org/val2017/000000039769.jpg'] 

Prompt: User: Describe this image.<image><end_of_utterance>
Assistant:
Two house cats are laying on a bed with a pink comforter. They are using remote controllers as toys.<end_of_utterance>
==========
Prompt: 79 tokens, 140.361 tokens-per-sec
Generation: 26 tokens, 44.843 tokens-per-sec
Peak memory: 16.252 GB
Two house cats are laying on a bed with a pink comforter. They are using remote controllers as toys.<end_of_utterance>
Output generated in 1.91s
Memory used: 0.75 GB
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Note ⚠️:

I did find a tiny bug with SmolVLM-Instruct; it has been fixed in #164 and will be available in the next release today, after the tests clear.

@Blaizzy (Owner, Author) commented Dec 30, 2024

@jrp2014 and @sachinraja13 v0.1.8 is out with the fix for SmolVLM-Instruct 🚀
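
Upgrading the package via pip install -U mlx-vlm should pick up the new release.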

@sachinraja13

This is great, thank you so much @Blaizzy!

@Blaizzy (Owner, Author) commented Dec 30, 2024

My pleasure!

Happy new year in advance 🚀
