Which specific models work with this framework? #80

jrp2014 · 2024-10-11T20:44:59Z

This is a nice framework to use for image analysis / captioning, etc.

Is there a doc somewhere that sets out which models, specifically can be driven through this app/library? When you say "Pixtral", eg, which of the versions should work (without further conversion, on what size of machine)?

I know that you say that Lava is no longer state of the art, but what is better?

Thanks.

Otherwise I get errors like

(mlx) ➜  mlx_vlm git:(main) ✗ python mytest.py
Fetching 3 files: 100%|█████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 6772.29it/s]
ERROR:root:Config file not found in /Users/jrp/.cache/huggingface/hub/models--mistralai--Pixtral-12B-2409/snapshots/df119bf36c0cedc6ffdc9ca6c58ebf51f9771ef7
Traceback (most recent call last):
  File "/Users/zzz/Documents/AI/mlx/mlx-vlm/mlx_vlm/mytest.py", line 12, in <module>
    model, processor = load(model_path)
                       ^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/mlx_vlm/utils.py", line 251, in load
    model = load_model(model_path, lazy)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/mlx_vlm/utils.py", line 116, in load_model
    config = load_config(model_path)
             ^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/mlx_vlm/utils.py", line 268, in load_config
    with open(model_path / "config.json", "r") as f:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: '/Users/zzz/.cache/huggingface/hub/models--mistralai--Pixtral-12B-2409/snapshots/df119bf36c0cedc6ffdc9ca6c58ebf51f9771ef7/config.json'

The text was updated successfully, but these errors were encountered:

Blaizzy · 2024-10-12T14:42:03Z

@jrp2014 good question!

In general you can find the correct models in the mlx-community repo. They are usually converted and uploaded there before the release.

We currently support the Pixtral version from the mistral-community. This version is formatted like llava.

https://huggingface.co/mistral-community/pixtral-12b

jrp2014 · 2024-10-12T17:11:23Z

Thanks. I don't find the search function on hugging face particularly easy to use.

jrp2014 · 2024-10-12T18:15:52Z

Not sure what's going wrong here:

import mlx.core as mx
from mlx_vlm import load, generate
from mlx_vlm.prompt_utils import apply_chat_template
from mlx_vlm.utils import load_config

# Load the model
model_path = "mistral-community/pixtral-12b"
model, processor = load(model_path)
config = load_config(model_path)

# Prepare input
image = ["http://images.cocodataset.org/val2017/000000039769.jpg"]
prompt = "Describe this image."

# Apply chat template
formatted_prompt = apply_chat_template(
    processor, config, prompt, num_images=len(image)
)

# Generate output
output = generate(model, processor, image, formatted_prompt, verbose=False)
print(output)

results in

Fetching 15 files: 100%|█████████████████████████████████████████████████████████| 15/15 [00:00<00:00, 28688.81it/s]
Traceback (most recent call last):
  File "/Users/jrp/Documents/AI/mlx/mlx-vlm/mlx_vlm/mytest3.py", line 8, in <module>
    model, processor = load(model_path)
                       ^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/mlx_vlm/utils.py", line 251, in load
    model = load_model(model_path, lazy)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/mlx_vlm/utils.py", line 189, in load_model
    model = model_class.Model(model_config)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/mlx_vlm/models/llava/llava.py", line 61, in __init__
    self.vision_tower = VisionModel(config.vision_config)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/mlx_vlm/models/llava/vision.py", line 232, in __init__
    raise ValueError(f"Unsupported model type: {self.model_type}")
ValueError: Unsupported model type: pixtral

This is been run from the latest mlx_vlm directory

Blaizzy · 2024-10-12T18:55:32Z

Install from source.

I recently merged a PR fixing all the bugs

jrp2014 · 2024-10-12T18:56:29Z

yes, that's what I am doing.

Blaizzy · 2024-10-12T18:57:16Z

pip install git+https://github.com/Blaizzy/mlx-vlm.git

Blaizzy · 2024-10-12T19:50:49Z

Uninstall and reinstall from source.

It seems you have an older version.

Check the version you have installed.

Blaizzy · 2024-10-12T19:51:47Z

Let me know if the issue persists with version 0.1.0

jrp2014 · 2024-10-12T22:20:40Z

Is there a way of checking what version is being run from the python script?

Successfully built mlx-vlm
Installing collected packages: mlx-vlm
Successfully installed mlx-vlm-0.1.0

Fails as above.

Blaizzy · 2024-10-12T22:27:53Z

Try

pip list | grep mlx

Blaizzy · 2024-10-12T22:32:07Z

Can you try to run this in your terminal

python -m mlx_vlm.generate --model mistral-community/pixtral-12b --max-tokens 100 --temp 0.0 --prompt
 "What animal is this?"

jrp2014 · 2024-10-12T23:18:59Z

Still no, go, I'm afraid. No doubt it is something about my setup, but I can't see what it could be; it's built straight from a clone of your GitHub repository.

python -m mlx_vlm.generate --model mistral-community/pixtral-12b --max-tokens 100 --temp 0.0 --prompt 'What animal is this?'
Fetching 15 files: 100%|█████████████████████████████████████████████████████████| 15/15 [00:00<00:00, 35226.52it/s]
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/mlx_vlm/generate.py", line 96, in <module>
    main()
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/mlx_vlm/generate.py", line 73, in main
    model, processor, image_processor, config = get_model_and_processors(
                                                ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/mlx_vlm/generate.py", line 61, in get_model_and_processors
    model, processor = load(
                       ^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/mlx_vlm/utils.py", line 251, in load
    model = load_model(model_path, lazy)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/mlx_vlm/utils.py", line 189, in load_model
    model = model_class.Model(model_config)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/mlx_vlm/models/llava/llava.py", line 61, in __init__
    self.vision_tower = VisionModel(config.vision_config)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/mlx_vlm/models/llava/vision.py", line 232, in __init__
    raise ValueError(f"Unsupported model type: {self.model_type}")
ValueError: Unsupported model type: pixtral

Blaizzy · 2024-10-13T01:46:24Z

Please share the result of

pip list | grep mlx

jrp2014 · 2024-10-13T09:45:52Z

lightning-whisper-mlx     0.0.10
mlx                       0.18.1.dev20241011+c21331d4
mlx-data                  0.0.2
mlx-lm                    0.19.1
mlx-vlm                   0.1.0
mlx-whisper               0.3.0

Blaizzy · 2024-10-13T10:08:12Z

Try this model and let me know if the issue persists.

mlx-community/pixtral-12b-8bit

Blaizzy · 2024-10-13T10:11:17Z

Something doesn't add up because your logs are saying the model is loading using llava arch instead of pixtral.

Blaizzy · 2024-10-13T10:16:38Z

I will give it a look.

jrp2014 · 2024-10-13T12:24:24Z

Try this model and let me know if the issue persists.

mlx-community/pixtral-12b-8bit

Well this one doesn't just crash out, but it just spins, without producing an answer, either from the command line or via the script above.

python -m mlx_vlm.generate --model mlx-community/pixtral-12b-8bit --max-tokens 100 --temp 0.0

Fetching 11 files: 100%|█████████████████████████████████████████████████████████| 11/11 [00:00<00:00, 44706.73it/s]
==========
Image: ['http://images.cocodataset.org/val2017/000000039769.jpg'] 

Prompt: <s>[INST]What are these?[IMG][/INST]

Blaizzy · 2024-10-13T12:46:46Z

Found the issue!

This version points to llava in the model config. I patched it locally.

Don't worry, I will add a condition to fix this at load time.

https://huggingface.co/mistral-community/pixtral-12b/blob/main/config.json

Blaizzy · 2024-10-13T12:48:54Z

Well this one doesn't just crash out, but it just spins, without producing an answer, either from the command line or via the script above.

What are the specs of your machine?

Try to pass --resize-shape 128 128 or --resize-shape 224 224

Blaizzy · 2024-10-13T12:51:12Z

Also try the 4bit version instead of the 8bit.

mlx-community/pixtral-12b-4bit

Blaizzy · 2024-10-13T14:53:10Z

Found the issue!
This version points to llava in the model config. I patched it locally.
Don't worry, I will add a condition to fix this at load time.

On second thought, I don't think it's a good idea to add a condition for one model.

You can use all models already converted in mlx-community repo (4bit, 8bit and bf16). Otherwise, to use the mistral-community model, you just have to change the config.json model_type from llava to pixtral.

jrp2014 · 2024-10-13T17:25:16Z

OK. Thanks. It'd be good to document some of these points up front as the connection between the model names used here and the various hugging face repositories is a little tenuous, for new users.

Blaizzy · 2024-10-13T18:03:53Z

Could you help me with that ?

Also, perhaps adding a way to scan for models on the mlx-community based on names ?

jrp2014 · 2024-10-18T20:24:47Z

Sorry, but the models are too big for me to download and test comprehensively. I suggest that when you put up a new model type you give an example of the model that you used to test the addition. Also you could just point to the hugging face models that you have put up.

(My setup now seems to work again, staring from a fresh clone. Perhaps I shouldn't use iCloud to transfer my files between machines.)

But with the Mistral repo, which now has a config file, when replacing the model_type with llava, I still get

> python mytest.py
Fetching 15 files: 100%|█████████████████████████████████████████████████████████| 15/15 [00:00<00:00, 15352.50it/s]
Traceback (most recent call last):
  File "/Users/xxx/Documents/AI/mlx/scripts/vlm/mytest.py", line 19, in <module>
    model, processor = load(model_path)
                       ^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/mlx_vlm/utils.py", line 251, in load
    model = load_model(model_path, lazy)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/mlx_vlm/utils.py", line 189, in load_model
    model = model_class.Model(model_config)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/mlx_vlm/models/llava/llava.py", line 61, in __init__
    self.vision_tower = VisionModel(config.vision_config)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/mlx_vlm/models/llava/vision.py", line 232, in __init__
    raise ValueError(f"Unsupported model type: {self.model_type}")
ValueError: Unsupported model type: llava

Blaizzy self-assigned this Oct 16, 2024

Blaizzy added the enhancement New feature or request label Oct 16, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Which specific models work with this framework? #80

Which specific models work with this framework? #80

jrp2014 commented Oct 11, 2024 •

edited

Loading

Blaizzy commented Oct 12, 2024

jrp2014 commented Oct 12, 2024

jrp2014 commented Oct 12, 2024 •

edited

Loading

Blaizzy commented Oct 12, 2024

jrp2014 commented Oct 12, 2024

Blaizzy commented Oct 12, 2024

Blaizzy commented Oct 12, 2024

Blaizzy commented Oct 12, 2024

jrp2014 commented Oct 12, 2024

Blaizzy commented Oct 12, 2024

Blaizzy commented Oct 12, 2024

jrp2014 commented Oct 12, 2024

Blaizzy commented Oct 13, 2024

jrp2014 commented Oct 13, 2024

Blaizzy commented Oct 13, 2024

Blaizzy commented Oct 13, 2024 •

edited

Loading

Blaizzy commented Oct 13, 2024

jrp2014 commented Oct 13, 2024

Blaizzy commented Oct 13, 2024

Blaizzy commented Oct 13, 2024

Blaizzy commented Oct 13, 2024

Blaizzy commented Oct 13, 2024 •

edited

Loading

jrp2014 commented Oct 13, 2024

Blaizzy commented Oct 13, 2024

jrp2014 commented Oct 18, 2024 •

edited

Loading

Which specific models work with this framework? #80

Which specific models work with this framework? #80

Comments

jrp2014 commented Oct 11, 2024 • edited Loading

Blaizzy commented Oct 12, 2024

jrp2014 commented Oct 12, 2024

jrp2014 commented Oct 12, 2024 • edited Loading

Blaizzy commented Oct 12, 2024

jrp2014 commented Oct 12, 2024

Blaizzy commented Oct 12, 2024

Blaizzy commented Oct 12, 2024

Blaizzy commented Oct 12, 2024

jrp2014 commented Oct 12, 2024

Blaizzy commented Oct 12, 2024

Blaizzy commented Oct 12, 2024

jrp2014 commented Oct 12, 2024

Blaizzy commented Oct 13, 2024

jrp2014 commented Oct 13, 2024

Blaizzy commented Oct 13, 2024

Blaizzy commented Oct 13, 2024 • edited Loading

Blaizzy commented Oct 13, 2024

jrp2014 commented Oct 13, 2024

Blaizzy commented Oct 13, 2024

Blaizzy commented Oct 13, 2024

Blaizzy commented Oct 13, 2024

Blaizzy commented Oct 13, 2024 • edited Loading

jrp2014 commented Oct 13, 2024

Blaizzy commented Oct 13, 2024

jrp2014 commented Oct 18, 2024 • edited Loading

jrp2014 commented Oct 11, 2024 •

edited

Loading

jrp2014 commented Oct 12, 2024 •

edited

Loading

Blaizzy commented Oct 13, 2024 •

edited

Loading

Blaizzy commented Oct 13, 2024 •

edited

Loading

jrp2014 commented Oct 18, 2024 •

edited

Loading