Which specific models work with this framework? #80

Open · jrp2014 opened this issue Oct 11, 2024 · 25 comments
Labels: enhancement (New feature or request)
jrp2014 commented Oct 11, 2024

This is a nice framework to use for image analysis / captioning, etc.

Is there a doc somewhere that sets out which models, specifically, can be driven through this app/library? When you say "Pixtral", for example, which of the versions should work (without further conversion, and on what size of machine)?

I know that you say that LLaVA is no longer state of the art, but what is better?

Thanks.

Otherwise I get errors like

(mlx) ➜  mlx_vlm git:(main) ✗ python mytest.py
Fetching 3 files: 100%|█████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 6772.29it/s]
ERROR:root:Config file not found in /Users/jrp/.cache/huggingface/hub/models--mistralai--Pixtral-12B-2409/snapshots/df119bf36c0cedc6ffdc9ca6c58ebf51f9771ef7
Traceback (most recent call last):
  File "/Users/zzz/Documents/AI/mlx/mlx-vlm/mlx_vlm/mytest.py", line 12, in <module>
    model, processor = load(model_path)
                       ^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/mlx_vlm/utils.py", line 251, in load
    model = load_model(model_path, lazy)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/mlx_vlm/utils.py", line 116, in load_model
    config = load_config(model_path)
             ^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/mlx_vlm/utils.py", line 268, in load_config
    with open(model_path / "config.json", "r") as f:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: '/Users/zzz/.cache/huggingface/hub/models--mistralai--Pixtral-12B-2409/snapshots/df119bf36c0cedc6ffdc9ca6c58ebf51f9771ef7/config.json'
Blaizzy (Owner) commented Oct 12, 2024

@jrp2014 good question!

In general, you can find the correct models in the mlx-community repo. They are usually converted and uploaded there before the release.

We currently support the Pixtral version from mistral-community, which is formatted like llava:

https://huggingface.co/mistral-community/pixtral-12b

jrp2014 (Author) commented Oct 12, 2024

Thanks. I don't find the search function on Hugging Face particularly easy to use.

jrp2014 (Author) commented Oct 12, 2024

Not sure what's going wrong here:

import mlx.core as mx
from mlx_vlm import load, generate
from mlx_vlm.prompt_utils import apply_chat_template
from mlx_vlm.utils import load_config

# Load the model
model_path = "mistral-community/pixtral-12b"
model, processor = load(model_path)
config = load_config(model_path)

# Prepare input
image = ["http://images.cocodataset.org/val2017/000000039769.jpg"]
prompt = "Describe this image."

# Apply chat template
formatted_prompt = apply_chat_template(
    processor, config, prompt, num_images=len(image)
)

# Generate output
output = generate(model, processor, image, formatted_prompt, verbose=False)
print(output)

results in

Fetching 15 files: 100%|█████████████████████████████████████████████████████████| 15/15 [00:00<00:00, 28688.81it/s]
Traceback (most recent call last):
  File "/Users/jrp/Documents/AI/mlx/mlx-vlm/mlx_vlm/mytest3.py", line 8, in <module>
    model, processor = load(model_path)
                       ^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/mlx_vlm/utils.py", line 251, in load
    model = load_model(model_path, lazy)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/mlx_vlm/utils.py", line 189, in load_model
    model = model_class.Model(model_config)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/mlx_vlm/models/llava/llava.py", line 61, in __init__
    self.vision_tower = VisionModel(config.vision_config)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/mlx_vlm/models/llava/vision.py", line 232, in __init__
    raise ValueError(f"Unsupported model type: {self.model_type}")
ValueError: Unsupported model type: pixtral

This is being run from the latest mlx_vlm directory.

Blaizzy (Owner) commented Oct 12, 2024

Install from source.

I recently merged a PR fixing all the bugs.

jrp2014 (Author) commented Oct 12, 2024

Yes, that's what I am doing.

Blaizzy (Owner) commented Oct 12, 2024

pip install git+https://github.com/Blaizzy/mlx-vlm.git

Blaizzy (Owner) commented Oct 12, 2024

Uninstall and reinstall from source.

It seems you have an older version.

Check the version you have installed.

Blaizzy (Owner) commented Oct 12, 2024

Let me know if the issue persists with version 0.1.0

jrp2014 (Author) commented Oct 12, 2024

Is there a way of checking which version is being run from the Python script?

Successfully built mlx-vlm
Installing collected packages: mlx-vlm
Successfully installed mlx-vlm-0.1.0

Fails as above.

Blaizzy (Owner) commented Oct 12, 2024

Try

pip list | grep mlx 
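
From inside a script, a minimal sketch that reads the installed distribution metadata (so it works even if the package doesn't export __version__):

from importlib.metadata import version

# Query the installed distribution's metadata for its version string.
print(version("mlx-vlm"))  # e.g. 0.1.0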

Blaizzy (Owner) commented Oct 12, 2024

Can you try to run this in your terminal?

python -m mlx_vlm.generate --model mistral-community/pixtral-12b --max-tokens 100 --temp 0.0 --prompt "What animal is this?"

jrp2014 (Author) commented Oct 12, 2024

Still no go, I'm afraid. No doubt it is something about my setup, but I can't see what it could be; it's built straight from a clone of your GitHub repository.

python -m mlx_vlm.generate --model mistral-community/pixtral-12b --max-tokens 100 --temp 0.0 --prompt 'What animal is this?'
Fetching 15 files: 100%|█████████████████████████████████████████████████████████| 15/15 [00:00<00:00, 35226.52it/s]
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/mlx_vlm/generate.py", line 96, in <module>
    main()
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/mlx_vlm/generate.py", line 73, in main
    model, processor, image_processor, config = get_model_and_processors(
                                                ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/mlx_vlm/generate.py", line 61, in get_model_and_processors
    model, processor = load(
                       ^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/mlx_vlm/utils.py", line 251, in load
    model = load_model(model_path, lazy)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/mlx_vlm/utils.py", line 189, in load_model
    model = model_class.Model(model_config)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/mlx_vlm/models/llava/llava.py", line 61, in __init__
    self.vision_tower = VisionModel(config.vision_config)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/mlx_vlm/models/llava/vision.py", line 232, in __init__
    raise ValueError(f"Unsupported model type: {self.model_type}")
ValueError: Unsupported model type: pixtral

Blaizzy (Owner) commented Oct 13, 2024

Please share the result of

pip list | grep mlx 

jrp2014 (Author) commented Oct 13, 2024

lightning-whisper-mlx     0.0.10
mlx                       0.18.1.dev20241011+c21331d4
mlx-data                  0.0.2
mlx-lm                    0.19.1
mlx-vlm                   0.1.0
mlx-whisper               0.3.0

Blaizzy (Owner) commented Oct 13, 2024

Try this model and let me know if the issue persists.

mlx-community/pixtral-12b-8bit

Blaizzy (Owner) commented Oct 13, 2024

Something doesn't add up, because your logs show the model being loaded with the llava architecture instead of pixtral.

Blaizzy (Owner) commented Oct 13, 2024

I will give it a look.

jrp2014 (Author) commented Oct 13, 2024

Try this model and let me know if the issue persists.

mlx-community/pixtral-12b-8bit

Well, this one doesn't crash out, but it just spins without producing an answer, either from the command line or via the script above.

python -m mlx_vlm.generate --model mlx-community/pixtral-12b-8bit --max-tokens 100 --temp 0.0

Fetching 11 files: 100%|█████████████████████████████████████████████████████████| 11/11 [00:00<00:00, 44706.73it/s]
==========
Image: ['http://images.cocodataset.org/val2017/000000039769.jpg'] 

Prompt: <s>[INST]What are these?[IMG][/INST]

Blaizzy (Owner) commented Oct 13, 2024

Found the issue!

This version points to llava in the model config. I patched it locally.

Don't worry, I will add a condition to fix this at load time.

https://huggingface.co/mistral-community/pixtral-12b/blob/main/config.json
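
Purely as an illustration of the kind of load-time condition meant here (a sketch, not the actual mlx-vlm code):

import json
from pathlib import Path

def load_patched_config(model_path: Path) -> dict:
    # Sketch of a load-time workaround: if the checkpoint labels itself "llava"
    # but the path clearly points to a Pixtral upload, remap the model type
    # before a model class is chosen.
    with open(model_path / "config.json") as f:
        config = json.load(f)
    if config.get("model_type") == "llava" and "pixtral" in str(model_path).lower():
        config["model_type"] = "pixtral"
    return config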

Blaizzy (Owner) commented Oct 13, 2024

Well this one doesn't just crash out, but it just spins, without producing an answer, either from the command line or via the script above.

What are the specs of your machine?

Try to pass --resize-shape 128 128 or --resize-shape 224 224
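
Alternatively, downscaling the image yourself before passing it in achieves much the same thing (a sketch using Pillow; the file names are placeholders):

from PIL import Image

img = Image.open("photo.jpg")       # placeholder input file
img.thumbnail((224, 224))           # shrink in place, keeping the aspect ratio
img.save("photo_small.jpg")
# Then pass ["photo_small.jpg"] to generate() instead of the original image.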

Blaizzy (Owner) commented Oct 13, 2024

Also try the 4bit version instead of the 8bit.

mlx-community/pixtral-12b-4bit

Blaizzy (Owner) commented Oct 13, 2024

Found the issue!
This version points to llava in the model config. I patched it locally.
Don't worry, I will add a condition to fix this at load time.

On second thought, I don't think it's a good idea to add a condition for one model.

You can use all the models already converted in the mlx-community repo (4-bit, 8-bit and bf16). Otherwise, to use the mistral-community model, you just have to change the model_type in config.json from llava to pixtral.
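
One way to make that edit without touching the shared Hugging Face cache is to materialise the repo into a local folder first (a sketch using huggingface_hub; the folder name is arbitrary):

import json
from pathlib import Path
from huggingface_hub import snapshot_download

# Download the repo into a plain local directory rather than the symlinked cache.
local_dir = Path(snapshot_download("mistral-community/pixtral-12b", local_dir="pixtral-12b"))

config_path = local_dir / "config.json"
config = json.loads(config_path.read_text())
config["model_type"] = "pixtral"  # the upstream upload says "llava"
config_path.write_text(json.dumps(config, indent=2))

# Then point load() at the patched folder instead of the hub repo id:
#   model, processor = load(str(local_dir))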

jrp2014 (Author) commented Oct 13, 2024

OK, thanks. It'd be good to document some of these points up front, as the connection between the model names used here and the various Hugging Face repositories is a little tenuous for new users.

Blaizzy (Owner) commented Oct 13, 2024

Could you help me with that?

Also, perhaps we could add a way to scan for models in mlx-community by name?
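
In the meantime, something close to that is already possible against the Hub API (a sketch with huggingface_hub; not a built-in mlx-vlm feature):

from huggingface_hub import list_models

# List mlx-community repos whose name matches a query string.
for m in list_models(author="mlx-community", search="pixtral"):
    print(m.id)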

Blaizzy self-assigned this Oct 16, 2024
Blaizzy added the enhancement label Oct 16, 2024
jrp2014 (Author) commented Oct 18, 2024

Sorry, but the models are too big for me to download and test comprehensively. I suggest that when you put up a new model type you give an example of the model that you used to test the addition. Also, you could just point to the Hugging Face models that you have put up.

(My setup now seems to work again, starting from a fresh clone. Perhaps I shouldn't use iCloud to transfer my files between machines.)

But with the Mistral repo, which now has a config file, when I replace the model_type with llava, I still get

> python mytest.py
Fetching 15 files: 100%|█████████████████████████████████████████████████████████| 15/15 [00:00<00:00, 15352.50it/s]
Traceback (most recent call last):
  File "/Users/xxx/Documents/AI/mlx/scripts/vlm/mytest.py", line 19, in <module>
    model, processor = load(model_path)
                       ^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/mlx_vlm/utils.py", line 251, in load
    model = load_model(model_path, lazy)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/mlx_vlm/utils.py", line 189, in load_model
    model = model_class.Model(model_config)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/mlx_vlm/models/llava/llava.py", line 61, in __init__
    self.vision_tower = VisionModel(config.vision_config)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/mlx_vlm/models/llava/vision.py", line 232, in __init__
    raise ValueError(f"Unsupported model type: {self.model_type}")
ValueError: Unsupported model type: llava
