I know you consider LLaVA superseded, but I think it's still pretty good for captioning.
When I run your example script against mlx-community/llava-v1.6-34b-8bit, it warns:
Expanding inputs for image tokens in LLaVa-NeXT should be done in processing. Please add `patch_size` and `vision_feature_select_strategy` to the model's processing config or set directly with `processor.patch_size = {{patch_size}}` and `processor.vision_feature_select_strategy = {{vision_feature_select_strategy}}`. Using processors without these attributes in the config is deprecated and will throw an error in v4.47.
I have no idea what this means, but it'd be great if mlx-vlm could be tweaked to give the model what it wants.
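From reading the warning, it looks like the fix is to set two attributes on the processor right after loading, before running generation. Here is a minimal sketch, assuming mlx-vlm's `load()` returns a standard Hugging Face processor; the `14` and `"default"` values are assumptions based on the CLIP-ViT-L/14 vision tower that llava-v1.6 models typically ship with, so verify them against the model's own config before relying on this:

```python
from mlx_vlm import load

# Load the quantized model and its processor as usual.
model, processor = load("mlx-community/llava-v1.6-34b-8bit")

# Workaround sketch: set the attributes the deprecation warning asks for.
# Assumed values: a CLIP-ViT-L/14 tower implies patch_size=14, and LLaVA
# models conventionally use the "default" feature-select strategy.
if getattr(processor, "patch_size", None) is None:
    processor.patch_size = 14
if getattr(processor, "vision_feature_select_strategy", None) is None:
    processor.vision_feature_select_strategy = "default"
```

If that silences the warning, mlx-vlm could presumably do the same thing internally when these attributes are missing from the processing config.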