LLaVa-NeXT needs tweaking for v4.47 #106

Open
jrp2014 opened this issue Oct 25, 2024 · 0 comments

jrp2014 commented Oct 25, 2024

I know you consider LLaVA superseded, but I think it's still pretty good for captioning.

When I run your example script with mlx-community/llava-v1.6-34b-8bit, it prints this warning:

Expanding inputs for image tokens in LLaVa-NeXT should be done in processing. Please add `patch_size` and `vision_feature_select_strategy` to the model's processing config or set directly with `processor.patch_size = {{patch_size}}` and processor.vision_feature_select_strategy = {{vision_feature_select_strategy}}`. Using processors without these attributes in the config is deprecated and will throw an error in v4.47.

I have no idea what this means, but it'd be great if mlx-vlm could be tweaked to give the model what it wants.
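
As far as I can tell, the warning is asking for two attributes to be present on the processor before inputs are prepared. Here is a minimal sketch of what that might look like, assuming the values can be read from the model config as `config.vision_config.patch_size` and `config.vision_feature_select_strategy`. The helper name `ensure_image_token_attrs` is hypothetical, and the stand-in objects and the values `14` and `"default"` are placeholders so the snippet runs without downloading a model. I have not checked how mlx-vlm actually loads its processor.

```python
from types import SimpleNamespace


def ensure_image_token_attrs(processor, config):
    """Copy the two attributes named in the warning onto the processor if unset.

    Hypothetical helper: assumes `config` exposes `vision_config.patch_size`
    and `vision_feature_select_strategy`.
    """
    if getattr(processor, "patch_size", None) is None:
        processor.patch_size = config.vision_config.patch_size
    if getattr(processor, "vision_feature_select_strategy", None) is None:
        processor.vision_feature_select_strategy = (
            config.vision_feature_select_strategy
        )
    return processor


# Stand-in objects (assumptions, not real model data). In practice the
# processor and config would be whatever the loading code returns for
# the checkpoint.
demo_config = SimpleNamespace(
    vision_config=SimpleNamespace(patch_size=14),  # placeholder value
    vision_feature_select_strategy="default",      # placeholder value
)
demo_processor = SimpleNamespace(patch_size=None)

ensure_image_token_attrs(demo_processor, demo_config)
print(demo_processor.patch_size, demo_processor.vision_feature_select_strategy)
```

Values already set on the processor are left alone, so calling this after loading should be harmless if the processing config is later updated to include them.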
