Deploying VITA-1.5 Multimodal Model with ExecuTorch #10757
cc: @larryliu0820 for MM
Hi @jordanqi, if you haven't joined our Discord channel, we would love to have you there :)
I haven't joined the Discord yet; please send me the channel link. Thanks!
Is this model from HF? @guangy10 may know.
If it can run with the desktop llama_runner out of the box, then it can run with the LlamaDemo Android app, but I'm not sure about the image-processing and prompt-format parts.
We support the Hugging Face tokenizer.json format right now.
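For reference, this is roughly what the Hugging Face tokenizer.json layout looks like. The field names follow the tokenizers library's serialization format, but the vocabulary and tokens below are invented for illustration; a real VITA-1.5 tokenizer.json is far larger and carries additional sections (normalizer, pre_tokenizer, etc.):

```python
import json

# Minimal, illustrative sketch of the tokenizer.json structure.
# The vocab entries here are made up; only the top-level shape matters.
tokenizer_json = {
    "version": "1.0",
    "added_tokens": [
        {"id": 0, "content": "<unk>", "special": True},
    ],
    "model": {
        "type": "BPE",
        "vocab": {"<unk>": 0, "hello": 1, "world": 2},
        "merges": [],
    },
}

# Round-trip through JSON, as a runner would when loading the file.
parsed = json.loads(json.dumps(tokenizer_json))
print(parsed["model"]["type"])  # BPE
```

A tokenizer consumer only needs to walk `model.vocab` and `added_tokens` to map strings to ids, which is why supporting this one file format covers many HF models at once.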
Hi @jordanqi, thanks for your interest. A lot of the features you are asking for are under development, so there might not be convenient ways to do them yet, but I believe with a bit of work you can make them work!
Yes, I would suggest taking a look at the Llava example for multimodal LLMs: https://github.com/pytorch/executorch/blob/main/examples/models/llava/export_llava.py. We are actively working on a more general API that works across models with the same architecture.
Again, please refer to the Llava example. We are open to taking a new model under the examples directory, as long as it works.
Most likely you can use the demo app for image prefill, but supporting audio will require a lot of work. Also keep in mind that right now the model and the runner are coupled, so it's really a case-by-case situation.
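On the prompt-format point: multimodal runners typically splice image embeddings into the sequence at a placeholder token in the text. A hedged sketch of Llava-style prompt assembly follows; the `<image>` placeholder and `USER:`/`ASSISTANT:` template are the common Llava convention and may well differ for VITA-1.5:

```python
IMAGE_PLACEHOLDER = "<image>"

def build_llava_prompt(user_text: str, with_image: bool = True) -> str:
    """Assemble a Llava-style chat prompt; during prefill the runner
    replaces the <image> placeholder with image embeddings."""
    image_part = f"{IMAGE_PLACEHOLDER}\n" if with_image else ""
    return f"USER: {image_part}{user_text} ASSISTANT:"

def split_on_image(prompt: str) -> list[str]:
    """Split the prompt into the text segments surrounding the image
    slot, mirroring how a runner interleaves text and image prefill."""
    return prompt.split(IMAGE_PLACEHOLDER)

prompt = build_llava_prompt("What is in this picture?")
print(prompt)
```

This is why the model and runner are coupled today: the runner must know the exact placeholder and chat template the model was trained with, so an audio modality would need both a new placeholder convention and new prefill logic.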
🚀 The feature, motivation and pitch
I’m trying to deploy a VITA-1.5 multimodal model (supports audio, vision, and text) using ExecuTorch.
The tokenizer is in the Hugging Face tokenizer.json format, and I'd like to ask:
cc @larryliu0820 @mergennachin @cccclai @helunwencser @jackzhxng