
feat: pass an image as part of the evaluation #88

Open
giladgd opened this issue Nov 5, 2023 · 11 comments
Labels: new feature (New feature or request), roadmap (Part of the roadmap for node-llama-cpp: https://github.com/orgs/withcatai/projects/1)

Comments

@giladgd
Contributor

giladgd commented Nov 5, 2023

This will be implemented once llama.cpp's support for it is stable.
Hopefully, there will be an official API for this after ggml-org/llama.cpp#11292 is implemented.

@giladgd giladgd self-assigned this Nov 5, 2023
@giladgd giladgd converted this from a draft issue Nov 5, 2023
@giladgd giladgd added new feature New feature or request roadmap Part of the roadmap for node-llama-cpp (https://github.com/orgs/withcatai/projects/1) labels Nov 5, 2023
@samlhuillier

Interested in this kind of multimodal support. Any update on progress?

@fozziethebeat

Does this encompass adding support for running LLaVA models, or should that be a separate feature request? I noticed that llama-cpp-python already includes LLaVA support from llama.cpp, so this shouldn't be too hard once the bindings are set up.

@giladgd
Contributor Author

giladgd commented Dec 3, 2023

I haven't started working on this yet, but it is planned as part of the roadmap.
The plan is to add support for llama.cpp's ability to pass an image to a model, which currently supports only LLaVA.

I'll work on this once llama.cpp's API for it is final, to avoid frequent breaking API changes (unlike what happens in some other libraries).
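
To illustrate the direction, here's a purely hypothetical sketch of what passing an image alongside a prompt could eventually look like. node-llama-cpp has no image API yet; the `images` option below is an assumed name, not a real export, and the final API may look entirely different:

```typescript
// Hypothetical sketch only — the image-related parts are NOT part of node-llama-cpp's API yet.
import {getLlama, LlamaChatSession} from "node-llama-cpp";
import fs from "node:fs/promises";

const llama = await getLlama();
const model = await llama.loadModel({modelPath: "path/to/llava-model.gguf"});
const context = await model.createContext();
const session = new LlamaChatSession({contextSequence: context.getSequence()});

// Assumed option name: `images`. The actual shape will depend on
// llama.cpp's finalized multimodal API.
const image = await fs.readFile("photo.jpg");
const answer = await session.prompt("What is in this picture?", {
    images: [image]
});
console.log(answer);
```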

@fozziethebeat

Makes sense. Hopefully llama.cpp finalizes that API.

@AlexTech314

Any update on this? Would love to leverage multimodal models! Love the library so far :)

@giladgd
Contributor Author

giladgd commented Mar 28, 2025

An official API for this is in active development in llama.cpp; once it's ready, we can start working on adding support for it in node-llama-cpp.
It appears that the API being worked on will also include support for other modalities, such as audio and video, so this is going to be a major feature once it lands (node-llama-cpp will support all of them).

@AlexTech314

Beautiful. Looking forward to it, this library is insane.

@wisng

wisng commented Apr 18, 2025

Any updates on this feature? It seems there was some experimental support for Gemma 3 vision last week: ggml-org/llama.cpp#12344

@giladgd
Contributor Author

giladgd commented May 9, 2025

@wisng I'm waiting for an official stable API for this, which is still in the works.

@Mihailoff

🔥 Multimodal support arrived in llama-server: ggml-org/llama.cpp#12898 | documentation

Perhaps we can't call it stable, but it is there now.
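
Until native support lands in node-llama-cpp, a minimal sketch of sending an image to llama-server's OpenAI-compatible endpoint from Node.js. This assumes the server was started with a multimodal model and its projector (e.g. `llama-server -m model.gguf --mmproj mmproj.gguf`) as described in the linked documentation, and that it's running locally on the default port:

```typescript
// Sketch: send an image to llama-server's OpenAI-compatible chat completions endpoint.
// Assumes multimodal support is enabled on the server (started with --mmproj).
import fs from "node:fs/promises";

const imageBase64 = (await fs.readFile("photo.jpg")).toString("base64");

const res = await fetch("http://localhost:8080/v1/chat/completions", {
    method: "POST",
    headers: {"Content-Type": "application/json"},
    body: JSON.stringify({
        messages: [{
            role: "user",
            content: [
                {type: "text", text: "What is in this picture?"},
                {type: "image_url", image_url: {url: `data:image/jpeg;base64,${imageBase64}`}}
            ]
        }]
    })
});

const data = await res.json();
console.log(data.choices[0].message.content);
```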

@giladgd
Contributor Author

giladgd commented May 16, 2025

I've started poking around with the mtmd API to integrate multimodality into node-llama-cpp.
There have been too many breaking changes around it recently, so I'll wait a bit longer before spending more time on the integration, but it's coming up!
Can't commit to a timeline yet, but I'll release a few beta versions with it before a stable release to gather feedback and iron out any bugs.
