
do_picture_description doesn't work with local HuggingFaceTB--SmolVLM-256M-Instruct #204

Closed
@khabibulloevm

Hi @dolfim-ibm, I'm trying to use do_picture_description with a locally downloaded model (HuggingFaceTB--SmolVLM-256M-Instruct) via the /v1alpha/convert/file/async endpoint.

I'm passing the parameters as:

"do_picture_description": true,
"picture_description_local": {
  "repo_id": "/docling-models/HuggingFaceTB--SmolVLM-256M-Instruct"
}

However, I consistently receive a 422 error:

Input should be a valid dictionary or object to extract fields from
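One possible reading of this 422, offered as an assumption rather than a confirmed diagnosis: if the file endpoint receives its options as multipart form fields, every value arrives as a flat string, so a nested object such as picture_description_local would have to be sent as a JSON-encoded string. The minimal sketch below only builds such fields and sends nothing; `build_form_fields` is a hypothetical helper, not part of docling-serve.

```python
import json

# Path copied from the request shown above.
LOCAL_MODEL_PATH = "/docling-models/HuggingFaceTB--SmolVLM-256M-Instruct"


def build_form_fields(repo_id: str) -> dict[str, str]:
    """Flatten the conversion options into string-valued form fields.

    Assumption (not confirmed by this issue): the file endpoint takes its
    options as multipart form fields, so the nested object is serialized
    to a JSON string instead of being passed as a dictionary.
    """
    return {
        "do_picture_description": "true",
        "picture_description_local": json.dumps({"repo_id": repo_id}),
    }


fields = build_form_fields(LOCAL_MODEL_PATH)
print(fields["picture_description_local"])
```

These fields could then be attached to the multipart request alongside the uploaded file by whichever HTTP client is in use. Whether the server accepts this form is exactly what the questions below ask the maintainers to confirm.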

Before calling the endpoint, I downloaded all required models using docling-tools models download -o /docling-models, and I’m launching the API with:

docling-serve run --artifacts-path /docling-models

Here are the model folders I have under /docling-models:

[Image: screenshot of the model folders under /docling-models]

I saw in another issue that do_picture_description might not yet be fully supported in /v1alpha/convert/file/async.

Could you please confirm:

1. Whether do_picture_description is supported for this endpoint?
2. Whether using a local model for picture_description_local should work in this context?
3. If not, is there an alternative endpoint or method recommended for image captioning with local models?

Thanks in advance for the clarification!
