GPU enabled Image fails on Azure Container App #208

Closed
@Fftis

I pulled the docling-serve-cu124 image, pushed it to an Azure Container App with GPU enabled, and tried to convert a file using the /v1alpha/convert/source endpoint.
The request returned a 404 error with this response body:

{
  "detail": "Task result not found. Please wait for a completion status."
}

To get more error information, I cloned the repo, added some debug messages, built another Docker image, and used it for the container app.
These are the errors I got:

--- Processing url endpoint...
Sources being passed to convert_documents: ['https://arxiv.org/pdf/2408.09869']
ERROR:docling_serve.engines.async_local.worker:Worker 1 failed to process job f9891b2b-56b0-407f-880c-e7a84205abb3: 500: [digital envelope routines] unsupported
ERROR:docling_serve.engines.async_local.worker:Traceback (most recent call last):
  File "/opt/app-root/src/docling_serve/response_preparation.py", line 136, in process_results
    conv_results = list(conv_results)
                   ^^^^^^^^^^^^^^^^^^
  File "/opt/app-root/lib64/python3.12/site-packages/docling/document_converter.py", line 243, in convert_all
    for conv_res in conv_res_iter:
                    ^^^^^^^^^^^^^
  File "/opt/app-root/lib64/python3.12/site-packages/docling/document_converter.py", line 265, in _convert
    for input_batch in chunkify(
                       ^^^^^^^^^
  File "/opt/app-root/lib64/python3.12/site-packages/docling/utils/utils.py", line 15, in chunkify
    for first in iterator:  # Take the first element from the iterator
                 ^^^^^^^^
  File "/opt/app-root/lib64/python3.12/site-packages/docling/datamodel/document.py", line 264, in docs
    yield InputDocument(
          ^^^^^^^^^^^^^^
  File "/opt/app-root/lib64/python3.12/site-packages/docling/datamodel/document.py", line 147, in __init__
    self._init_doc(backend, path_or_stream)
  File "/opt/app-root/lib64/python3.12/site-packages/docling/datamodel/document.py", line 183, in _init_doc
    self._backend = backend(self, path_or_stream=path_or_stream)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/app-root/lib64/python3.12/site-packages/docling/backend/docling_parse_v4_backend.py", line 152, in __init__
    self.dp_doc: PdfDocument = self.parser.load(path_or_stream=self.path_or_stream)
                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/app-root/lib64/python3.12/site-packages/docling_parse/pdf_parser.py", line 458, in load
    hasher = hashlib.md5()
             ^^^^^^^^^^^^^
_hashlib.UnsupportedDigestmodError: [digital envelope routines] unsupported

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/app-root/src/docling_serve/engines/async_local/worker.py", line 103, in loop
    response = await asyncio.to_thread(
               ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.12/asyncio/threads.py", line 25, in to_thread
    return await loop.run_in_executor(None, func_call)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.12/concurrent/futures/thread.py", line 59, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/app-root/src/docling_serve/engines/async_local/worker.py", line 74, in run_conversion
    response = process_results(
               ^^^^^^^^^^^^^^^^
  File "/opt/app-root/src/docling_serve/response_preparation.py", line 145, in process_results
    raise HTTPException(status_code=500, detail=str(e))
fastapi.exceptions.HTTPException: 500: [digital envelope routines] unsupported

INFO:     100.100.0.45:54190 - "POST /v1alpha/convert/source HTTP/1.1" 404 Not Found

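The issue seems to be the hashlib.md5() call in docling_parse/pdf_parser.py, which raises UnsupportedDigestmodError when the image has GPU support. The worker turns that into a 500 internally, and the client only sees the 404 "Task result not found" response.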
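
A minimal check that can be run in a Python shell inside the container to reproduce the failure outside of docling. This is only a sketch: `probe_md5` is a hypothetical helper name, and whether the `usedforsecurity=False` form succeeds depends on how the image's OpenSSL is built and configured, which I have not verified for this image.

```python
import hashlib


def probe_md5() -> dict:
    """Report whether MD5 is usable with this interpreter's OpenSSL build."""
    report = {}

    # Default call: the same form that appears in the traceback above.
    try:
        hashlib.md5(b"probe")
        report["default"] = "ok"
    except ValueError as exc:  # UnsupportedDigestmodError subclasses ValueError
        report["default"] = f"failed: {exc}"

    # usedforsecurity=False (Python 3.9+) marks the hash as non-cryptographic.
    # Some restricted OpenSSL builds allow MD5 only in this form (assumption:
    # this may or may not apply to the cu124 image).
    try:
        hashlib.md5(b"probe", usedforsecurity=False)
        report["usedforsecurity=False"] = "ok"
    except ValueError as exc:
        report["usedforsecurity=False"] = f"failed: {exc}"

    return report


if __name__ == "__main__":
    for mode, status in probe_md5().items():
        print(f"{mode}: {status}")
```

If the default call fails but the second one succeeds, the problem is a policy restriction on MD5 rather than a missing algorithm.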

I ended up rebuilding the image with OpenSSL and no-fips, which worked.
However, this is not ideal, because it complicates the workflow:
I have a Mac, so I had to build the image on a VM and then push it to the correct container registry.
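
Since the no-fips rebuild worked, FIPS mode looks like the relevant difference between the two images. A small sketch for comparing them, assuming a Linux container: `crypto_environment` is a hypothetical helper, and `/proc/sys/crypto/fips_enabled` reflects the host kernel's FIPS flag, which may be absent on some hosts.

```python
import ssl
from pathlib import Path


def crypto_environment(fips_flag: str = "/proc/sys/crypto/fips_enabled") -> dict:
    """Collect the OpenSSL version and the kernel FIPS flag, if readable."""
    info = {"openssl_version": ssl.OPENSSL_VERSION}
    try:
        # The file contains "1" when the kernel runs in FIPS mode, "0" otherwise.
        info["kernel_fips_enabled"] = Path(fips_flag).read_text().strip() == "1"
    except OSError:
        # File missing or unreadable: the state is unknown, not necessarily "off".
        info["kernel_fips_enabled"] = None
    return info


if __name__ == "__main__":
    for key, value in crypto_environment().items():
        print(f"{key}: {value}")
```

Running this in both the original and the rebuilt image should show whether the OpenSSL build or the host's FIPS setting differs between them.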

Any ideas about what is wrong in the first place and how to resolve it?
