Open
Description
Self Checks
- I have searched for existing issues, including closed ones.
- I confirm that I am using English to submit this report (Language Policy).
- Non-English title submissions will be closed directly (非英文标题的提交将会被直接关闭) (Language Policy).
- Please do not modify this template :) and fill in all the required fields.
RAGFlow workspace code commit ID
.
RAGFlow image version
nightly
Other environment information
Actual behavior
When I execute ollama run qwen3:32b and then ollama ps, it shows 45 GB of VRAM in use.
But when I try to bind this model to RAGFlow, it reports that 71 GB of VRAM is needed, and as a result 30% of the parameters are loaded on the CPU.
I initially thought this was an Ollama problem, but I have uninstalled and reinstalled several versions and the issue persists.
I do not know whether RAGFlow is the cause.
Could you clarify how RAGFlow invokes the Ollama server?
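To make the two cases easier to compare, here is a minimal sketch that reads the memory split from Ollama's /api/ps endpoint, which reports the same information as ollama ps. The address http://localhost:11434 is an assumption (the Ollama default), and memory_split is a hypothetical helper name used only for illustration. It does not show how RAGFlow calls Ollama; it only measures what ends up loaded after each kind of request.

```python
import json
import urllib.request

# Assumption: Ollama is listening on its default address.
OLLAMA_URL = "http://localhost:11434"


def fetch_running_models(base_url=OLLAMA_URL):
    """Return the list of currently loaded models from GET /api/ps."""
    with urllib.request.urlopen(f"{base_url}/api/ps") as resp:
        return json.load(resp).get("models", [])


def memory_split(model):
    """Summarize how much of a loaded model sits in VRAM versus system RAM.

    `size` is the total memory footprint in bytes and `size_vram` is the
    part held in GPU memory, as reported by /api/ps.
    """
    total = model.get("size", 0)
    vram = model.get("size_vram", 0)
    cpu = max(total - vram, 0)
    cpu_percent = round(100 * cpu / total, 1) if total else 0.0
    return {
        "name": model.get("name"),
        "total_gib": round(total / 2**30, 1),
        "vram_gib": round(vram / 2**30, 1),
        "cpu_percent": cpu_percent,
    }


if __name__ == "__main__":
    for m in fetch_running_models():
        print(memory_split(m))
```

Running this once after ollama run qwen3:32b and once after RAGFlow has sent a request to the model should show whether the total size itself grows between the two cases, or only the share placed on the CPU.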
Expected behavior
No response
Steps to reproduce
Binding qwen3:32b to RAGFlow fails, although the model runs smoothly from the command line.
Additional information
No response