
[Bug]: Chatting with Ollama models uses much more VRAM than running them in cmd #7981

Open
@konn-submarine-bu

Description


Self Checks

  • I have searched for existing issues, including closed ones.
  • I confirm that I am using English to submit this report (Language Policy).
  • Non-English title submissions will be closed directly (Language Policy).
  • Please do not modify this template :) and fill in all the required fields.

RAGFlow workspace code commit ID

.

RAGFlow image version

nightly

Other environment information

Actual behavior

When I execute `ollama run qwen3:32b` and then check `ollama ps`, it shows about 45 GB of VRAM occupied.
But when I bind the same model to RAGFlow, it reports needing 71 GB of VRAM, and about 30% of the parameters end up loaded on the CPU.
I assumed this was an Ollama problem, but I uninstalled and reinstalled several Ollama versions and the issue persists.
I don't know whether it is caused by RAGFlow.
Could you clarify how RAGFlow invokes the Ollama server?
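
One likely lever here (an assumption on my side, not something confirmed in RAGFlow's code) is the context window: RAGFlow may be requesting a much larger `num_ctx` than the `ollama run` default, and a bigger context window grows the KV cache, which can push part of the model off the GPU. Below is a minimal sketch to test whether context size alone explains the 45 GB vs 71 GB gap, using Ollama's standard `/api/chat` endpoint; the endpoint URL is the Ollama default and the `num_ctx` values are arbitrary probes:

```python
import requests

OLLAMA_URL = "http://localhost:11434/api/chat"  # default local Ollama endpoint

def probe(num_ctx: int) -> None:
    """Load qwen3:32b with an explicit context window, then inspect
    `ollama ps` to see how much memory the load actually required."""
    resp = requests.post(
        OLLAMA_URL,
        json={
            "model": "qwen3:32b",
            "messages": [{"role": "user", "content": "hi"}],
            "stream": False,
            # num_ctx is the knob under test: a larger context window
            # means a larger KV cache, so a model that fits in VRAM at
            # the default context size may spill onto the CPU.
            "options": {"num_ctx": num_ctx},
        },
        timeout=600,
    )
    resp.raise_for_status()
    print(f"num_ctx={num_ctx}: request succeeded")

for ctx in (2048, 8192, 32768):
    probe(ctx)
    input("Check the SIZE and PROCESSOR columns of `ollama ps`, then press Enter...")
```

If the reported size jumps with `num_ctx` the same way it does when binding the model to RAGFlow, the discrepancy comes from the context window being passed in the request, not from Ollama itself.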

Expected behavior

No response

Steps to reproduce

Binding qwen3:32b to RAGFlow fails because of the extra VRAM demand, even though the same model runs smoothly from the command line.

Additional information

No response
