Comparing changes

Fix the following runtime error with --no-use_cuda_graph option:

Traceback (most recent call last):
  File "/home/aubrey/miniforge3/envs/kt/lib/python3.11/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/home/aubrey/miniforge3/envs/kt/lib/python3.11/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/home/aubrey/miniforge3/envs/kt/lib/python3.11/site-packages/ktransformers/server/backend/interfaces/balance_serve.py", line 282, in run_engine
    engine.loop()
  File "/home/aubrey/miniforge3/envs/kt/lib/python3.11/site-packages/ktransformers/server/backend/interfaces/balance_serve.py", line 234, in loop
    self.model_runner.run(self.batch, self.query_manager)
  File "/home/aubrey/miniforge3/envs/kt/lib/python3.11/site-packages/ktransformers/server/balance_serve/inference/model_runner.py", line 220, in run
    self.output.logits[0] = self.output.logits[0][self.input[cuda_graph_idx].minibatch.logits_start]
                            ~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.

VLinearMarlin: padding to input.shape[0] to avoid CUDA error

Update version

Fix NaN bug

add XPU support for qwen3moe local chat

- Add Dockerfile.xpu for oneAPI-based container
- Create Docker_xpu.md with usage instructions
- Update xpu.md to include Docker guide

Add Dockerfile and usage guide for XPU support

* display the unavailable torch device on error
* Raise exception on device error

---------

Signed-off-by: Emmanuel Ferdman <[email protected]>
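The behavior described in this commit message (name the unavailable device in the error, and raise instead of continuing) could be sketched roughly as follows. This is not the code from the commit: `DeviceUnavailableError`, `ensure_device_available`, and the `probes` mapping are hypothetical names chosen for illustration. In a real setup each probe would presumably wrap a framework call such as `torch.cuda.is_available()`.

```python
from typing import Callable, Mapping


class DeviceUnavailableError(RuntimeError):
    """Raised when a requested device cannot be used (hypothetical name)."""


def ensure_device_available(
    device: str, probes: Mapping[str, Callable[[], bool]]
) -> str:
    """Return `device` if its backend reports as available, else raise.

    `probes` maps a device type ("cpu", "cuda", "xpu", ...) to a zero-argument
    callable that returns True when that backend is usable. This mapping is an
    assumption of the sketch, not part of any library API.
    """
    # Strip an optional index: "cuda:0" -> "cuda".
    device_type = device.split(":", 1)[0]
    probe = probes.get(device_type)
    if probe is None or not probe():
        # Include the requested device in the message so the failure is easy to trace.
        raise DeviceUnavailableError(f"torch device '{device}' is not available")
    return device


if __name__ == "__main__":
    # Stand-in probes; here "xpu" is deliberately reported as unavailable.
    demo_probes = {"cpu": lambda: True, "xpu": lambda: False}
    print(ensure_device_available("cpu", demo_probes))
    try:
        ensure_device_available("xpu:0", demo_probes)
    except DeviceUnavailableError as err:
        print(err)
```

Raising a dedicated exception, rather than only logging, lets the caller decide whether to fall back to another device or abort.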

Commits on May 22, 2025

add XPU support for qwen3moe local chat

rnwang04 committed May 22, 2025

adc0906

Commits on May 18, 2025

Commits on May 19, 2025

Commits on May 21, 2025

Commits on May 22, 2025

Commits on May 23, 2025

Commits on May 28, 2025

Commits on May 29, 2025
