Closed
Describe the bug
I'm noticing the error below with our Tabby deployment; it looks like a memory error. I don't have any additional logs, since we've modified logging to mask input/output information, which was required for our production deployment.
Process exit code was 1.
```
cmpl-dc7c656b-2a60-4276-8940-2a578d26e198: Generated 2 tokens in 56.007768ms at 35.709332319759646 tokens/s
cmpl-9c5e112f-5024-4d1b-a7b4-5a3f5dab21c2: Generated 2 tokens in 80.706173ms at 24.781251862853164 tokens/s
2024-03-11T23:00:58.450411Z ERROR llama_cpp_bindings::llama: crates/llama-cpp-bindings/src/llama.rs:78: Failed to step: _Map_base::at
```
Information about your version
0.5.5
Information about your GPU
```
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.154.05             Driver Version: 535.154.05   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
...
...
...
|   3  NVIDIA A100 80GB PCIe          On  | 00000000:E3:00.0 Off |                    0 |
| N/A   44C    P0             74W / 300W  |  18141MiB / 81920MiB |      0%   E. Process |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+
```