Insights: triton-inference-server/server
Overview
5 Pull requests merged by 3 people
- feat: Add support for `usage` in the OpenAI frontend (vLLM backend) (#8264, merged Jul 3, 2025)
- fix: Improve validation for system shared memory register (#8273, merged Jul 2, 2025)
- TPRD-1590: OpenVINO 2025.2.0 version updated (#8277, merged Jul 1, 2025)
- fix: Improve data type validation for classification (#8267, merged Jul 1, 2025)
- test: Remove shared memory key from the error response (#8269, merged Jun 30, 2025)
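The first merged PR above adds `usage` reporting to the OpenAI-compatible frontend. As an illustrative sketch only (the response shape below follows the standard OpenAI chat-completions schema, not code from this repository), a client can read the token-accounting block like this:

```python
import json

# Illustrative chat-completions response body; the `usage` object follows the
# standard OpenAI schema (prompt_tokens, completion_tokens, total_tokens).
# The values here are made up for demonstration.
response_body = json.dumps({
    "id": "chatcmpl-123",
    "object": "chat.completion",
    "choices": [
        {"index": 0, "message": {"role": "assistant", "content": "Hello!"}}
    ],
    "usage": {
        "prompt_tokens": 12,
        "completion_tokens": 3,
        "total_tokens": 15,
    },
})

def token_usage(body: str) -> dict:
    """Extract the token-accounting block from a chat-completions response.

    Returns an empty dict when the server omits `usage`.
    """
    return json.loads(body).get("usage", {})

usage = token_usage(response_body)
print(usage["total_tokens"])  # prints 15
```

With `usage` populated, a caller can verify that `prompt_tokens + completion_tokens == total_tokens` and feed the counts into per-request cost or rate-limit accounting.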
1 Pull request opened by 1 person
- draft: Add testing for explicit model load (#8276, opened Jul 1, 2025)
1 Issue closed by 1 person
- Windows Release 25.02 missing (#8263, closed Jun 30, 2025)
5 Issues opened by 5 people
- Unable to build image with python, dali, onnxruntime, ensemble backend (#8280, opened Jul 2, 2025)
- ERROR: Detected Tesla V100S-PCIE-32GB GPU, which is not supported by this container (#8279, opened Jul 2, 2025)
- Support allow_ragged_batch flag in python backend auto_complete_config (#8278, opened Jul 1, 2025)
- Accelerate TensorRT Engine Loading with a High-Performance File Streamer SDK (#8275, opened Jun 30, 2025)
- 5090 trying to run Triton from Tutorials or in any way before moving to L40s. (#8274, opened Jun 30, 2025)