Insights: PaddlePaddle/FastDeploy
Overview
40 Pull requests merged by 22 people
- 【Feature】support qwen2 some func (#2740, merged Jul 8, 2025)
- [SOT] Remove BreakGraph with paddle.maximum (#2731, merged Jul 8, 2025)
- [Bug fix] fix compile bug when sm < 89 (#2738, merged Jul 8, 2025)
- [Optimize] Optimize tensorwise fp8 performance (#2729, merged Jul 7, 2025)
- [iluvatar_gpu] Adapt for iluvatar gpu (#2684, merged Jul 7, 2025)
- support FastDeploy version setting (#2725, merged Jul 7, 2025)
- remove redundant install of the fastdeploy whl (#2726, merged Jul 7, 2025)
- [RL] Check if the controller port is available (#2724, merged Jul 7, 2025)
- [Doc] Update eb45-0.3B minimum memory requirement (#2686, merged Jul 7, 2025)
- [LLM] support multi-node deploy (#2708, merged Jul 6, 2025)
- Modify XPU CI, test=model (#2721, merged Jul 6, 2025)
- fix bug. (#2718) (#2720, merged Jul 5, 2025)
- fix bug. (#2718, merged Jul 5, 2025)
- spec token map lazy. (#2715, merged Jul 4, 2025)
- [BugFix] fix paddle_git_commit_id error (#2714, merged Jul 4, 2025)
- add support for QWQ enable_thinking (#2706, merged Jul 4, 2025)
- [CI] Add validation for MTP and CUDAGraph (#2710, merged Jul 4, 2025)
- Add XPU CI, test=model (#2701, merged Jul 4, 2025)
- Extract eh_proj Layer from ParallelLMHead for MTP to Avoid Weight Transposition Issue (#2707, merged Jul 4, 2025)
- [feature] add fd whl version info (#2698, merged Jul 4, 2025)
- [RL] update reschedule finish reason (#2709, merged Jul 4, 2025)
- [MTP] Support chunked_prefill in speculative decoding (MTP) (#2705, merged Jul 4, 2025)
- [Doc] modify reasoning_output docs (#2696, merged Jul 4, 2025)
- add quick benchmark script (#2703, merged Jul 4, 2025)
- [feat] support fa3 backend for pd disaggregated (#2695, merged Jul 3, 2025)
- [Bug] fix logger format (#2689, merged Jul 3, 2025)
- [doc] update docs (#2692, merged Jul 3, 2025)
- [doc] update docs (#2690, merged Jul 3, 2025)
- [Sync] Update to latest code (#2679, merged Jul 3, 2025)
- add --force-reinstall --no-cache-dir when pip install fastdeploy*.whl (#2682, merged Jul 2, 2025)
- Update gh-pages.yml (#2680, merged Jul 2, 2025)
- add wint2 performance (#2673, merged Jul 2, 2025)
- Update CI test cases (#2671, merged Jul 2, 2025)
- update iluvatar gpu fastdeploy whl (#2675, merged Jul 2, 2025)
- fix ci.yml (#2665, merged Jul 1, 2025)
- 【Inference Optimize】Support ERNIE-4_5-300B-A47B-2BITS-Paddle model TP2/TP4 Inference (#2666, merged Jul 1, 2025)
- 【Docs】fix speculative docs (#2669, merged Jul 1, 2025)
- Update kunlunxin_xpu.md (#2662, merged Jul 1, 2025)
- 【Update Doc】update quantization doc (#2659, merged Jul 1, 2025)
- Update kunlunxin_xpu.md (#2657, merged Jul 1, 2025)
17 Pull requests opened by 16 people
- [WIP] optimize wint2 moe_group_gemm. (#2661, opened Jul 1, 2025)
- Feat/blackwell sm100 support (#2670, opened Jul 1, 2025)
- update iluvatar gpu fastdeploy whl (#2674, opened Jul 2, 2025)
- Add with_output version AppendAttention (#2694, opened Jul 3, 2025)
- [GCU] Support gcu platform (#2702, opened Jul 3, 2025)
- [feat] add loadtimequantization modelloader (#2711, opened Jul 4, 2025)
- [Stop Sequences] support stop sequences (#2712, opened Jul 4, 2025)
- [RL Feature] add rl qwen model support (#2713, opened Jul 4, 2025)
- Support using safetensors with paddle.MmapStorage to load model files (#2730, opened Jul 7, 2025)
- add precision check for ci (#2732, opened Jul 7, 2025)
- [SOT] Make custom_op dy&st unified (#2733, opened Jul 7, 2025)
- [draft] change rejection sampling topk=40 (#2734, opened Jul 7, 2025)
- [SOT] Enable SOT Dy2St in Multimodal Model (#2735, opened Jul 7, 2025)
- [Bug fix] Fixed the garbled text issues in Qwen3-8B (#2737, opened Jul 7, 2025)
- Opt wint2 (#2741, opened Jul 8, 2025)
- [Bug fix] fix the missing position args in expert_service.py (#2742, opened Jul 8, 2025)
- [Bug fix] fix attention rank init (#2743, opened Jul 8, 2025)
6 Issues closed by 6 people
- When running ernie-4.5-vl with FastDeploy, the [enable_thinking] parameter in the OpenAI configuration has no effect (#2727, closed Jul 7, 2025)
- Inference results for the PP-Vehicle model differ between FD and PaddleDetection (#2681, closed Jul 2, 2025)
- Support for CUDA 12.8 / Blackwell SM120 (#2656, closed Jul 2, 2025)
- Offline inference with the official docker image fails at 94% of model loading, possibly related to libnvidia-ml (#2667, closed Jul 1, 2025)
- P800 docker run reports an error (#2660, closed Jul 1, 2025)
- Is fastdeploy-2.0.0a0 only compatible with Paddle-3.1? (#2658, closed Jul 1, 2025)
13 Issues opened by 12 people
- ERNIE-4.5-VL-28B-A3B-Paddle hangs while loading, on both a single 4090 48G and dual 4090 48G (#2739, opened Jul 7, 2025)
- ERNIE-4.5-VL-424B-A47B-Paddle hangs while loading (#2723, opened Jul 6, 2025)
- Poor OpenAI API compatibility and some other issues (#2722, opened Jul 5, 2025)
- Deploying ernie-21B with fastdeploy (#2704, opened Jul 4, 2025)
- Feature Request: Add Support for max_completion_tokens Parameter (OpenAI API Deprecation) (#2697, opened Jul 3, 2025)
- Feature Request: FastDeploy Architecture Overview (#2691, opened Jul 3, 2025)
- Deploying ERNIE-4.5-VL-424B-A47B-Paddle on 8x H200 fails (#2683, opened Jul 2, 2025)
- ERNIE-4.5-300B-A47B-2Bits-Paddle dual-GPU deployment reports an error (#2678, opened Jul 2, 2025)
- Error during one-click build of FastDeploy (#2676, opened Jul 2, 2025)
- How to get logprobs when deploying an OpenAI-format server (#2672, opened Jul 1, 2025)
- Offline inference with the official docker image fails at 94% of model loading, possibly related to libnvidia-ml (#2668, opened Jul 1, 2025)
- Loading the int4-quantized ERNIE-4.5-VL-28B-A3B-Paddle succeeds on a single 4090 but fails on dual GPUs (#2663, opened Jul 1, 2025)
2 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
- Startup failure (#2655, commented on Jul 1, 2025 • 0 new comments)
- Error on startup when using the official image and documented steps (#2651, commented on Jul 1, 2025 • 0 new comments)