[LLM] support multi node deploy #2708


Merged: 7 commits into PaddlePaddle:develop on Jul 6, 2025

Conversation

@ltd0924 (Collaborator) commented on Jul 4, 2025

Multi-Machine Deployment

This system supports distributed deployment across multiple nodes. To configure:

pod-ips Parameter

  • Type: string
  • Format: Comma-separated IPv4 addresses
  • Description: Specifies all node IPs in the deployment pod
  • Required: For multi-node deployments only
  • Example: "192.168.1.101,192.168.1.102,192.168.1.103"
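
As a side note, a quick way to build and sanity-check the comma-separated value before launch is sketched below; the ipaddress-based check is an illustrative addition, not something the parameter itself performs.

```python
# Illustrative only: compose and validate a pod-ips value before launching.
import ipaddress

node_ips = ["192.168.1.101", "192.168.1.102", "192.168.1.103"]

# Reject anything that is not a plain IPv4 address, as required by pod-ips.
for ip in node_ips:
    ipaddress.IPv4Address(ip)  # raises a ValueError subclass on malformed input

pod_ips = ",".join(node_ips)  # -> "192.168.1.101,192.168.1.102,192.168.1.103"
print(pod_ips)
```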

Deployment Workflow

  1. Parameter passing:
    • Concatenate all node IPs in the deployment cluster
    • Pass as a single string using the pod-ips parameter
  2. Unified Node Startup:
    • Use the same launch command on all nodes (master and workers)
    • Each node self-identifies its role from the position of its own IP in pod-ips (a sketch of this follows the command below)
     python -m fastdeploy.entrypoints.openai.api_server \
     --pod-ips "IP1,IP2,IP3,...,IPN"
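
The role-selection mechanism is not detailed in this description; below is a minimal Python sketch of how a node might map its own IP to a rank within pod-ips. The helper get_local_ips and the rank-0-equals-master convention are illustrative assumptions, not FastDeploy's actual implementation.

```python
# Illustrative sketch only: derive this node's role from its position in
# --pod-ips. Assumes one of the listed IPs is bound to a local interface.
import socket


def get_local_ips():
    """Best-effort collection of this host's IPv4 addresses (hypothetical helper)."""
    hostname = socket.gethostname()
    _, _, ips = socket.gethostbyname_ex(hostname)
    return set(ips)


def resolve_role(pod_ips: str):
    """Return (rank, is_master) for this node based on the pod-ips ordering."""
    nodes = [ip.strip() for ip in pod_ips.split(",") if ip.strip()]
    local = get_local_ips()
    for rank, ip in enumerate(nodes):
        if ip in local:
            # By this sketch's convention, the first IP (rank 0) is the master node.
            return rank, rank == 0
    raise RuntimeError("none of the pod-ips addresses belong to this host")
```

Under these assumptions, the node whose interface holds the first listed IP resolves to rank 0 and takes the master role; all other nodes act as workers.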
    

Usage

  1. Requests are only supported on the master node, i.e. the first IP in the pod-ips parameter (see the request sketch below)
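
For illustration, a minimal request sketch assuming the master node is 192.168.1.101, the server listens on port 8000, and the deployment accepts a model name of "default"; these values are assumptions and should match the actual launch configuration.

```python
# Minimal sketch: send an OpenAI-compatible chat request to the master node.
# The IP, port (8000), and model name ("default") are assumptions for this
# example; substitute the values used when the api_server was launched.
import requests

MASTER_IP = "192.168.1.101"  # first IP listed in --pod-ips

resp = requests.post(
    f"http://{MASTER_IP}:8000/v1/chat/completions",
    json={
        "model": "default",
        "messages": [{"role": "user", "content": "Hello"}],
    },
    timeout=60,
)
print(resp.json())
```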

paddle-bot (bot) commented on Jul 4, 2025

Thanks for your contribution!

@Jiang-Jia-Jun (Collaborator) commented:

Related issue: #2649

@Jiang-Jia-Jun merged commit 68b4755 into PaddlePaddle:develop on Jul 6, 2025
3 checks passed