spark-connect-gpu: separate to client and server #597
base: main
Conversation
Greptile Overview
Greptile Summary
This PR restructures the spark-connect-gpu example by separating it into distinct client/ and server/ directories, improving deployment modularity and clarity.
Key Changes
- Architecture restructure: Split the monolithic setup into a client-server separation
  - `server/` directory: Contains the Spark master, worker (GPU), connect server, and proxy services
  - `client/` directory: Contains the Jupyter client that connects to the server via an external Docker network
- Docker Compose separation: The server docker-compose creates `spark-network`; the client references it as the external `server_spark-network`
- Documentation updates: Created separate READMEs for client and server with focused setup instructions
- Demo optimization: Reduced data size in batch jobs from `2^35` to `2^12` for faster execution
- Notebook adjustment: Changed CSV output paths to a local directory instead of `global_data_dir`
The separation enables independent deployment of client and server components, better resource management, and clearer separation of concerns between GPU compute infrastructure and client workloads.
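The external-network wiring described above can be sketched as two minimal docker-compose fragments (illustrative only: the service names, network name, and the `server_` project prefix follow the PR description, while the image names and other fields are assumptions):

```yaml
# server/docker-compose.yaml (sketch): the server stack owns the network.
# With the default compose project name "server", Docker publishes the
# network to other projects as "server_spark-network".
services:
  spark-connect-server:
    image: example/spark-connect-server   # assumed image name
    networks:
      - spark-network
networks:
  spark-network:
---
# client/docker-compose.yaml (sketch): the client declares the same
# network as external instead of creating its own, so its containers
# can resolve server hostnames such as spark-connect-server.
services:
  spark-connect-client:
    image: example/jupyter-client         # assumed image name
    ports:
      - "8888:8888"
    networks:
      - spark-network
networks:
  spark-network:
    external: true
    name: server_spark-network
```

Because the client only references the network, the server stack must be up before `docker compose up` is run in `client/`.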
Confidence Score: 5/5
- This PR is safe to merge with minimal risk - it's a well-executed refactoring focused on improving project structure
- This is a clean refactoring that separates client and server concerns without changing core functionality. The changes are primarily organizational (file moves and renames) with minor improvements (typo fix needed, optimized demo sizes, corrected paths). No logical errors or security issues detected.
- examples/spark-connect-gpu/client/README.md has a minor typo that should be corrected
Important Files Changed
File Analysis
| Filename | Score | Overview |
|---|---|---|
| examples/spark-connect-gpu/server/docker-compose.yaml | 5/5 | Refactored to remove client service, keeping only server-side components (master, worker, connect-server, proxy) |
| examples/spark-connect-gpu/client/docker-compose.yaml | 5/5 | New client-only docker-compose that references external server network |
| examples/spark-connect-gpu/server/README.md | 5/5 | Updated documentation focusing on server setup, removed client-specific instructions |
| examples/spark-connect-gpu/client/README.md | 5/5 | New README documenting client setup with reference to server setup |
| examples/spark-connect-gpu/client/python/batch-job.py | 5/5 | Reduced data size from 2^35 to 2^12 for faster demo execution |
| examples/spark-connect-gpu/client/scala/src/main/scala/connect.scala | 5/5 | Reduced data size from 2^35 to 2^12 for faster demo execution |
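The demo-size reduction in the last two rows is substantial; a quick back-of-envelope check makes the motivation clear (assuming each generated row is a single 8-byte long, an assumption not stated in the PR):

```python
# Rough size of the raw long values produced by a range(n)-style job,
# assuming 8 bytes per row (bigint); serialization overhead ignored.
BYTES_PER_ROW = 8

old_bytes = 2 ** 35 * BYTES_PER_ROW   # original demo size
new_bytes = 2 ** 12 * BYTES_PER_ROW   # reduced demo size

print(old_bytes / 2 ** 30)  # 256.0 -> ~256 GiB of raw values before
print(new_bytes / 2 ** 10)  # 32.0  -> ~32 KiB after, runs in seconds
```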
Sequence Diagram
```mermaid
sequenceDiagram
    participant User as User/Browser
    participant Client as Jupyter Client<br/>(spark-connect-client)
    participant Proxy as Nginx Proxy<br/>(proxy-service)
    participant ConnectServer as Spark Connect Server<br/>(spark-connect-server)
    participant Master as Spark Master<br/>(spark-master)
    participant Worker as Spark Worker<br/>(spark-worker + GPU)
    Note over User,Worker: Before: All services in single docker-compose
    Note over Client,Worker: After: Separated into client/ and server/ directories
    User->>Client: Access Jupyter (localhost:8888)
    Client->>ConnectServer: gRPC Connect Request (sc://spark-connect-server)
    ConnectServer->>Master: Request Resources
    Master->>Worker: Assign Task
    Worker->>Worker: Execute on GPU (RAPIDS)
    Worker-->>ConnectServer: Return Results
    ConnectServer-->>Client: Stream Results
    Client-->>User: Display in Notebook
    User->>Proxy: Access Web UIs (localhost:2080)
    Proxy->>Master: Proxy to Master UI (8080)
    Proxy->>Worker: Proxy to Worker UI (8081)
    Proxy->>ConnectServer: Proxy to Driver UI (4040)
    Note over Client,Worker: Client connects via external network:<br/>server_spark-network
```
27 files reviewed, 1 comment
Signed-off-by: Bobby Wang <[email protected]>
Force-pushed from 2f97d9e to 2f5749c
Greptile Overview
Greptile Summary
This PR restructures the spark-connect-gpu example into a clean client-server architecture, separating concerns and improving deployment flexibility.
Key Changes:
- Split the monolithic structure into `client/` and `server/` directories with separate docker-compose files
- Server side includes spark-master, spark-worker (GPU-enabled), spark-connect-server, and an nginx proxy
- Client side includes Jupyter Lab, which connects to the server via an external Docker network
- Added comprehensive documentation for both client and server setup
- Included Python and Scala batch job examples and an NDS benchmark notebook
- Network configuration allows the client to join the server network via the external `server_spark-network` reference
Architecture Benefits:
- Clear separation between GPU-requiring services (only spark-worker) and client applications
- Modular deployment - server and client can be managed independently
- Well-documented with detailed setup instructions and multiple access options (localhost vs container hostnames)
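The client-side connection flow described above can be sketched as a small helper (hypothetical code: it assumes `pyspark` 3.4+ with the `connect` extra installed in the Jupyter image and the server stack already running; the `sc://spark-connect-server` URL itself comes from the PR):

```python
def run_demo(url: str = "sc://spark-connect-server") -> int:
    """Open a Spark Connect session against the server stack and run a
    tiny job at the PR's reduced demo size. The import is deferred so
    the helper can be defined even where pyspark is not installed."""
    from pyspark.sql import SparkSession  # pyspark[connect] assumed

    spark = SparkSession.builder.remote(url).getOrCreate()
    try:
        # range() mirrors the shape of the PR's batch-job examples
        return spark.range(2 ** 12).count()
    finally:
        spark.stop()
```

From inside the client container the hostname resolves over `server_spark-network`; from the host you would instead target whatever port the server compose file publishes, if any.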
Confidence Score: 5/5
- This PR is safe to merge with no critical issues found
- The refactoring is well-structured with proper separation of concerns, comprehensive documentation, and appropriate Docker networking. The only issue is a typo already noted in previous comments. All configuration files are properly set up with correct RAPIDS integration and GPU resource allocation.
- No files require special attention
Important Files Changed
File Analysis
| Filename | Score | Overview |
|---|---|---|
| examples/spark-connect-gpu/client/docker-compose.yaml | 5/5 | New client-side docker-compose using external server network to connect to spark-connect-server |
| examples/spark-connect-gpu/server/docker-compose.yaml | 5/5 | Server-side docker-compose defining spark-master, spark-worker (GPU), spark-connect-server, and proxy-service |
| examples/spark-connect-gpu/client/README.md | 5/5 | New client documentation with setup instructions (has typo already noted in previous comments) |
| examples/spark-connect-gpu/server/README.md | 5/5 | Comprehensive server setup documentation with GPU prerequisites and network configuration options |
| examples/spark-connect-gpu/server/spark-connect-server/spark-defaults.conf | 5/5 | Spark configuration with GPU resource allocation and RAPIDS plugin settings |
Sequence Diagram
```mermaid
sequenceDiagram
    participant User as User/Browser
    participant Client as Spark Connect Client<br/>(Jupyter Lab)
    participant Server as Spark Connect Server<br/>(gRPC)
    participant Master as Spark Master
    participant Worker as Spark Worker<br/>(GPU-enabled)
    participant Proxy as Nginx Proxy
    Note over User,Proxy: Server Setup (docker-compose up in server/)
    Master->>Master: Start cluster coordination
    Worker->>Master: Register worker with GPU resources
    Server->>Master: Connect as Spark driver
    Proxy->>Proxy: Start HTTP proxy on port 2080
    Note over User,Proxy: Client Setup (docker-compose up in client/)
    Client->>Client: Start Jupyter Lab on port 8888
    Client->>Client: Join server_spark-network
    Note over User,Worker: User Interaction
    User->>Client: Access Jupyter Lab (localhost:8888)
    User->>Client: Run notebook/Python/Scala code
    Client->>Server: Connect via sc://spark-connect-server (gRPC)
    Server->>Master: Submit Spark job
    Master->>Worker: Schedule tasks with GPU resources
    Worker->>Worker: Execute on GPU with RAPIDS
    Worker->>Server: Return results
    Server->>Client: Stream results via gRPC
    Client->>User: Display results in notebook
    Note over User,Proxy: Monitoring
    User->>Proxy: Access WebUIs (via localhost or proxy)
    Proxy->>Master: Forward requests to spark-master:8080
    Proxy->>Worker: Forward requests to spark-worker:8081
    Proxy->>Server: Forward requests to spark-connect-server:4040
```
27 files reviewed, no comments
Hi @eordentlich @gerashegalov, would you please help review this PR? Thanks very much.
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
gerashegalov
left a comment
LGTM
Greptile Overview
Greptile Summary
Refactors the spark-connect-gpu example to separate client and server components into distinct directories (client/ and server/), improving organization and clarity of the architecture.
- Moves server-side components (spark-master, spark-worker, spark-connect-server, proxy-service) to the `server/` directory
- Moves client-side components (Jupyter Lab, notebooks, examples) to the `client/` directory
- Creates separate README files for client and server with focused documentation
- Updates the client `docker-compose.yaml` to reference the external server network
- Maintains all existing functionality while improving project structure
Issues Found:
- Client README incorrectly states "four Docker services" when only describing two components (already has a suggested fix from previous review for typo "fount" → "found")
Confidence Score: 4/5
- This PR is safe to merge with minimal risk - it's primarily a refactoring that reorganizes files without changing functionality
- Score reflects that this is a straightforward organizational refactoring that moves files to improve structure. The only issues found are documentation inconsistencies (incorrect service count and typo) that don't affect code execution. All file moves appear complete and the docker-compose configurations correctly reference the new structure.
- Fix the documentation inconsistency in `examples/spark-connect-gpu/client/README.md` (line 8) and the typo on line 12
Important Files Changed
File Analysis
| Filename | Score | Overview |
|---|---|---|
| examples/spark-connect-gpu/client/README.md | 3/5 | New client README with typo "fount" (already noted) and incorrect count of Docker services in architecture description |
Sequence Diagram
```mermaid
sequenceDiagram
    participant User
    participant Client as Spark Connect Client<br/>(Jupyter Lab)
    participant Server as Spark Connect Server<br/>(gRPC)
    participant Master as Spark Master
    participant Worker as Spark Worker<br/>(GPU-enabled)
    User->>Client: Run notebook/batch job
    Client->>Server: Connect via sc://spark-connect-server
    Server->>Master: Request cluster resources
    Master->>Worker: Assign task
    Worker->>Worker: Execute on GPU with RAPIDS
    Worker-->>Master: Return results
    Master-->>Server: Aggregate results
    Server-->>Client: Return results via gRPC
    Client-->>User: Display results
```
1 file reviewed, 1 comment
> ## 🏗️ Architecture
>
> The setup consists of four Docker services:
logic: Architecture section mentions "four Docker services" but only lists two (server and client side)
Suggested change:

```suggestion
The setup consists of two main components:
```
@wbo4958 please address greptile comments too
Fixed, thanks for the review.
Hi @gerashegalov, please help review it again. Thanks very much.
This PR separates spark-connect-gpu into a client and server structure.