Are there any plans to support true tensor parallelism in the future? #13013
lingyezhixing started this conversation in General
Replies: 0 comments
The current performance overhead for multi-GPU inference is substantial. What optimization methods are available?