[CUDA backend ONLY] Use just K-cache for MLA + FA: 47% saving on KV-cache size #13529
+37 −10