[Executorch][llm] Enable leveraging ring kv cache via module swap #10611

Merged: 8 commits, May 13, 2025
Update on "[Executorch][llm] Enable leveraging ring kv cache via module swap"

This allows some of the attention modules to use a sliding-window KV cache, which will help enable models like gemma3.
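For readers unfamiliar with the idea, here is a minimal, framework-free sketch of what swapping a ring (sliding-window) KV cache into selected attention modules could look like. It is not the code in this PR and does not use ExecuTorch APIs: `FullKVCache`, `RingKVCache`, `Attention`, and `swap_in_ring_cache` are hypothetical names, and keys/values are plain strings standing in for tensors.

```python
class FullKVCache:
    """Hypothetical cache that keeps every token ever written."""

    def __init__(self):
        self.keys = []
        self.values = []

    def update(self, k, v):
        self.keys.append(k)
        self.values.append(v)

    def ordered(self):
        return list(self.keys), list(self.values)


class RingKVCache:
    """Hypothetical fixed-size cache that keeps only the last `window` tokens."""

    def __init__(self, window):
        self.window = window
        self.keys = [None] * window
        self.values = [None] * window
        self.pos = 0  # total number of tokens written so far

    def update(self, k, v):
        # Overwrite the oldest slot once the buffer is full.
        slot = self.pos % self.window
        self.keys[slot] = k
        self.values[slot] = v
        self.pos += 1

    def ordered(self):
        # Return the retained entries from oldest to newest.
        n = min(self.pos, self.window)
        start = self.pos - n
        idx = [(start + i) % self.window for i in range(n)]
        return [self.keys[i] for i in idx], [self.values[i] for i in idx]


class Attention:
    """Stand-in for an attention module; it only owns a cache here."""

    def __init__(self, layer_id):
        self.layer_id = layer_id
        self.kv_cache = FullKVCache()


def swap_in_ring_cache(layers, sliding_layer_ids, window):
    """Replace the cache of the selected layers; other layers are untouched."""
    for layer in layers:
        if layer.layer_id in sliding_layer_ids:
            layer.kv_cache = RingKVCache(window)
    return layers


# Assumed layout for illustration: layers 0-2 use a sliding window, layer 3 is global.
layers = [Attention(i) for i in range(4)]
swap_in_ring_cache(layers, sliding_layer_ids={0, 1, 2}, window=3)
for token in range(5):
    for layer in layers:
        layer.kv_cache.update(f"k{token}", f"v{token}")
```

After five tokens, the swapped layers retain only the three most recent entries while the untouched layer retains all five. The swap leaves the attention module itself unchanged, which is the appeal of doing this as a module swap rather than editing the model definition.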

Differential Revision: [D73891426](https://our.internmc.facebook.com/intern/diff/D73891426/)

[ghstack-poisoned]
kimishpatel committed May 12, 2025
commit 664a6382fc60bf400f6412f2e6b2fc6146e9e1df

This merge commit was added into this branch cleanly.