Handling Multiple Conversations and Prompts in Custom Main Loop #14520
MirkoDeVita98 asked this question in Q&A · Unanswered · 1 comment, 4 replies
-
Hello! I'm new to llama.cpp and I'm trying to understand how to write my own main loop to process multiple conversations from a file.
This is my current setup: I loop over each conversation (outer loop), and then iterate through the prompts within each conversation (inner loop). While transitioning between conversations works fine, I'm seeing strange results when processing multiple prompts within the same conversation.
I based parts of this code on the save-load-state.cpp example.
Could you please take a look and let me know if my logic makes sense? Also, I'm unsure whether I should instantiate a new context and sampler for every conversation, or if there's a more efficient way to handle that. Any advice on best practices would be appreciated!
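Re-creating the context and sampler for every conversation works, but it is usually unnecessary: the llama.cpp examples typically keep one context alive for the whole run and wipe the shared state when a conversation ends. Below is a minimal sketch of that pattern, assuming a recent build of the C API; note that the cache call has been renamed more than once across versions (llama_kv_cache_clear, later llama_kv_self_clear / llama_memory_clear), so match the name to your llama.h.

```cpp
#include "llama.h"

// Reset shared state between two conversations instead of re-creating the
// context and sampler. The API names here are assumptions taken from a
// recent llama.cpp; check them against your version of llama.h.
static void start_new_conversation(llama_context * ctx, llama_sampler * smpl) {
    llama_kv_cache_clear(ctx);  // drop every cached token, all sequences
    llama_sampler_reset(smpl);  // clear accumulated sampler state (e.g. repetition penalties)
}
```

If several conversations share the cache concurrently, clear only the finished conversation's sequence (see the reply below) rather than the whole cache.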
-
Reply:
I only skimmed through your code, but generally speaking, if you want to process multiple conversations in the same KV cache, you need to assign each one a different sequence id. That's the …
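To make the sequence-id advice concrete, here is a hedged sketch of a main loop in that style, assuming a recent llama.cpp C API (llama_tokenize taking a llama_vocab, llama_batch_init, llama_kv_cache_seq_rm — the last has been renamed to llama_memory_seq_rm in newer builds, so adjust to your headers). `conversations`, `ctx`, and `vocab` (from llama_model_get_vocab) are assumed to exist; sampling and error handling are elided.

```cpp
#include "llama.h"
#include <string>
#include <vector>

// Feed one prompt into the cache under a fixed sequence id, continuing
// from that conversation's current position n_past.
static void decode_prompt(llama_context * ctx, const llama_vocab * vocab,
                          llama_seq_id seq, const std::string & prompt, int & n_past) {
    // tokenize; buffer sized generously for a sketch, error handling elided
    std::vector<llama_token> toks(prompt.size() + 8);
    const int n = llama_tokenize(vocab, prompt.c_str(), (int) prompt.size(),
                                 toks.data(), (int) toks.size(),
                                 /*add_special=*/n_past == 0, /*parse_special=*/true);
    toks.resize(n);

    llama_batch batch = llama_batch_init((int) toks.size(), 0, 1);
    for (int i = 0; i < (int) toks.size(); ++i) {
        batch.token   [i]    = toks[i];
        batch.pos     [i]    = n_past + i;       // per-sequence position
        batch.n_seq_id[i]    = 1;
        batch.seq_id  [i][0] = seq;              // this conversation's sequence id
        batch.logits  [i]    = (i == (int) toks.size() - 1); // logits for last token only
    }
    batch.n_tokens = (int) toks.size();

    llama_decode(ctx, batch);
    n_past += batch.n_tokens;
    llama_batch_free(batch);
}

// Outer loop: conversations. Inner loop: the prompts of one conversation.
static void run(llama_context * ctx, const llama_vocab * vocab,
                const std::vector<std::vector<std::string>> & conversations) {
    for (size_t c = 0; c < conversations.size(); ++c) {
        const llama_seq_id seq = (llama_seq_id) c; // distinct id per conversation
        int n_past = 0;                            // this sequence's position counter
        for (const std::string & prompt : conversations[c]) {
            decode_prompt(ctx, vocab, seq, prompt, n_past);
            // ... sample the model's reply here, decoding each sampled token
            //     with the same seq and pos = n_past++ ...
        }
        // Conversation finished: free its KV cells (older API name; newer
        // builds renamed this to llama_memory_seq_rm).
        llama_kv_cache_seq_rm(ctx, seq, -1, -1);
    }
}
```

Two things to check against your setup: the context should be created with n_seq_max (in llama_context_params) at least as large as the number of sequences alive at once, and each conversation's positions must continue from its own n_past rather than a global counter — reusing positions or sequence ids without clearing them first is a common source of the "strange results" described in the question. Since this loop finishes one conversation before starting the next, a single sequence id cleared between conversations would also work; distinct ids become essential once conversations are interleaved in the same batch.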