Failed to eval - potential bug but hard to diagnose #422
To help debug this, I'd suggest adding something in here which logs out the value of the return code. Unfortunately llama.cpp doesn't return very detailed error codes, but that might help us work out the root issue. e.g.

```csharp
var ret = NativeApi.llama_eval(this, pinned, tokens.Length, n_past);
if (ret != 0)
    Console.WriteLine($"[DEBUGGING] ret == {ret}");
return ret == 0;
```
That returned ret == 1. Not sure if that is any use to you... I did add some more information, to see if it helped reveal anything:

```csharp
Console.WriteLine($"[DEBUGGING]: test ret == {ret} {tokens.Length} and {n_past}");
```

This is it when it succeeds:
I do think sometimes it fails because of the token limit, but then I have this failure:

On the 2,809 token text. Looking at n_past:

I know that my prompt is 230 tokens, my question is about 90, and the response has a limit of 250 - that's around 600 at most, which only gets us to roughly 3,400. I know that my "grammar" is another 160 tokens, which still leaves us several hundred tokens short of the limit. Not sure why it is doing what it is, but hopefully the above helps.
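For what it's worth, tallying up the numbers in that comment as a quick sanity check (all counts are the approximate figures quoted above):

```csharp
// Token budget described above, against a 4,096-token context window.
const int contextWindow = 4096;

int text     = 2809; // the source text being queried
int prompt   = 230;  // the prompt
int question = 90;   // the question (approximate)
int response = 250;  // the response limit
int grammar  = 160;  // the grammar

int total = text + prompt + question + response + grammar; // = 3,539
Console.WriteLine($"total = {total}, headroom = {contextWindow - total}"); // headroom = 557
```

So by that arithmetic the run should finish around 550 tokens under the window.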
The method to reference for these error codes is this one. Interestingly the docs on that method say:

> 1 - could not find a KV slot for the batch (try reducing the size of the batch or increase the context)

So your return value of 1 points at the KV cache. The code returning 1 is:

```cpp
if (!llama_kv_cache_find_slot(kv_self, batch)) {
    return 1;
}
```

So it looks like, for some reason, your cache is filling up and so eval fails. I'm not really sure how to debug that further at the moment :/
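It might help to track n_past yourself and log it just before each eval, so you can see whether the cache really is full when the failure happens. A minimal sketch - TryEval is a hypothetical wrapper (not part of LLamaSharp), and the eval call itself is passed in as a delegate so it works with whichever llama_eval overload your version exposes:

```csharp
// Hypothetical helper: track n_past alongside each eval call so a
// KV-cache overflow is visible right before any "Failed to eval".
static bool TryEval(Func<int> evalCall, int batchSize, ref int n_past, int n_ctx)
{
    Console.WriteLine($"[DEBUGGING] n_past = {n_past}, batch = {batchSize}, n_ctx = {n_ctx}");

    if (n_past + batchSize > n_ctx)
    {
        // No room left in the cache - the same condition that makes
        // llama_kv_cache_find_slot fail and eval return 1.
        Console.WriteLine("[DEBUGGING] KV cache would overflow");
        return false;
    }

    int ret = evalCall(); // e.g. () => NativeApi.llama_eval(this, pinned, batchSize, n_past)
    if (ret == 0)
        n_past += batchSize;
    else
        Console.WriteLine($"[DEBUGGING] eval returned {ret}");

    return ret == 0;
}
```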
I had a look at the code and implemented the NativeApi log that you provided in another thread - and I don't see anything being returned on a failure, not under any level of logging. All you get back is the "Failed to eval" message.

I did note that one of the methods returns a Log Error and one of them does not - and these are the only two ways to get a false result from the eval call. Not sure whether my logging implementation would have caught the error anyway.

Which leads me to think that the problem is perhaps with this method, where the error message is commented out - and thus it returns nothing even under the NativeApi logging?
I did a bit more work on reporting by getting all the tokens, including the 250 for the response, and seeing where we were when we failed:

Which I think demonstrates that we are sometimes far short of the context length, so it probably isn't the first check, n_tokens > n_ctx, that is failing. That is as far as I have got in my digging.
Just to check: I created the logging like this:

Implement it at the start with:

This only returns INFO logging - I deliberately put in text that was over the context limit and still received no error logging. I am assuming that I have implemented the logging incorrectly, but it isn't obvious to me what I have missed.
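For comparison, a setup along these lines should surface native-side errors - a minimal sketch, assuming your LLamaSharp version exposes NativeApi.llama_log_set with a LLamaLogCallback delegate (mirroring llama.cpp's own llama_log_set; check your version's NativeApi surface, as these names have moved between releases):

```csharp
using LLama.Native;

// Keep a static reference to the delegate so the GC doesn't collect it
// while native code still holds the function pointer.
private static readonly LLamaLogCallback LogCallback =
    (level, message) => Console.WriteLine($"[llama {level}] {message}");

public static void EnableNativeLogging()
{
    NativeApi.llama_log_set(LogCallback);
}
```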
Yep, I'm thinking it's this too - this is basically as far as I got in my digging.

As far as I can see your logging setup looks fine and you should be receiving error-level events. There's no filtering going on on the C# side of things there.
This issue has been automatically marked as stale due to inactivity. If no further activity occurs, it will be closed in 7 days.
I used the Grammar Example and created a little program for myself where I iterate over text in a file and ask a question of that text. Please note I have anonymised some of it, so file paths might not make sense in the below...

This all works pretty well, and I am very impressed - but just occasionally I will get a "Failed to eval", and it doesn't really provide enough information to know why. At one point I thought I had narrowed it down to a token-limit breach - and I do think you get this message if you breach the token limit - but that isn't the only cause?
Using MemoryLock / MemoryMap or not doesn't change the fact that this bug appears.
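For reference, this is the sort of parameter toggling I mean - a minimal sketch, assuming the ModelParams properties from LLamaSharp's examples (UseMemoryLock / UseMemorymap; names may differ slightly between versions, and the model path is a placeholder):

```csharp
using LLama.Common;

// Neither setting makes a difference to whether the failure appears.
var parameters = new ModelParams("path/to/model") // placeholder path
{
    ContextSize = 4096,
    UseMemoryLock = true,  // mlock the weights in RAM
    UseMemorymap = true,   // mmap the model file
};
```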
Occasionally I will get a "Failed to eval." This is the output of that error:

Here is an example where I do not think the text has breached the limit but I have still received the message. This is 2,809 tokens on a model with a 4,096-token context window: