
Unable to use lora in llamasharp but can use it in llama.cpp #618

Open

pugafran opened this issue Mar 19, 2024 · 9 comments
Labels: question (Further information is requested), stale (Stale issue will be autoclosed soon)

Comments


pugafran commented Mar 19, 2024

I generated my own LoRA adapters using the finetune executable from the llama.cpp repository. When I try to use them in llama.cpp, the .bin file works, but the .gguf returns "bad file magic". The thing is that in LLamaSharp the .gguf gives the same "bad file magic" error, and if I try to load the .bin instead it fails with a protected memory access error.

I used codellama-7b.Q8_0.gguf and codellama-7b-instruct.Q4_K_S.gguf as the base models to generate those adapters. I would very much like to be able to use the LoRA adapters.

I didn't find documentation on how to set this up, so I did some freestyle decompiling in Visual Studio, but I want to believe I'm doing it right:

AdapterCollection adapters = new AdapterCollection();
adapters.Add(new LoraAdapter("..\\..\\..\\..\\..\\CoPilot\\data\\lora.bin", 1.0f));

var parameters = new ModelParams(modelPath)
{
    LoraAdapters = adapters,
    ContextSize = 2048,
    Seed = 1337,
    GpuLayerCount = 15,
    EmbeddingMode = true,
    Threads = (uint)(Environment.ProcessorCount * 0.7)
};
@pugafran (Author)

Perhaps it is the same issue as in #566? The error is the same as when I load the .bin, but the models I use are from codellama, so I don't know.

@martindevans (Member)

.gguf returns "bad file magic"

The "file magic" is a very simple sanity check that the file is the right format, it just checks that the first 4 bytes are the file are the expected "magic number". If you're getting this error it probably means your gguf files are malformed.

if I try to load the bin it gives an error of protected access in memory.

I'm not sure about this; generally a .bin extension indicates that you're using the wrong file type. The protected access violation is a pretty generic error, but llama.cpp often throws it if you pass in bad arguments and it doesn't notice (e.g. the file magic is correct, but the rest of the file is nonsense).

Code sample

That looks reasonable to me, just a couple of small things (that probably aren't relevant to your issue):

EmbeddingMode = true: I'm not sure if you need this on. You need it if you're going to be embedding things with this model, but I think codellama is for generation rather than embedding?

This is valid syntax that allows you to init the adapters collection inline. Same thing as what you wrote, just a bit more compact:

var example = new ModelParams("whatever.gguf")
{
    LoraAdapters = {
        new LoraAdapter("example.gguf", 1.0f),
    }
};

martindevans added the question label on Mar 19, 2024

pugafran commented Mar 20, 2024

I changed it to the code you suggested but it still doesn't work as expected. And yes, I generate embeddings of user questions to compare them.
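(Side note: if those question embeddings come from LLamaSharp's LLamaEmbedder, then EmbeddingMode = true is indeed needed. A rough sketch, assuming the 0.10-era API; the model path and question text are placeholders:)

using LLama;
using LLama.Common;

// Rough sketch assuming the LLamaSharp 0.10-era API; paths and text are placeholders.
var parameters = new ModelParams("codellama-7b.Q8_0.gguf")
{
    EmbeddingMode = true   // required for embedding, not for plain generation
};
using var weights = LLamaWeights.LoadFromFile(parameters);
using var embedder = new LLamaEmbedder(weights, parameters);

float[] questionEmbedding = embedder.GetEmbeddings("example user question");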

I don't know why the .bin works in llama.cpp but not in LLamaSharp; the problem comes from this function:
[screenshots omitted]

@SignalRT (Collaborator)

@pugafran The problem seems to be in LLamaSharp, but I don't yet understand the reason.

llama.cpp

  1. I fine-tune the LLamaSharp example model (llama-2-7b-chat.Q4_0.gguf) with an example dataset.
  2. I test the LoRA with llama.cpp. It loads without problems.

In the case of LLamaSharp it crashes at the moment the LoRA is applied. I don't have a reason for it yet.

While I search for the reason, you could use the export-lora tool from llama.cpp to build a merged model from the base model + LoRA.
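If export-lora works for you, the merged .gguf it produces can then be loaded like any ordinary model, with no LoraAdapters entry at all, which sidesteps the runtime LoRA path that is crashing. A minimal sketch (the merged filename is hypothetical):

using LLama;
using LLama.Common;

// "merged.gguf" is a placeholder for whatever export-lora wrote out.
// The LoRA weights are already baked in, so no LoraAdapters entry is needed.
var parameters = new ModelParams("merged.gguf")
{
    ContextSize = 2048,
    GpuLayerCount = 15
};
using var weights = LLamaWeights.LoadFromFile(parameters);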

@martindevans (Member)

Are you using the same version of llama.cpp as the binaries in LLamaSharp? It's unlikely but possible there's an incompatibility in the file format.

@pugafran (Author)

Yes, I'm even using the exact commit for LLamaSharp 0.10:

[screenshots omitted]

@blueskyscorpio

I met the same problem when I load a .bin in LLamaSharp.

[screenshots omitted]

@SignalRT (Collaborator)

@blueskyscorpio, your problem seems to be different. You are trying to load a .bin file, not a .gguf file, which is the supported format.

You need to load a supported model in GGUF format (see the Supported Models list at https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#description).
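For completeness, the usual loading path with a supported .gguf model looks roughly like this (a hedged sketch; the model path is a placeholder and the executor choice depends on your use case):

using LLama;
using LLama.Common;

// Hedged sketch: "llama-2-7b-chat.Q4_0.gguf" stands in for any supported GGUF model.
var parameters = new ModelParams("llama-2-7b-chat.Q4_0.gguf")
{
    ContextSize = 2048
};
using var weights = LLamaWeights.LoadFromFile(parameters);
using var context = weights.CreateContext(parameters);
var executor = new InteractiveExecutor(context);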

github-actions bot commented May 5, 2025

This issue has been automatically marked as stale due to inactivity. If no further activity occurs, it will be closed in 7 days.

github-actions bot added the stale label on May 5, 2025
4 participants