
Unable to use lora in llamasharp but can use it in llama.cpp #618

Open

pugafran opened this issue Mar 19, 2024 · 9 comments
Labels: question (Further information is requested), stale (Stale issue will be autoclosed soon)

Comments


pugafran commented Mar 19, 2024

I generated my own LoRA adapters using the finetune executable from the llama.cpp repository. When I try to use them in llama.cpp, the .bin file works, but the .gguf returns "bad file magic". The thing is that in LLamaSharp the .gguf gives the same "bad file magic" error, and if I try to load the .bin instead it fails with a protected memory access error.

I used codellama-7b.Q8_0.gguf and codellama-7b-instruct.Q4_K_S.gguf as the base models to generate those adapters. I would very much like to be able to use the LoRA adapters.

I didn't find documentation on how to set this up, so I did some freestyle decompiling in Visual Studio, but I want to believe I'm doing it right:

AdapterCollection adapters = new AdapterCollection();
adapters.Add(new LoraAdapter("..\\..\\..\\..\\..\\CoPilot\\data\\lora.bin", 1.0f));

var parameters = new ModelParams(modelPath)
{
    LoraAdapters = adapters,
    ContextSize = 2048,
    Seed = 1337,
    GpuLayerCount = 15,
    EmbeddingMode = true,
    Threads = (uint)(Environment.ProcessorCount * 0.7)
};
@pugafran (Author)

Perhaps it is the same issue as in #566? The error is the same as when I load the .bin, but the models I use are from codellama, so I don't know.

@martindevans (Member)

.gguf returns "bad file magic"

The "file magic" is a very simple sanity check that the file is the right format, it just checks that the first 4 bytes are the file are the expected "magic number". If you're getting this error it probably means your gguf files are malformed.

if I try to load the bin it gives an error of protected access in memory.

I'm not sure about this; generally a .bin extension indicates that you're using the wrong file type. The protected access violation is a pretty generic error, but llama.cpp often throws it if you pass in bad arguments and it doesn't notice (e.g. the file magic is correct, but the rest of the file is nonsense).

Code sample

That looks reasonable to me, just a couple of small things (that probably aren't relevant to your issue):

EmbeddingMode = true: I'm not sure if you need this on. You need it if you're going to be embedding things with this model, but I think codellama is for generation rather than embedding?

This is valid syntax that allows you to init the adapters collection inline. Same thing as what you wrote, just a bit more compact:

var example = new ModelParams("whatever.gguf")
{
    LoraAdapters = {
        new LoraAdapter("example.gguf", 1.0f),
    }
};

martindevans added the question label on Mar 19, 2024

pugafran commented Mar 20, 2024

I changed it to the code you suggested but it still doesn't work as expected. And yes, I generate embeddings of user questions to compare them.
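(Side note: if those question embeddings come from LLamaSharp's LLamaEmbedder, then EmbeddingMode = true is indeed needed. A rough sketch, assuming the 0.10-era API; the model path and question text are placeholders:)

using LLama;
using LLama.Common;

// Rough sketch assuming the LLamaSharp 0.10-era API; paths and text are placeholders.
var parameters = new ModelParams("codellama-7b.Q8_0.gguf")
{
    EmbeddingMode = true   // required for embedding, not for plain generation
};
using var weights = LLamaWeights.LoadFromFile(parameters);
using var embedder = new LLamaEmbedder(weights, parameters);

float[] questionEmbedding = embedder.GetEmbeddings("example user question");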

I don't know why the .bin works in llama.cpp but not in LLamaSharp; the problem comes from this function:
[screenshots omitted]

@SignalRT (Collaborator)

@pugafran The problem seems to be in LLamaSharp, but I don't yet understand the reason.

llama.cpp

  1. I fine-tune the LLamaSharp example model (llama-2-7b-chat.Q4_0.gguf) with an example dataset.
  2. I test the LoRA with llama.cpp. It loads without problems.

In the case of LLamaSharp it crashes at the moment the LoRA is applied. I don't have a reason for it yet.

While I search for the reason, you could use the export-lora tool from llama.cpp to build a merged model from the base model + LoRA.
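If export-lora works for you, the merged .gguf it produces can then be loaded like any ordinary model, with no LoraAdapters entry at all, which sidesteps the runtime LoRA path that is crashing. A minimal sketch (the merged filename is hypothetical):

using LLama;
using LLama.Common;

// "merged.gguf" is a placeholder for whatever export-lora wrote out.
// The LoRA weights are already baked in, so no LoraAdapters entry is needed.
var parameters = new ModelParams("merged.gguf")
{
    ContextSize = 2048,
    GpuLayerCount = 15
};
using var weights = LLamaWeights.LoadFromFile(parameters);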

@martindevans (Member)

Are you using the same version of llama.cpp as the binaries in LLamaSharp? It's unlikely but possible there's an incompatibility in the file format.

@pugafran (Author)

Yes, I'm even using the exact commit for LLamaSharp 0.10:

[screenshots omitted]

@blueskyscorpio

I met the same problem when I load a .bin in LLamaSharp.

[screenshots omitted]

@SignalRT (Collaborator)

@blueskyscorpio, your problem seems to be different. You are trying to load a .bin file, not a .gguf file, which is the supported format.

You need to load a supported model in GGUF format (see the Supported Models list at https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#description).
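For completeness, the usual loading path with a supported .gguf model looks roughly like this (a hedged sketch; the model path is a placeholder and the executor choice depends on your use case):

using LLama;
using LLama.Common;

// Hedged sketch: "llama-2-7b-chat.Q4_0.gguf" stands in for any supported GGUF model.
var parameters = new ModelParams("llama-2-7b-chat.Q4_0.gguf")
{
    ContextSize = 2048
};
using var weights = LLamaWeights.LoadFromFile(parameters);
using var context = weights.CreateContext(parameters);
var executor = new InteractiveExecutor(context);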

github-actions bot commented May 5, 2025

This issue has been automatically marked as stale due to inactivity. If no further activity occurs, it will be closed in 7 days.

github-actions bot added the stale label on May 5, 2025
4 participants