
Abnormal Output from Gemma Pretrained Model After Conversion to Hugging Face Format #1762

Open
@SKNahin

Description

Bug description

I trained a Gemma base model on custom data. After training, I converted the pretrained checkpoint to LitGPT format with this command:
litgpt convert_pretrained_checkpoint my_pretrained_checkpoint litgpt_checkpoint

After that, I tested the model with litgpt chat litgpt_checkpoint. With this command the model works fine and the generation quality is excellent.
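
To double-check the LitGPT checkpoint programmatically, here is a minimal sketch (assuming the litgpt.LLM Python API that ships with 0.4.x releases, and the litgpt_checkpoint path from the command above):

from litgpt import LLM

# Load the converted LitGPT checkpoint and generate from the same prompt
llm = LLM.load("litgpt_checkpoint")
print(llm.generate("আমাদের দেশের", max_new_tokens=50))

This is just the programmatic equivalent of the litgpt chat test above.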

Then I converted the LitGPT checkpoint to a Hugging Face checkpoint with this command: litgpt convert_from_litgpt litgpt_checkpoint hf_checkpoint. It saves a model.pth file in the hf_checkpoint directory. I loaded that state dict into a Hugging Face model, but this time the generation was random. Here is the code:

import torch
from transformers import Gemma2ForCausalLM, AutoTokenizer, pipeline

# Load the converted weights and inject them into the HF Gemma 2 model
state_dict = torch.load("hf_checkpoint/model.pth")
model = Gemma2ForCausalLM.from_pretrained("google/gemma-2-2b", local_files_only=True, state_dict=state_dict)
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-2b")

pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
print(pipe("আমাদের দেশের"))  # Bengali prompt, roughly "our country's"

The output is - [{'generated_text': 'আমাদের দেশেরinninninninninninninninninninninninninn'}]
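
To rule out a key-name mismatch introduced by the conversion (my assumption is that from_pretrained may quietly skip state-dict keys it doesn't recognize, which would leave randomly initialized weights in place), a minimal sketch to diff the converted keys against what Gemma2ForCausalLM expects:

import torch
from transformers import Gemma2ForCausalLM

model = Gemma2ForCausalLM.from_pretrained("google/gemma-2-2b")
converted = set(torch.load("hf_checkpoint/model.pth", map_location="cpu").keys())
expected = set(model.state_dict().keys())

# Non-empty output here would mean some weights were never actually loaded
print("missing from converted:", sorted(expected - converted))
print("unexpected in converted:", sorted(converted - expected))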

I'm not sure if I'm missing something. Can anyone help with converting the pretrained checkpoint?
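
For what it's worth, loading the weights strictly should turn any silent mismatch into an explicit error rather than garbled text (a sketch, with the same assumed paths as above; load_state_dict with strict=True raises on missing or unexpected keys):

import torch
from transformers import Gemma2ForCausalLM

model = Gemma2ForCausalLM.from_pretrained("google/gemma-2-2b")
state_dict = torch.load("hf_checkpoint/model.pth", map_location="cpu")
model.load_state_dict(state_dict, strict=True)  # raises if any key fails to line up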

What operating system are you using?

Linux

LitGPT Version

litgpt                   0.4.12
transformers             4.44.2
torch                    2.4.1

Labels

bug (Something isn't working), help wanted (Extra attention is needed)
