Description
In the course Supervised Fine-Tuning, the author uses the base model HuggingFaceTB/SmolLM3-3B-Base, but I chose HuggingFaceTB/SmolLM2-135M because it is lighter. I found that the SmolLM2-135M base model's tokenizer does not have its own chat template, yet it already has special tokens. However, those special tokens may be incorrect; for example, `bos_token` and `eos_token` share the same token, `<|endoftext|>`.
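A quick way to see this (a minimal sketch; it only downloads the tokenizer files):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("HuggingFaceTB/SmolLM2-135M")

# Special tokens already exist, even though no chat template is defined.
print(tokenizer.bos_token)       # <|endoftext|>
print(tokenizer.eos_token)       # <|endoftext|> (same as bos)
print(tokenizer.chat_template)   # None
```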
I also referred to the LLM Course chapter Fine-Tuning with SFTTrainer, where the author uses setup_chat_format to create a chat template for a base model's tokenizer that does not have one.
However, setup_chat_format only supports the ChatML format and will be deprecated in trl 0.26.0. That is why I use clone_chat_template instead.
But another issue appears here: while clone_chat_template only overwrites the eos token from the source tokenizer onto the target tokenizer, setup_chat_format overwrites the bos, eos, and pad tokens. After I cloned Llama-3.2-1B-Instruct's chat template, only eos changed, to `<|eot_id|>`:
```python
model, tokenizer, added_tokens = clone_chat_template(
    model=model,
    tokenizer=tokenizer,
    source_tokenizer_path="meta-llama/Llama-3.2-1B-Instruct",
)
```
Questions:
- Why does the base model's tokenizer already have special tokens, even though it has no chat template?
- Since clone_chat_template does not overwrite all special tokens (bos, pad, ...), are there any impacts on SFT training, and what is the solution?
I am new to SFT and would appreciate any support. Thank you.
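For reference, the workaround I am currently considering (not sure it is correct) is to copy the remaining special tokens from the source tokenizer manually after cloning; a sketch with ungated tokenizers as stand-ins:

```python
from transformers import AutoTokenizer

# Stand-ins: target is the base tokenizer; in my case the source
# would be meta-llama/Llama-3.2-1B-Instruct (gated).
tokenizer = AutoTokenizer.from_pretrained("HuggingFaceTB/SmolLM2-135M")
src = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-0.5B-Instruct")

# Copy bos/pad, which clone_chat_template does not touch.
if src.bos_token is not None:
    tokenizer.bos_token = src.bos_token
tokenizer.pad_token = src.pad_token if src.pad_token is not None else tokenizer.eos_token

# Note: if a copied token is not in the target vocab, it would also need
# tokenizer.add_special_tokens(...) plus model.resize_token_embeddings(...).
print(tokenizer.pad_token is not None)
```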