Question about prompt

I'm bringing this issue over from Hugging Face because I have the same question.

> Great ideas in the paper. I have one question about the checkpoint: you start from Qwen base, which has no reasoning or instruction tuning, right? Additionally, it seems like you have no supervised step to elicit initial reasoning. I am wondering how the model learns to use the tokens. What am I missing? Did you use any elaborate prompt templates? If so, what do they look like? I couldn't find any info on that in the paper.

@ahatamiz Could you please provide some ideas?
