SANA-Sprint Student Training

Hi @lawrence-cj,

Thanks for your great work. I tried the SANA-Sprint training code, initializing the student from the SANA-Sprint Teacher 0.6B, with both implementations: the [original SANA repo](https://github.com/NVlabs/Sana/blob/main/train_scripts/train_scm_ladd.py) and the [diffusers training version](https://github.com/huggingface/diffusers/blob/main/examples/research_projects/sana/train_sana_sprint_diffusers.py). Both experiments failed: the student's predictions are quite noisy and it does not produce high-quality images. I didn't modify anything in the codebase except for using a different image-text pair dataset of approximately 500k samples (fixed resolution 1024x1024). The student model was trained for roughly 50 epochs with batch size = 8, and the validation outputs are shown below:

<img width="720" height="288" alt="Image" src="https://github.com/user-attachments/assets/1613eb6d-1f70-42d4-a59d-e8373b3e3760" />
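For reference, here is a rough estimate of how many optimizer steps this run corresponds to. It is only a sketch: the helper `estimate_optimizer_steps` is hypothetical (it is not part of either codebase), and it assumes that batch size 8 is the global batch size and that there is no gradient accumulation. Both dataset size and epoch count are approximate, so the result is too.

```python
import math


def estimate_optimizer_steps(num_samples, epochs, global_batch_size, grad_accum_steps=1):
    """Hypothetical helper: approximate optimizer steps for a training run.

    Assumes every epoch iterates over the full dataset once and that
    global_batch_size already accounts for all devices.
    """
    # Samples consumed per optimizer step.
    samples_per_step = global_batch_size * grad_accum_steps
    # The last partial batch of an epoch still counts as a step.
    steps_per_epoch = math.ceil(num_samples / samples_per_step)
    return steps_per_epoch * epochs


# Approximate numbers from this run (assumption: 8 is the global batch size).
total_steps = estimate_optimizer_steps(num_samples=500_000, epochs=50, global_batch_size=8)
print(total_steps)  # 3125000
```

If 8 is actually the per-device batch size, the step count shrinks by the number of devices while the number of samples seen stays the same.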

Questions: Do you have any tips or tricks for the training process? And do you think the model simply needs more training iterations to converge? I appreciate your help!
