SANA-Sprint Student Training #300

@nttung1110

Hi @lawrence-cj,

Thanks for your great work. I have tried the SANA-Sprint training code (with the student initialized from the SANA-Sprint Teacher 0.6B) using both implementations: the original SANA repo and the diffusers training version. Both experiments failed: the student's predictions are quite noisy and do not produce high-quality images. I didn't modify anything in the codebase except to use a different image-text pair dataset of approximately 500k samples (fixed resolution 1024x1024). The student model was trained for roughly 50 epochs with batch size = 8, and the validation outputs are shown below:

Image

Questions: Do you have any tips or tricks for the training process? And do you think the model needs more training iterations to converge? I appreciate your help!
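
For context on the training budget, here is a rough estimate of how many optimizer steps the run above corresponds to. This is only a sketch under stated assumptions: it treats "approximately 500k" as exactly 500,000 samples and "roughly 50 epochs" as exactly 50, and it assumes batch size 8 is the global batch size (one process, no gradient accumulation). The helper `total_optimizer_steps` and its parameters are hypothetical names for illustration and are not part of the SANA or diffusers code.

```python
import math


def total_optimizer_steps(num_samples, batch_size, epochs,
                          grad_accum_steps=1, num_processes=1,
                          drop_last=False):
    """Estimate optimizer steps for a run (hypothetical helper).

    The effective batch is batch_size * grad_accum_steps * num_processes.
    With drop_last=False the final partial batch of each epoch still
    counts as one step.
    """
    effective_batch = batch_size * grad_accum_steps * num_processes
    if drop_last:
        steps_per_epoch = num_samples // effective_batch
    else:
        steps_per_epoch = math.ceil(num_samples / effective_batch)
    return steps_per_epoch * epochs


# Assumed values: 500,000 samples, global batch size 8, 50 epochs.
steps = total_optimizer_steps(num_samples=500_000, batch_size=8, epochs=50)
print(steps)  # 3125000
```

If batch size 8 was per GPU on several GPUs, or gradient accumulation was used, the step count shrinks by that factor, so the actual number may differ from this estimate.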
