
[Examples] Improve the model card pushed from the train_text_to_image.py script #3810


Merged
merged 16 commits into from
Jun 20, 2023

Conversation

@sayakpaul (Member) commented Jun 16, 2023

Currently, our training examples include limited information in the model cards. For example, one can't tell how to run inference with the trained model artifact(s) just by looking at the corresponding model repository (such as this one).

We can improve this.

So, to that end, this PR modifies train_text_to_image.py to include a code snippet in the model card created from the example itself. Additionally, it adds the following to the model card for a better developer experience directly on the Hub:

  • Info on key training hyperparameters
  • Dataset (and properly links that to the model card metadata)
  • Weights & Biases run page when available (which discloses all the CLI args passed to the training script and details on the environment, such as accelerator type)
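As a rough sketch of how such a card could be assembled (the helper name and fields below are illustrative, not the exact code in this PR, whose real helper is `save_model_card()` in train_text_to_image.py):

```python
def build_model_card(base_model, dataset_name, hyperparams, wandb_url=None):
    """Compose model card text: YAML front matter plus a training section.

    All names and fields here are illustrative sketches, not the PR's code.
    """
    # YAML front matter links the dataset in the card metadata,
    # so the Hub can surface it on the model page.
    yaml_header = (
        "---\n"
        f"base_model: {base_model}\n"
        "datasets:\n"
        f"- {dataset_name}\n"
        "tags:\n- stable-diffusion\n- text-to-image\n"
        "---\n"
    )
    hp_lines = "\n".join(f"* {k}: {v}" for k, v in hyperparams.items())
    # The W&B run page is only linked when a URL is available.
    wandb_section = f"\nTraining logs: {wandb_url}\n" if wandb_url else ""
    body = f"\n## Training details\n\n{hp_lines}\n{wandb_section}"
    return yaml_header + body
```

The key ideas are that the dataset lands in the card *metadata* (so the Hub can link it), and the W&B section is conditional on the run page existing.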

Here's how a pipeline repo looks with the changes introduced in this PR: https://huggingface.co/sayakpaul/new_sd_pokemon.

The training was conducted with the following command:

export MODEL_NAME="CompVis/stable-diffusion-v1-4"
export DATASET_NAME="lambdalabs/pokemon-blip-captions"

accelerate launch train_text_to_image.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --dataset_name=$DATASET_NAME \
  --mixed_precision="fp16" \
  --resolution=512 --center_crop --random_flip \
  --train_batch_size=1 \
  --gradient_accumulation_steps=4 \
  --gradient_checkpointing \
  --max_train_steps=100 \
  --learning_rate=1e-05 \
  --max_grad_norm=1 --lr_scheduler="constant" --lr_warmup_steps=0 \
  --report_to="wandb" \
  --validation_epochs=1 \
  --validation_prompts "cute dragon creature" "cute pokemon creature" "blue pokemon" \
  --checkpointing_steps=50 \
  --output_dir="new_sd_pokemon" \
  --push_to_hub

If there's a consensus, I can adapt these changes and include them in the rest of the scripts. Thanks to @osanseviero for the hint.

@HuggingFaceDocBuilderDev commented Jun 16, 2023

The documentation is not available anymore as the PR was closed or merged.

@patrickvonplaten (Contributor) left a comment:

Looks good to me

image_grid.save(os.path.join(repo_folder, "val_imgs_grid.png"))
img_str += "![val_imgs_grid](./val_imgs_grid.png)\n"

yaml = f"""
A contributor commented:

FYI huggingface_hub has nice utilities for model card creation (both content and metadata) and templates you could leverage here.


@sayakpaul (author) replied:

Thanks for suggesting!

But I feel like the current way of creating the model card content is a bit more explicit and also more flexible.

I could leverage something like https://huggingface.co/docs/huggingface_hub/guides/model-cards#from-a-jinja-template, but I would not here because:

  • It doesn't reduce the code complexity significantly.
  • Users would then have to learn about the model-card details of huggingface_hub, which I'd like to avoid because that's not the objective here.

Cc: @pcuenca


A collaborator commented:

Thanks for the ping @osanseviero. I would also advocate using huggingface_hub's modelcards for a few reasons:

  1. You need to be careful about encoding issues in the README (especially \n vs \r\n on Windows). It's quite specific and annoying, but we did a few iterations in hfh to handle it correctly. The goal is to avoid big diffs if someone else updates the model card afterwards.
  2. Using modelcards + a separate jinja template makes it really easy for non-developers to review the model card template without looking into the code. This can prove useful if someone from the ethics team (for example) wants to open a PR to complete the model card template without digging into the exact code (i.e., separating code and templates separates their uses).
  3. While it doesn't reduce much complexity, it doesn't add much either. I don't think users have to understand ModelCard's internal details to use it correctly.

A collaborator added:

That being said, I don't think the Path('custom_template.md').write_text(template_text) line from the docs should be reused. I think it would be best to provide a jinja template alongside the training script and only have:

card_data = ModelCardData(language='en', license='mit', library_name='keras')
card = ModelCard.from_template(card_data, template_path='custom_template.md', author='nateraw')
card.save('README.md')

in the training script.
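For illustration, a minimal `custom_template.md` along these lines might look like the following (the field names are assumptions for the sketch, not a template actually shipped with huggingface_hub or this PR):

```jinja
---
{{ card_data }}
---

# {{ model_name | default("Text-to-image fine-tune", true) }}

Fine-tuned from `{{ base_model }}` on `{{ dataset_name }}`.

## Training hyperparameters

- learning rate: {{ learning_rate }}
- train batch size: {{ train_batch_size }}
```

The `{{ card_data }}` slot is where ModelCardData renders the YAML metadata; the rest is ordinary markdown that non-developers can review and edit without touching the training script.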

@Wauplin (Collaborator) commented Jun 20, 2023:

(Thinking out loud here, but) actually reusing the default modelcard template is also an option.

It simplifies the example training script on your side at the cost of a more verbose model card (with a lot of empty fields at first). We can see this as a way to encourage users to better document their models. The advantage of the default template is that it has been iterated on multiple times to be compliant with what should be the standard in terms of model cards.

@sayakpaul (author) replied:

I am fine having a more verbose model card but having to provide a Jinja template separately is something I am not comfortable doing for our examples.

Using modelcards + a separate jinja template makes it really easy for non-developers to review the model card template without looking into the code. This can prove useful if someone from the ethics team (for example) wants to open a PR to complete the model card template without digging to the exact code (i.e. separate code and templates to separate usage).

I think they would need to open a PR editing the README file anyway, whose content is straightforward IMO.

Probably the best option is to reuse the default modelcard template, as you mentioned. Happy to accept a PR to see the changes.

@@ -971,6 +1059,7 @@ def collate_fn(examples):
pipeline.save_pretrained(args.output_dir)

if args.push_to_hub:
save_model_card(args, repo_id, images, repo_folder=args.output_dir)
@williamberman (Contributor) commented:

Can you separately create the images that are saved to the model card here? Feel free to just pull the subset of code out of the log_validation_images that creates the images into a helper method.
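The suggested extraction could look roughly like this sketch (function and argument names are hypothetical, not the code actually added to train_text_to_image.py):

```python
def generate_validation_images(pipeline, prompts, num_inference_steps=25):
    """Run `pipeline` once per validation prompt and collect the images.

    `pipeline` is any callable following the diffusers convention of
    returning an object with an `.images` list. All names here are
    illustrative; the real logic lives in the script's validation code.
    """
    images = []
    for prompt in prompts:
        result = pipeline(prompt, num_inference_steps=num_inference_steps)
        images.append(result.images[0])
    return images
```

Pulling this into a helper lets both the validation-logging path and the model-card path produce images without duplicating the generation loop.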

@pcuenca (Member) commented Jun 17, 2023:

Either what @williamberman said, or skip the image block in the model card if validation is disabled. I'd prefer to have the images, but we may not be able to generate them unless we make up a prompt that may not be ideal for the particular fine-tune the user performed.

@sayakpaul (author) replied:

If you look at the logic in the new save_model_card() function, that is how it's currently done: if there are validation images to save to the model card, img_str is crafted accordingly; otherwise, it is None.

A member replied:

Yeah, that works for me.

@pcuenca (Member) left a comment:

This is a great improvement, just focused on some details. I agree with @osanseviero that we could maybe leverage the facilities in huggingface_hub (and we already have Jinja2 as a requirement for training).


@sayakpaul commented:

There's a 504 error coming from the HF Hub currently. Will try again later.

@sayakpaul commented:

Model card with the latest details:

https://huggingface.co/sayakpaul/da-vinci-sd-pokemon

@sayakpaul sayakpaul merged commit 4870626 into main Jun 20, 2023
@sayakpaul sayakpaul deleted the refactor/readme-examples branch June 20, 2023 03:29
AmericanPresidentJimmyCarter pushed a commit to AmericanPresidentJimmyCarter/diffusers that referenced this pull request Apr 26, 2024
…e.py` script (huggingface#3810)

* refactor: readme serialized from the example when push_to_hub is True.

* fix: batch size arg.

* a bit better formatting

* minor fixes.

* add note on env.

* Apply suggestions from code review

Co-authored-by: Pedro Cuenca <[email protected]>

* condition wandb info better

* make mixed_precision assignment in cli args explicit.

* separate inference block for sample images.

* Apply suggestions from code review

Co-authored-by: Pedro Cuenca <[email protected]>

* address more comments.

* autocast mode.

* correct none image type problem.

* ifx: list assignment.

* minor fix.

---------

Co-authored-by: Pedro Cuenca <[email protected]>