
[Examples] Improve the model card pushed from the train_text_to_image.py script #3810


Merged
merged 16 commits into from
Jun 20, 2023

Conversation

@sayakpaul (Member) commented Jun 16, 2023

Currently, our training examples include limited information in the model cards. For example, one can't tell how to run inference with the trained model artifact(s) just by looking at the corresponding model repository (such as this one).

We can improve this.

So, to that end, this PR modifies train_text_to_image.py to include a code snippet in the model card created from the example itself. Additionally, it adds the following to the model card for a better developer experience directly on the Hub:

  • Info on key training hyperparameters
  • Dataset (and properly links that to the model card metadata)
  • Weights & Biases run page when available (which discloses all the CLI args passed to the training script and details on the environment, such as accelerator type)
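As a rough sketch of how such a card could be assembled (the helper name and fields below are illustrative, not the exact code in this PR, whose real helper is `save_model_card()` in train_text_to_image.py):

```python
def build_model_card(base_model, dataset_name, hyperparams, wandb_url=None):
    """Compose model card text: YAML front matter plus a training section.

    All names and fields here are illustrative sketches, not the PR's code.
    """
    # YAML front matter links the dataset in the card metadata,
    # so the Hub can surface it on the model page.
    yaml_header = (
        "---\n"
        f"base_model: {base_model}\n"
        "datasets:\n"
        f"- {dataset_name}\n"
        "tags:\n- stable-diffusion\n- text-to-image\n"
        "---\n"
    )
    hp_lines = "\n".join(f"* {k}: {v}" for k, v in hyperparams.items())
    # The W&B run page is only linked when a URL is available.
    wandb_section = f"\nTraining logs: {wandb_url}\n" if wandb_url else ""
    body = f"\n## Training details\n\n{hp_lines}\n{wandb_section}"
    return yaml_header + body
```

The key ideas are that the dataset lands in the card *metadata* (so the Hub can link it), and the W&B section is conditional on the run page existing.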

Here's how a pipeline repo looks with the changes introduced in this PR: https://huggingface.co/sayakpaul/new_sd_pokemon.

The training was conducted with the following command:

export MODEL_NAME="CompVis/stable-diffusion-v1-4"
export DATASET_NAME="lambdalabs/pokemon-blip-captions"

accelerate launch train_text_to_image.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --dataset_name=$DATASET_NAME \
  --mixed_precision="fp16" \
  --resolution=512 --center_crop --random_flip \
  --train_batch_size=1 \
  --gradient_accumulation_steps=4 \
  --gradient_checkpointing \
  --max_train_steps=100 \
  --learning_rate=1e-05 \
  --max_grad_norm=1 --lr_scheduler="constant" --lr_warmup_steps=0 \
  --report_to="wandb" \
  --validation_epochs=1 \
  --validation_prompts "cute dragon creature" "cute pokemon creature" "blue pokemon" \
  --checkpointing_steps=50 \
  --output_dir="new_sd_pokemon" \
  --push_to_hub

If there's a consensus, I can adapt these changes and include them in the rest of the scripts. Thanks to @osanseviero for the hint.

@HuggingFaceDocBuilderDev commented Jun 16, 2023

The documentation is not available anymore as the PR was closed or merged.

@patrickvonplaten (Contributor) left a comment:

Looks good to me

image_grid.save(os.path.join(repo_folder, "val_imgs_grid.png"))
img_str += "![val_imgs_grid](./val_imgs_grid.png)\n"

yaml = f"""
A contributor commented:

FYI huggingface_hub has nice utilities for model card creation (both content and metadata) and templates you could leverage here.


@sayakpaul (author) replied:

Thanks for suggesting!

But I feel like the current way of creating the model card content is a bit more explicit and also more flexible.

I could leverage something like https://huggingface.co/docs/huggingface_hub/guides/model-cards#from-a-jinja-template, but I would not here because:

  • It doesn't reduce the code complexity significantly.
  • Users would then have to learn about the model-card details of huggingface_hub, which I'd like to avoid because that's not the objective here.

Cc: @pcuenca


A collaborator commented:

Thanks for the ping @osanseviero. I would also advocate using huggingface_hub's modelcards for a few reasons:

  1. You need to be careful about encoding issues in the README (especially \n vs \r\n on Windows). It's quite specific and annoying, but we did a few iterations in hfh to handle it correctly. The goal is to avoid big diffs if someone else updates the model card afterwards.
  2. Using modelcards + a separate jinja template makes it really easy for non-developers to review the model card template without looking into the code. This can prove useful if someone from the ethics team (for example) wants to open a PR to complete the model card template without digging into the exact code (i.e., separating code and templates separates their uses).
  3. While it doesn't reduce much complexity, it doesn't add much either. I don't think users have to understand ModelCard's internal details to use it correctly.

A collaborator added:

That being said, I don't think the Path('custom_template.md').write_text(template_text) line from the docs should be reused. I think it would be best to provide a jinja template alongside the training script and only have:

card_data = ModelCardData(language='en', license='mit', library_name='keras')
card = ModelCard.from_template(card_data, template_path='custom_template.md', author='nateraw')
card.save('README.md')

in the training script.
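For illustration, a minimal `custom_template.md` along these lines might look like the following (the field names are assumptions for the sketch, not a template actually shipped with huggingface_hub or this PR):

```jinja
---
{{ card_data }}
---

# {{ model_name | default("Text-to-image fine-tune", true) }}

Fine-tuned from `{{ base_model }}` on `{{ dataset_name }}`.

## Training hyperparameters

- learning rate: {{ learning_rate }}
- train batch size: {{ train_batch_size }}
```

The `{{ card_data }}` slot is where ModelCardData renders the YAML metadata; the rest is ordinary markdown that non-developers can review and edit without touching the training script.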

@Wauplin (Collaborator) commented Jun 20, 2023:

(Thinking out loud here, but) actually reusing the default modelcard template is also an option.

It simplifies the example training script on your side at the cost of a more verbose model card (with a lot of empty fields at first). We can see this as a way to encourage users to better document their models. The advantage of the default template is that it has been iterated on multiple times to be compliant with what should be the standard in terms of model cards.

@sayakpaul (author) replied:

I am fine having a more verbose model card but having to provide a Jinja template separately is something I am not comfortable doing for our examples.

Using modelcards + a separate jinja template makes it really easy for non-developers to review the model card template without looking into the code. This can prove useful if someone from the ethics team (for example) wants to open a PR to complete the model card template without digging to the exact code (i.e. separate code and templates to separate usage).

I think they would need to open a PR editing the README file anyway, whose content is straightforward IMO.

Probably the best option is to reuse the default modelcard template, as you mentioned. Happy to accept a PR to see the changes.

@@ -971,6 +1059,7 @@ def collate_fn(examples):
pipeline.save_pretrained(args.output_dir)

if args.push_to_hub:
save_model_card(args, repo_id, images, repo_folder=args.output_dir)
@williamberman (Contributor) commented:

Can you separately create the images that are saved to the model card here? Feel free to just pull the subset of code out of the log_validation_images that creates the images into a helper method.
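The suggested extraction could look roughly like this sketch (function and argument names are hypothetical, not the code actually added to train_text_to_image.py):

```python
def generate_validation_images(pipeline, prompts, num_inference_steps=25):
    """Run `pipeline` once per validation prompt and collect the images.

    `pipeline` is any callable following the diffusers convention of
    returning an object with an `.images` list. All names here are
    illustrative; the real logic lives in the script's validation code.
    """
    images = []
    for prompt in prompts:
        result = pipeline(prompt, num_inference_steps=num_inference_steps)
        images.append(result.images[0])
    return images
```

Pulling this into a helper lets both the validation-logging path and the model-card path produce images without duplicating the generation loop.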

@pcuenca (Member) commented Jun 17, 2023:

Either what @williamberman said, or skip the image block in the model card if validation is disabled. I'd prefer to have the images, but we may not be able to generate them unless we make up a prompt that may not be ideal for the particular fine-tune the user performed.

@sayakpaul (author) replied:

If you look at the logic in the new save_model_card() function, that is how it's currently done: if there are validation images to save to the model card, img_str is crafted accordingly; otherwise, it is None.

A member replied:

Yeah, that works for me.

@pcuenca (Member) left a comment:

This is a great improvement, just focused on some details. I agree with @osanseviero that we could maybe leverage the facilities in huggingface_hub (and we already have Jinja2 as a requirement for training).


@sayakpaul commented:

There's a 504 error coming from the HF Hub currently. Will try again later.

@sayakpaul commented:

Model card with the latest details:

https://huggingface.co/sayakpaul/da-vinci-sd-pokemon

@sayakpaul sayakpaul merged commit 4870626 into main Jun 20, 2023
@sayakpaul sayakpaul deleted the refactor/readme-examples branch June 20, 2023 03:29
AmericanPresidentJimmyCarter pushed a commit to AmericanPresidentJimmyCarter/diffusers that referenced this pull request Apr 26, 2024
…e.py` script (huggingface#3810)

* refactor: readme serialized from the example when push_to_hub is True.

* fix: batch size arg.

* a bit better formatting

* minor fixes.

* add note on env.

* Apply suggestions from code review

Co-authored-by: Pedro Cuenca <[email protected]>

* condition wandb info better

* make mixed_precision assignment in cli args explicit.

* separate inference block for sample images.

* Apply suggestions from code review

Co-authored-by: Pedro Cuenca <[email protected]>

* address more comments.

* autocast mode.

* correct none image type problem.

* ifx: list assignment.

* minor fix.

---------

Co-authored-by: Pedro Cuenca <[email protected]>