Add Bloom Model #1382

Merged · 40 commits merged into keras-team:master on Jan 15, 2024
Conversation

@abuelnasr0 (Contributor) commented Dec 27, 2023

The architecture is done, and the model generates output successfully.
Two tasks remain:

  • add documentation
  • checkpoint conversion

Once I finish, I will mention you.

@mattdangerw (Member) left a comment

Thank you!! This is awesome.

Left some initial comments. Adding a test file for the backbone should catch some things.

Just a heads up, most everyone will be out for New Year's, so the next review will probably be next week!

@mattdangerw (Member)

Make sure to run ./shell/format.sh too.

@abuelnasr0 (Contributor, Author) commented Jan 1, 2024

The checkpoint conversion script worked fine, and the model produced output that is close to the Hugging Face output.
Check this Gist: https://colab.research.google.com/gist/abuelnasr0/1edd8f43cb05630cc51c9823002e763c/bloom.ipynb
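
For reference, a comparison along these lines can be scripted directly. The snippet below is a minimal sketch (not the exact Gist code), assuming the backbone class added in this PR is `keras_nlp.models.BloomBackbone` and using `bigscience/bloom-560m` as the Hugging Face reference:

```python
import numpy as np
import torch
from transformers import AutoModel, AutoTokenizer

import keras_nlp

# Assumed class name and constructor arguments for the backbone in this PR;
# hyperparameters roughly match bigscience/bloom-560m. In the Gist the weights
# are populated by the checkpoint-conversion script rather than left at their
# random initial values.
keras_model = keras_nlp.models.BloomBackbone(
    vocabulary_size=250880,
    num_layers=24,
    num_heads=16,
    hidden_dim=1024,
    intermediate_dim=4096,
)
# ... copy the Hugging Face weights into keras_model via the conversion script ...

hf_model = AutoModel.from_pretrained("bigscience/bloom-560m")
tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom-560m")

hf_inputs = tokenizer(["hello world"], return_tensors="pt")
with torch.no_grad():
    hf_hidden = hf_model(**hf_inputs).last_hidden_state.numpy()

keras_hidden = keras_model(
    {
        "token_ids": hf_inputs["input_ids"].numpy(),
        "padding_mask": hf_inputs["attention_mask"].numpy().astype(bool),
    }
)

# With converted weights, the two hidden-state tensors should agree up to
# small numerical error.
print(np.max(np.abs(np.asarray(keras_hidden) - hf_hidden)))
```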

@abuelnasr0 requested a review from @mattdangerw on January 2, 2024, 18:23
@mattdangerw (Member) left a comment

Left some more comments on the code.

But maybe more importantly, I am looking into the license here. We just integrated with Kaggle (https://github.com/keras-team/keras-nlp/releases/tag/v0.7.0), which I believe gives us a way to support the open-RAIL license that the Bloom weights are released under. But we need to double-check this. Hope to have an answer next week!

@abuelnasr0 (Contributor, Author)

@mattdangerw About the license: if you follow this link, https://huggingface.co/bigscience/bloom#uses, you will find a hyperlink (BLOOM License) that points to this license: https://huggingface.co/spaces/bigscience/license

I have also found these two licenses:

  1. The BigScience RAIL License: https://bigscience.huggingface.co/blog/the-bigscience-rail-license
  2. A license in a GitHub repo, though the repo is noted as deprecated: https://github.com/bigscience-workshop/model_card

@abuelnasr0 (Contributor, Author)

Check this Gist to see the model output compared to Hugging Face after applying the requested changes: https://colab.research.google.com/gist/abuelnasr0/22877985ce1a1c9125e8ed46cfc87da2/bloom.ipynb

@mattdangerw (Member) left a comment

Thanks! This looks good, and I think we are good to land the architecture here.

Can you do two things?

  1. Rebase or merge the latest changes to see if tests are passing again? (We had a Keras 3 breakage yesterday.)

  2. Send me your Kaggle username, if you have one?

@abuelnasr0 (Contributor, Author) commented Jan 11, 2024

> Send me your kaggle username if you have one?

My Kaggle username: mohamedabuelnasr

@mattdangerw added the kokoro:force-run (Runs Tests on GPU) label on Jan 11, 2024
@kokoro-team removed the kokoro:force-run (Runs Tests on GPU) label on Jan 11, 2024
@mattdangerw (Member)

@abuelnasr0 Thanks! Sorry for the delay here, but I think you have been added to a list that will allow you to upload models.

I will pull this PR in, then you can proceed roughly as follows...

  1. Create a PR for a tokenizer.
  2. Update the conversion script to output to our new preset format. Use #1402 (Update llama conversion script for new kaggle format) as a rough reference. That will save weights in the new format ready for Kaggle (essentially a directory with a config.json, tokenizer.json, model.weights.h5 and some tokenizer assets). Note that the script expects you to have installed the keras_nlp package using python pip_build.py --install.
  3. Using the Kaggle UI -> https://www.kaggle.com/models/?new=true, create a new bloom model under your username, and upload variants where variant name == preset name. You can just drag the entire contents of a local preset directory into the Kaggle upload UI.
  4. Create a new preset file following the pattern here, where essentially we just record some metadata and a link to the Kaggle model (a rough sketch of such an entry follows this list). The link form will be kaggle://YOUR_USERNAME/bloom/keras/PRESET_ID/1.
  5. Add preset tests.
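
For step 4, here is a minimal sketch of what such a preset entry might look like, modeled loosely on the llama presets in #1402; the field names and values below are assumptions and placeholders, not the final registry entry:

```python
# Hypothetical bloom_presets.py entry; field names follow the llama preset
# pattern referenced above, and the values are placeholders.
backbone_presets = {
    "bloom_560m_multi": {
        "metadata": {
            "description": "Bloom backbone converted from bigscience/bloom-560m.",
            "params": 559214592,  # fill in the exact parameter count
            "official_name": "BLOOM",
            "path": "bloom",
            "model_card": "https://huggingface.co/bigscience/bloom",
        },
        # Link form: kaggle://YOUR_USERNAME/bloom/keras/PRESET_ID/1
        "kaggle_handle": "kaggle://YOUR_USERNAME/bloom/keras/bloom_560m_multi/1",
    },
}
```

Once the preset is registered and the weights are uploaded, loading should reduce to something like keras_nlp.models.BloomBackbone.from_preset("bloom_560m_multi").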

One other note: the largest models (7b and 176b) will require a lot of RAM to load, even on a CPU. Feel free to just test the conversion with the smaller models, and we can do the conversion for the larger models on our own compute resources.

This is our first time going through this new Kaggle upload flow, so please let us know any feedback!

@mattdangerw merged commit c41e844 into keras-team:master on Jan 15, 2024
@abuelnasr0 (Contributor, Author)

@mattdangerw Thanks for the merge and the instructions. I will open the PR and add the models as soon as possible.

@abuelnasr0 deleted the bloom branch on January 16, 2024
@SamanehSaadat (Member)

Hi @abuelnasr0!

Thanks for contributing this model. I'm working on the Falcon model, which, similar to the Bloom model, uses ALiBi. I was wondering if you would be interested in separating your ALiBi implementation and making it reusable.
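
(For anyone following along: ALiBi replaces positional embeddings with a fixed, head-specific linear bias added to the attention scores. The snippet below is a minimal sketch of the slope and bias computation from the ALiBi paper, not the exact code in this PR or the eventual shared layer.)

```python
import math
import numpy as np


def alibi_slopes(num_heads):
    """Per-head slopes: a geometric sequence starting at 2 ** (-8 / num_heads)."""

    def power_of_2_slopes(n):
        start = 2 ** (-(2 ** -(math.log2(n) - 3)))
        return [start * (start**i) for i in range(n)]

    if math.log2(num_heads).is_integer():
        return power_of_2_slopes(num_heads)
    # For head counts that are not a power of two, take the slopes for the
    # closest smaller power of two and interleave extra slopes from twice that
    # count, following the reference implementation.
    closest = 2 ** math.floor(math.log2(num_heads))
    return (
        power_of_2_slopes(closest)
        + alibi_slopes(2 * closest)[0::2][: num_heads - closest]
    )


def alibi_bias(num_heads, seq_len):
    """Bias of shape (num_heads, seq_len, seq_len) added to attention scores."""
    slopes = np.array(alibi_slopes(num_heads))  # (num_heads,)
    positions = np.arange(seq_len)
    distance = positions[None, :] - positions[:, None]  # key index minus query index
    # Non-positive for causal (key <= query) positions, so farther keys are
    # penalized more strongly.
    return slopes[:, None, None] * distance[None, :, :]


# Example: 16 heads (as in bloom-560m), context of 8 tokens.
print(alibi_bias(16, 8).shape)  # (16, 8, 8)
```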

@abuelnasr0 (Contributor, Author)

@SamanehSaadat Sure, I will open a PR for it.
