changes #2793

Merged
merged 1 commit into from
Apr 8, 2025

Conversation

eliebak
Contributor

@eliebak eliebak commented Apr 7, 2025

Smol changes for the Llama4 blog post:

  • Change the HF link to the transformers version
  • Add details about MetaP and co-distillation
  • Small changes to the architecture part @pcuenca
  • Remove the mention suggesting QK norm helps long context (see https://x.com/_arohan_/status/1908737240425177145)
  • It's RMS norm (despite the naming); added the fact that it has no learnable parameters
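
For readers unfamiliar with the last point, here is a minimal sketch of RMS normalization with no learnable parameters. It is not the code from the blog post or the original codebase; the function name `rms_norm` and the `eps` default are assumptions chosen for illustration.

```python
import math

def rms_norm(x, eps=1e-6):
    """Rescale x so its root-mean-square is close to 1.

    Hypothetical helper: there is no learnable weight or bias here,
    only a fixed rescaling. The eps value is an assumed default.
    """
    mean_sq = sum(v * v for v in x) / len(x)  # mean of squared entries
    scale = 1.0 / math.sqrt(mean_sq + eps)    # reciprocal RMS
    return [v * scale for v in x]

out = rms_norm([3.0, 4.0])
# RMS of the output; close to 1 regardless of the input scale
rms_out = math.sqrt(sum(v * v for v in out) / len(out))
```

The usual RMSNorm layer multiplies this result by a trained per-feature weight; the variant described in the list above omits that weight, so the operation is a pure rescaling.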

@Vaibhavs10 Vaibhavs10 merged commit 71b03e6 into main Apr 8, 2025
@Vaibhavs10 Vaibhavs10 deleted the llama4_modif branch April 8, 2025 16:32
sergiopaniego pushed a commit to sergiopaniego/blog that referenced this pull request Apr 8, 2025
@pcuenca
Member

pcuenca commented Apr 8, 2025

Thank you @eliebak 🙌

You are right, it's RMS. I got misled by the class name in the original codebase.

4 participants