Adding GPTNeoXBackbone #1056

Merged
merged 27 commits into from
Jun 25, 2023
Commits
9412a83
added gpt-neo attention+decoder+backbone
kanpuriyanawab May 29, 2023
99a8296
fixed formatting + added backbone test
kanpuriyanawab May 29, 2023
afb7e1f
fixed rotary embedding and gpt neo attention layer
kanpuriyanawab Jun 6, 2023
f0f6383
updating decoder and backbone to current version
kanpuriyanawab Jun 6, 2023
bfd56fa
fixed decoder + backbone
kanpuriyanawab Jun 7, 2023
97a347d
fix forward pass
kanpuriyanawab Jun 10, 2023
5ead767
formatting + add checkpoint script
kanpuriyanawab Jun 10, 2023
5776ac1
fix tpu_test, formatting
kanpuriyanawab Jun 10, 2023
e0d343b
removed unnecessary layernorms, correct arguments, fix unit tests (te…
kanpuriyanawab Jun 12, 2023
451cdbc
fix dropout
kanpuriyanawab Jun 12, 2023
e37fb22
matching outputs with hf
kanpuriyanawab Jun 14, 2023
ead11c5
fix formating
kanpuriyanawab Jun 14, 2023
c7117a4
resolving few comments
kanpuriyanawab Jun 14, 2023
c72e629
fixed unit tests + formatting
kanpuriyanawab Jun 16, 2023
2341d0e
refactored rotary embedding
kanpuriyanawab Jun 16, 2023
6112357
revamped checkpoint conversion script
kanpuriyanawab Jun 16, 2023
66afa7c
code format
kanpuriyanawab Jun 16, 2023
f363f24
putting old checkpoint script back until preset
kanpuriyanawab Jun 16, 2023
7a66052
incorporated comments
kanpuriyanawab Jun 17, 2023
6f6f41e
code format
kanpuriyanawab Jun 17, 2023
f34ec47
resolved comments + fixed formatting
kanpuriyanawab Jun 17, 2023
34db7f7
added gpt neo x tokenizer
kanpuriyanawab Jun 17, 2023
1ecfe51
added docstrings
kanpuriyanawab Jun 21, 2023
b3f06e4
formatting fix
kanpuriyanawab Jun 21, 2023
a9f2230
addressing comments
kanpuriyanawab Jun 23, 2023
122a3fb
added tokenizer output verification
kanpuriyanawab Jun 23, 2023
e10ea50
Minor style fixes
mattdangerw Jun 24, 2023
fix dropout
kanpuriyanawab committed Jun 12, 2023
commit 451cdbc51b33fb076a5ad926ce43395d6443a068
2 changes: 1 addition & 1 deletion keras_nlp/models/gpt_neox/gpt_neox_backbone.py
@@ -34,7 +34,7 @@ def __init__(
 num_heads,
 hidden_dim,
 intermediate_dim,
-dropout=0.1,
+dropout=0.,
 rotary_pct=0.25,
 rotary_emb_base=10000,
 layer_norm_epsilon=1e-5,
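For context on what this default change means in practice, here is a minimal sketch of standard inverted dropout in plain Python. It is not the KerasNLP implementation: the `dropout` helper, its arguments, and the sample values are hypothetical and exist only for illustration. It shows that a rate of 0 leaves activations untouched, whereas a rate of 0.1 zeroes some values and rescales the rest during training.

```python
import random


def dropout(values, rate, rng):
    """Hypothetical helper: inverted dropout over a list of floats.

    Each value is zeroed with probability `rate`; survivors are scaled by
    1 / (1 - rate) so the expected value stays the same.
    """
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if rate == 0.0:
        # Nothing is dropped and nothing is rescaled.
        return list(values)
    scale = 1.0 / (1.0 - rate)
    return [0.0 if rng.random() < rate else v * scale for v in values]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
activations = [1.0, 2.0, 3.0, 4.0]

unchanged = dropout(activations, 0.0, rng)  # new default in this commit
perturbed = dropout(activations, 0.1, rng)  # previous default

print(unchanged)
print(perturbed)
```

With the rate at 0 the call is an identity operation, so a forward pass does not depend on whether the model runs in training or inference mode.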