Jordine/interp_variable_list

interp_variable_list

Training and interpreting a transformer that sorts lists of variable length. The task comes from Neel Nanda's 200 Concrete Open Problems in Mechanistic Interpretability. Work in progress.
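As a rough illustration of the task, the sketch below generates variable-length sorting examples. The token layout (a `SEP` token between the unsorted input and the sorted target, `PAD` tokens on the right) and the constants `NUM_VALUES` and `MAX_LEN` are assumptions made for this example, not details taken from the repository.

```python
import random

# Hypothetical token layout (assumed for illustration, not from the repository).
NUM_VALUES = 50          # list elements are integers in [0, NUM_VALUES)
SEP = NUM_VALUES         # separates the unsorted input from the sorted target
PAD = NUM_VALUES + 1     # right-pads short examples to a fixed width
MAX_LEN = 10             # longest list the model is asked to sort


def make_example(rng, min_len=2, max_len=MAX_LEN):
    """Return one padded token sequence: unsorted list, SEP, sorted list, PAD..."""
    n = rng.randint(min_len, max_len)  # randint is inclusive on both ends
    values = [rng.randrange(NUM_VALUES) for _ in range(n)]
    tokens = values + [SEP] + sorted(values)
    width = 2 * max_len + 1            # room for the longest list, SEP, and its sorted copy
    return tokens + [PAD] * (width - len(tokens))


rng = random.Random(0)
batch = [make_example(rng) for _ in range(4)]
```

Padding every example to the same width keeps batches rectangular; a training loop would typically mask the loss on `PAD` positions and on the unsorted prefix.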

Trained a 1-layer, 4-head attention-only transformer.

[Figure: Attention patterns]
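To make "attention-only" concrete, here is a minimal pure-Python sketch of one such layer: each head reads from the residual stream, attends over positions, and adds its output back, with no MLP block. The causal mask, the dimensions, the random weights, and all function names are assumptions for illustration. Embeddings, positional information, and the unembedding are omitted, and the repository's actual model may differ.

```python
import math
import random


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def rand_matrix(rng, rows, cols, scale=0.1):
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


def init_head(rng, d_model, d_head):
    """Random query, key, value, and output projections for one head."""
    return {
        "W_Q": rand_matrix(rng, d_head, d_model),
        "W_K": rand_matrix(rng, d_head, d_model),
        "W_V": rand_matrix(rng, d_head, d_model),
        "W_O": rand_matrix(rng, d_model, d_head),
    }


def attention_only_layer(resid, heads):
    """Add every head's causal-attention output to the residual stream.

    resid: list of d_model-dimensional vectors, one per position.
    Returns (new_resid, patterns); patterns[h][i][j] is the weight head h
    at position i places on position j (zero for j > i).
    """
    out = [list(x) for x in resid]
    patterns = []
    for head in heads:
        q = [matvec(head["W_Q"], x) for x in resid]
        k = [matvec(head["W_K"], x) for x in resid]
        v = [matvec(head["W_V"], x) for x in resid]
        d_head = len(q[0])
        pattern = []
        for i in range(len(resid)):
            # Causal mask (assumed): position i only scores positions 0..i.
            scores = [
                sum(a * b for a, b in zip(q[i], k[j])) / math.sqrt(d_head)
                for j in range(i + 1)
            ]
            weights = softmax(scores)
            pattern.append(weights + [0.0] * (len(resid) - i - 1))
            mixed = [
                sum(w * v[j][c] for j, w in enumerate(weights))
                for c in range(d_head)
            ]
            for c, delta in enumerate(matvec(head["W_O"], mixed)):
                out[i][c] += delta
        patterns.append(pattern)
    return out, patterns


rng = random.Random(0)
D_MODEL, D_HEAD, N_HEADS, SEQ_LEN = 16, 4, 4, 5
heads = [init_head(rng, D_MODEL, D_HEAD) for _ in range(N_HEADS)]
resid = rand_matrix(rng, SEQ_LEN, D_MODEL, scale=1.0)
new_resid, patterns = attention_only_layer(resid, heads)
```

The returned `patterns` list holds one position-by-position matrix per head, which is the quantity an attention-pattern plot visualizes.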

[Figure: Accuracies]

[Figure: Accuracies by sequence length]
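One way to compute accuracy broken down by sequence length is sketched below. It assumes exact-match scoring (a prediction counts only if the whole sorted list is right) and a hypothetical `accuracy_by_length` helper; the repository may score per token or group differently.

```python
from collections import defaultdict


def accuracy_by_length(pairs):
    """Exact-match accuracy grouped by list length.

    pairs: iterable of (predicted, target) lists, where target is the
    correctly sorted list. Returns {length: fraction fully correct}.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for predicted, target in pairs:
        n = len(target)
        total[n] += 1
        correct[n] += int(list(predicted) == list(target))
    return {n: correct[n] / total[n] for n in sorted(total)}


# Toy predictions: one of the two length-2 lists is wrong.
pairs = [
    ([1, 2], [1, 2]),
    ([2, 1], [1, 2]),
    ([1, 3, 5], [1, 3, 5]),
]
result = accuracy_by_length(pairs)  # {2: 0.5, 3: 1.0}
```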
