Tagged "ml"

10 Autoencoders in a Trenchcoat, part 1

Notes on the core sections of Anthropic's Toy Models of Superposition.


Notes on "A Mathematical Framework for Transformer Circuits"

Close-reading a classic interpretability paper and trying to make sense of it.


What's different about a Matryoshka SAE?

Brief notes from the Matryoshka SAEs paper.