Notes on "A Mathematical Framework for Transformer Circuits"
Close-reading a classic interpretability paper and trying to make sense of it