Research Interests
Hi! I'm Logan. This is a document describing my current research interests, in case it increases the likelihood of serendipitous coincidences — someone seeing this site, saying "oh, I'm working on that!" or "oh, I want to do a project like that!" and reaching out. As described before, a website is a long and complex search query (etc).
My main interest right now is mechanistic interpretability. I find it both technically fascinating and philosophically interesting, in something like a "for the purposes of general world-modeling" sense.
Some things I've spent reasonable amounts of time thinking about:
- SAEs.
- Parameter decomposition, especially in relation to similar-looking approaches (e.g. cross-layer transcoders)
- As of very recently, I'm working on a project related to interpreting model diffs.
- I'd also be interested in projects related to reasoning/chain of thought faithfulness.
If you'd be interested in collaborating on research in these domains, reach out! (me at logan graves dot com, lgngrvs on Discord, lll.55 on Signal)
Here are other non-mech interp research topics I'd totally still be interested in:
- Mathematical models of agent coordination
  - Abstract models (e.g. game theory)
  - In the wild models (e.g. mechanism design)
  - Qualitative models (e.g. history)
- Computational neuroscience, particularly computational neuroscience projects that draw from effective methodologies from AI (e.g. mechanistic interpretability)
  - Neuron modeling
  - Modeling neural circuitry (this one especially)
- Intellectual history
  - 20th century intellectual history; how ideology shaped response to crises
  - Longer-term undercurrents in human thought, e.g. comparative classics research
- International relations/International AI policy
  - The ties between technology and great-power competition
  - Nuclear proliferation and other related dynamics
  - Clever treaty enforcement mechanisms, especially technological ones (e.g. methods for monitoring GPUs for treaty enforcement)
- Theology in/and mathematics
- Cyborgism-y things, e.g. what Neuralink is doing, as well as Chalmers and Clark's "Extended mind"