nsorros.com

Interpretability of LLMs

What LLM interpretability buys you and what it doesn't: mechanistic vs behavioural approaches, what shows up in attention and what doesn't, and where the field's tools are still too brittle to use in production. Aimed at engineers shipping under regulatory or auditability pressure.