We dissect transformer models with code, running causal experiments: tracing activations, patching signals between layers, and deliberately breaking model behaviors to understand what changed and why. A hands-on tour of mechanistic interpretability in practice.
This talk treats transformer language models as experimental creatures. We open them up in code, expose attention heads, residual streams, and neuron activations, and ask precise questions by intervening directly in their internal machinery.
Across a series of hands-on “Frankenstein experiments,” we trace activations through layers, patch signals between model states, and deliberately rewire internal components to observe how behavior mutates. These interventions reveal concrete circuits behind capabilities such as copying, induction, and factual recall, and let us test each hypothesized circuit causally rather than infer it from correlation alone.
All experiments are implemented in Python using PyTorch, TransformerLens, and nnterp. The focus is on post-hoc investigation: reading internal representations, performing controlled manipulations, and measuring how specific internal changes reshape language generation.
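To make the idea of patching concrete, here is a minimal sketch in plain PyTorch. It is an illustration under stated assumptions, not code from the talk: the model is a hypothetical toy stack of linear layers standing in for a transformer, and the names `save_hook`, `patch_hook`, and `target_layer` are invented for this example. It records one layer's activation on a "clean" input, then replays that activation during a run on a "corrupted" input and checks what happens downstream.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical toy model standing in for a transformer (not a real language model).
model = nn.Sequential(
    nn.Linear(4, 8), nn.ReLU(),
    nn.Linear(8, 8), nn.ReLU(),
    nn.Linear(8, 2),
)
model.eval()

clean_input = torch.randn(1, 4)
corrupted_input = torch.randn(1, 4)

target_layer = model[2]  # the layer whose output we trace and patch
cache = {}


def save_hook(module, inputs, output):
    # Trace: store the clean activation. Returning None leaves the output unchanged.
    cache["clean"] = output.detach().clone()


def patch_hook(module, inputs, output):
    # Patch: a value returned from a forward hook replaces the module's output.
    return cache["clean"]


with torch.no_grad():
    # 1. Clean run, caching the activation at the target layer.
    handle = target_layer.register_forward_hook(save_hook)
    clean_out = model(clean_input)
    handle.remove()

    # 2. Corrupted run with no intervention, as a baseline.
    corrupted_out = model(corrupted_input)

    # 3. Corrupted run with the clean activation patched in.
    handle = target_layer.register_forward_hook(patch_hook)
    patched_out = model(corrupted_input)
    handle.remove()

print("clean:    ", clean_out)
print("corrupted:", corrupted_out)
print("patched:  ", patched_out)
```

Because this toy model is a simple chain and the entire layer output is replaced, the patched run reproduces the clean output exactly. In a real transformer one would typically patch a single attention head or token position, so the output shifts only partway toward the clean behavior, and the size of that shift is the causal evidence. TransformerLens exposes the same pattern through `run_with_cache` and `run_with_hooks` on its `HookedTransformer` class.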
The session presents mechanistic interpretability as an experimental science: models are dissected, modified, and reassembled to see what survives. Attendees will leave with a working mental model of transformer internals, practical tools for running their own interpretability experiments, and a sharper intuition for how reasoning emerges from neural machinery.
Giuseppe Birardi is CTO at Agreenlab and Ormalab, working with Python, AI, and applied research. His background in cultural anthropology informs a practical approach to language models, interpretability, and system design.
He recently took part in ARENA 7.0, focusing on mechanistic interpretability and causal experiments on transformer models, alongside ongoing research on automated circuit analysis and probing methods. He has published in international journals and remains committed to bridging research and real-world applications. He co-organizes PyBari and experiments with wild fermentation.