Media Summary:
When Anthropic tested Claude Sonnet 4.5 for alignment, the model appeared perfectly behaved, but it turned out the model had ...
Warning: This is an ad-libbed talk, and I'm sure I got some facts wrong. This is a talk I gave to my MATS 9.0 training program on ...
We don't know how AIs think or why they do what they do. Or at least, we don't know much. That fact is only becoming more ...

Neel Nanda On Avoiding An - Detailed Analysis & Overview


Neel Nanda on Avoiding an AI Catastrophe with Mechanistic Interpretability

Neel Nanda - Our Pivot To Pragmatic Interpretability [Alignment Workshop]

When Anthropic tested Claude Sonnet 4.5 for alignment, the model appeared perfectly behaved — but it turned out the model had ...

What Happened With Sparse Autoencoders?

Warning: This is an ad-libbed talk, and I'm sure I got some facts wrong. This is a talk I gave to my MATS 9.0 training program on ...

We Can Monitor AI’s Thoughts… For Now | Google DeepMind's Neel Nanda

We don't know how AIs think or why they do what they do. Or at least, we don't know much. That fact is only becoming more ...

Stand Up Comedy from Neel Nanda

I lead a Google DeepMind team at 26. If you want to work at an AI company... | Neel Nanda (Part 2)

PART 1* — a comprehensive update on mechanistic interpretability: https://www.youtube.com/watch?v=5FdO1MEumbI At 26, ...

Can LLMs Introspect? A Live Paper Review

Paper: https://transformer-circuits.pub/2025/introspection/index.html Tweet thread: ...

Don’t use the phrase "No Homo" | Neel Nanda | Stand-Up Comedy

The Story of Mech Interp

This is a talk I gave to my MATS scholars, with a stylised history of the field of mechanistic interpretability, as I see it (with a focus ...

NEURAL NETWORKS ARE WEIRD! - Neel Nanda (DeepMind)

SPONSOR MESSAGES: *** CentML offers competitive pricing for GenAI model deployment, with flexible options to suit a wide ...

19 - Mechanistic Interpretability with Neel Nanda

How good are we at understanding the internal computation of advanced machine learning models, and do we have a hope at ...

What Matters Right Now In Mechanistic Interpretability?

This is a talk I gave to my MATS 9.0 training scholars about the big picture of mech interp - as of Oct 2025, what had changed?

DeepMind's Neel Nanda says shutdown resistance ≠ rebellion. Often it's confusion.
