From Logit Lens to Tuned Lens: Reading the Intermediate Thoughts of Transformers

What happens inside an LLM between input and output? Logit Lens and Tuned Lens let us observe how Transformers build predictions layer by layer.

You type "The capital of France is" into an LLM and get back "Paris." But *where* inside the model did that answer actually form?

TL;DR

  • Logit Lens projects intermediate hidden states to vocabulary space using the model's final unembedding matrix
  • This reveals how Transformers build predictions incrementally, layer by layer (a minimal code sketch follows below)
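To make the idea concrete, here is a minimal Logit Lens sketch, assuming a GPT-2-style model loaded through Hugging Face transformers; the choice of GPT-2, the prompt, and all variable names are illustrative, not taken from the post. It simply takes each intermediate hidden state and decodes it with the final layer norm and unembedding matrix, as if the model stopped there.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # illustrative choice; any GPT-2-style causal LM works similarly
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

prompt = "The capital of France is"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    # output_hidden_states=True returns the embedding output plus one hidden
    # state per layer: a tuple of (num_layers + 1) tensors.
    outputs = model(**inputs, output_hidden_states=True)

unembed = model.lm_head          # final unembedding matrix
ln_f = model.transformer.ln_f    # final layer norm, applied before unembedding

for layer_idx, hidden in enumerate(outputs.hidden_states):
    # Project the last token's hidden state straight to vocabulary space,
    # skipping all remaining layers -- this is the Logit Lens.
    last_token_state = hidden[0, -1]
    logits = unembed(ln_f(last_token_state))
    top_token = tokenizer.decode(logits.argmax().item())
    print(f"layer {layer_idx:2d}: {top_token!r}")
```

Running something like this typically shows a generic or unrelated top token in early layers that sharpens toward the final answer in later ones, which is the layer-by-layer build-up the TL;DR describes.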