Armen Aghajanyan
Proof: Attention Values Correlate After Layer 1
And what it means for the L1 vs L2 normalization debate.
Mathematics
AI
Jan 6, 2026