Singular learning theory
Sumio Watanabe’s Singular Learning Theory Overview
Singular Learning Theory - LessWrong
Singular learning theory (SLT), developed by Sumio Watanabe, applies algebraic geometry to statistical learning theory. The reference textbooks are “the grey book”, Algebraic Geometry and Statistical Learning Theory, and “the green book”, Mathematical Theory of Bayesian Statistics.
It’s a rigorous mathematical theory of statistical learning, and it can help us understand phase transitions, which seem to be important for mechanistic interpretability. For example, the formation of a circuit during training can be seen as a phase transition.
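To make the link to phase transitions a bit more concrete, here is a sketch of the central asymptotic result from Watanabe's books, in notation chosen for this note (the symbols are assumptions of the sketch, not taken from the text above): F_n is the Bayes free energy for n samples, L_n the average negative log-likelihood, w_0 an optimal parameter, \varphi the prior, \lambda the learning coefficient (real log canonical threshold) and m its multiplicity.

```latex
F_n \;=\; -\log \int \varphi(w) \prod_{i=1}^{n} p(x_i \mid w)\, dw
    \;=\; n L_n(w_0) \;+\; \lambda \log n \;-\; (m-1)\log\log n \;+\; O_p(1)
```

For a regular model with d parameters, \lambda = d/2 and m = 1, which recovers the familiar BIC penalty; singular models such as neural networks can have \lambda < d/2. Roughly, a phase transition in this picture is a change in which region of parameter space gives the lowest free energy as n grows, trading accuracy (the n L_n term) against complexity (the \lambda \log n term).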
Prerequisites for understanding this: basic statistical concepts, e.g. Gaussian distributions, dimension, and marginal distributions.
Notes mentioning this note
Developmental interpretability
Developmental Interpretability: A Novel AI Alignment Research Agenda Developmental interpretability is a novel AI alignment research agenda studying how structure forms in...
Mechanistic interpretability
Mechanistic interpretability emphasizes features and circuits as the fundamental units of analysis and usually aims at understanding a fully trained...