Simplex is an AI safety research organization founded in 2024 by Paul Riechers (theoretical physicist) and Adam Shai (computational neuroscientist). The organization applies computational mechanics -- a mathematical framework from physics -- to develop a principled understanding of how neural networks internally organize their representations and how that structure relates to computation and behavior. Simplex operates as a program under the Astera Institute, with offices in Emeryville, California, and London, UK; its core thesis is that understanding intelligence is safety. Its foundational result demonstrated that transformers trained on next-token prediction spontaneously organize their activations into geometric structures predicted by Bayesian inference over world models.
Funding Details
- Annual Budget: -
- Monthly Burn Rate: -
- Current Runway: -
- Funding Goal: -
- Funding Raised to Date: -
- Fiscal Sponsor: Epistea, z.s.
Theory of Change
Simplex believes that understanding intelligence is the key to AI safety. Its theory of change holds that by developing a rigorous, principled science of how neural networks internally represent and process information -- drawing on computational mechanics from physics and insights from neuroscience -- researchers can build the theoretical foundations needed to make advanced AI systems interpretable, controllable, and compatible with humanity. By uncovering the geometric structures and computational principles that emerge in trained neural networks, Simplex aims to provide the scientific basis for meaningful AI alignment interventions, safety benchmarks, and an understanding of AI cognition that goes deeper than surface-level behavioral testing.
Grants Received
- From Open Philanthropy
- From Survival and Flourishing Fund
Projects: no linked projects
People: no linked people
Discussion
Key risk: The main risk is that their physics-of-information agenda, validated primarily on toy setups, fails to scale to frontier models or produce actionable, time-sensitive safety interventions, in which case additional funding beyond Open Philanthropy’s support yields low counterfactual impact.
Details
- Last Updated: Apr 2, 2026, 9:50 PM UTC
- Created: Mar 18, 2026, 11:18 PM UTC
Case for funding: By grounding interpretability in computational mechanics and demonstrating that transformer activations form geometries consistent with Bayesian belief states, Simplex is unusually well-positioned to develop a rigorous, general theory of network cognition that could translate into scalable, principled alignment tools rather than ad hoc heuristics.