Simplex is an AI safety research organization founded in 2024 by Paul Riechers (theoretical physicist) and Adam Shai (computational neuroscientist). The organization applies computational mechanics -- a mathematical framework from physics -- to develop a principled understanding of how neural networks internally organize their representations and how that structure relates to computation and behavior. Simplex operates as a program under the Astera Institute, with offices in Emeryville, California, and London, UK; its core thesis is that understanding intelligence is safety. Its foundational result demonstrated that transformers trained on next-token prediction spontaneously organize their activations into geometric structures predicted by Bayesian inference over world models.
Funding Details
- Annual Budget: -
- Monthly Burn Rate: -
- Current Runway: -
- Funding Goal: -
- Funding Raised to Date: -
- Fiscal Sponsor: Epistea, z.s.
Theory of Change
Simplex believes that understanding intelligence is the key to AI safety. Its theory of change holds that by developing a rigorous, principled science of how neural networks internally represent and process information -- drawing on computational mechanics from physics and insights from neuroscience -- researchers can build the theoretical foundations needed to make advanced AI systems interpretable, controllable, and compatible with humanity. By uncovering the geometric structures and computational principles that emerge in trained neural networks, Simplex aims to provide the scientific basis for meaningful AI alignment interventions, safety benchmarks, and an understanding of AI cognition that goes deeper than surface-level behavioral testing.
Grants Received
- From Open Philanthropy
- From Survival and Flourishing Fund
Projects: no linked projects
People: no linked people
Discussion
Key risk: The main risk is that their physics-of-information agenda, validated primarily on toy setups, fails to scale to frontier models or produce actionable, time-sensitive safety interventions, in which case additional funding beyond Open Philanthropy’s support yields low counterfactual impact.
Details
- Last Updated: Apr 2, 2026, 9:50 PM UTC
- Created: Mar 18, 2026, 11:18 PM UTC
Case for funding: By grounding interpretability in computational mechanics and demonstrating that transformer activations form geometries consistent with Bayesian belief states, Simplex is unusually well-positioned to develop a rigorous, general theory of network cognition that could translate into scalable, principled alignment tools rather than ad hoc heuristics.