Experiments

VLM Convergence Experiment

Status: Complete. Both models tested.

Core question: If affect geometry is universal, do systems trained on human affect data (GPT-4o, Claude) independently recognize the same affect signatures in completely uncontaminated substrates?

Method: 48 behavioral vignettes extracted from / protocell data across 6 conditions (normal foraging, pre-drought abundance, drought onset, drought survival, post-drought recovery, late-stage evolution). Vignettes were presented to the VLMs as purely behavioral descriptions: no affect language, no framework terms, and explicitly labeled as artificial systems. Framework predictions were computed independently. Convergence was measured via Representational Similarity Analysis (RSA) between the framework-predicted and VLM-labeled affect spaces.
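The RSA step described in the method can be sketched as follows. This is a minimal illustration, not the study's pipeline: the 48 x 6 array shapes, the synthetic data, and all function names (`standard_rsa`, `dissimilarities`, and so on) are assumptions made for the example. It z-scores each axis, computes pairwise Euclidean dissimilarities among the vignettes in each space, and rank-correlates the two sets of dissimilarities.

```python
import numpy as np

def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    _, inverse, counts = np.unique(values, return_inverse=True, return_counts=True)
    last_rank = np.cumsum(counts)
    mean_rank = last_rank - (counts - 1) / 2.0
    return mean_rank[inverse]

def spearman(a, b):
    """Spearman correlation, computed as the Pearson correlation of the ranks."""
    return float(np.corrcoef(average_ranks(a), average_ranks(b))[0, 1])

def dissimilarities(points):
    """Euclidean distances between all pairs of rows (upper triangle only)."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    rows, cols = np.triu_indices(len(points), k=1)
    return dist[rows, cols]

def standard_rsa(space_a, space_b):
    """Z-score each axis, then rank-correlate the two dissimilarity structures."""
    za = (space_a - space_a.mean(axis=0)) / space_a.std(axis=0)
    zb = (space_b - space_b.mean(axis=0)) / space_b.std(axis=0)
    return spearman(dissimilarities(za), dissimilarities(zb))

# Synthetic stand-in data (hypothetical, NOT the experiment's data):
# 48 vignettes placed in two 6-dimensional affect spaces.
rng = np.random.default_rng(0)
framework_space = rng.normal(size=(48, 6))
vlm_space = framework_space + 0.5 * rng.normal(size=(48, 6))

rho = standard_rsa(framework_space, vlm_space)
print(f"RSA rho = {rho:.2f}")
```

Because only the rank order of the pairwise dissimilarities enters the correlation, the two spaces do not need matching axes or units. They only need to cover the same set of vignettes in the same order.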

Result: STRONG CONVERGENCE. GPT-4o: RSA $\rho = 0.72$ ($p < 0.0001$). Claude Sonnet: $\rho = 0.54$ ($p < 0.0001$). All four pre-registered predictions pass on both models:

P1: VLMs label drought onset as fear/anxiety. PASS (both: desperation, anxiety, urgency, 8/8 unanimous)
P2: VLMs label post-drought recovery as relief/hope. PASS (both: relief, cautious optimism)
P3: VLMs distinguish HIGH vs LOW late-stage. See condition summary.
P4: RSA between framework and VLM affect spaces > 0.3. PASS (0.72 and 0.54)

Robustness check: raw numbers only. The experiment was re-run with purely numerical descriptions (no narrative framing, just measured quantities such as removal_fraction: 0.9800). Convergence increases: GPT-4o $\rho = 0.78$, Claude $\rho = 0.72$. This rules out narrative pattern-matching as the explanation: the VLMs recognize geometric structure from raw numerical patterns, and population dynamics and state update rates are sufficient.
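A numbers-only vignette of this kind could be rendered as sketched below. Only `removal_fraction` and its four-decimal formatting come from the text above; the other field names and all values are hypothetical placeholders, and the function is illustrative rather than the study's prompt builder.

```python
def render_numeric_vignette(measurements):
    """Format measurements as 'name: value' lines with four decimals and no narrative text."""
    return "\n".join(f"{name}: {value:.4f}" for name, value in measurements.items())

# Hypothetical example; apart from removal_fraction, the field names are
# placeholders and none of these values are taken from the experiment.
example = {
    "removal_fraction": 0.98,
    "population_size": 112.0,
    "state_update_rate": 0.0375,
}
print(render_numeric_vignette(example))
```

Keeping the rendering to bare key-value lines means nothing in the prompt carries affective or narrative cues beyond the measured quantities themselves.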

Robustness check: basis-independence. Standard RSA standardizes each affect axis before correlating, which makes it depend on the chosen coordinate basis, so a critic can ask whether the convergence merely reflects six control-theoretic coordinates that any viable controller exhibits. The convergence was re-measured with basis-independent tools (): RSA on rotation-invariant raw-Euclidean dissimilarity matrices (exactly invariant to rotations/reflections of either space, tested against a permutation null) and Gromov-Wasserstein distance. The convergence does not weaken; it strengthens. Rotation-invariant RSA rises to $\rho = 0.892$ for GPT-4o and $\rho = 0.810$ for Claude (both $p < 0.001$, 2000-permutation null), and is unchanged to machine precision ($|\Delta| \sim 10^{-17}$) when the affect space is randomly rotated; GW distance is likewise rotation-invariant. The cross-substrate alignment is a property of the relational structure, not of the coordinate choice. (The same tool applied to the much weaker within-substrate affect-to-behavior alignment of / shows the opposite: its modest standard-RSA $\rho \approx 0.07$ collapses to $\approx 0$ basis-independently. That alignment was a basis artifact, and nothing load-bearing rests on it.)
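The rotation-invariant variant and its permutation null can be sketched as follows. This is an illustration under stated assumptions, not the study's tooling: the data are synthetic, the 48 x 6 shapes are assumed, and every function name is hypothetical. Distances are taken on the raw (unstandardized) coordinates, the null is built by shuffling the vignette order of one dissimilarity matrix, and a random orthogonal matrix is applied to one space to check that the statistic does not move. The Gromov-Wasserstein comparison is not covered here.

```python
import numpy as np

def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    _, inverse, counts = np.unique(values, return_inverse=True, return_counts=True)
    return (np.cumsum(counts) - (counts - 1) / 2.0)[inverse]

def distance_matrix(points):
    """Raw Euclidean distances between rows, with no per-axis standardization."""
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def rdm_correlation(dist_a, dist_b):
    """Spearman correlation between the upper triangles of two distance matrices."""
    rows, cols = np.triu_indices(len(dist_a), k=1)
    ranks_a = average_ranks(dist_a[rows, cols])
    ranks_b = average_ranks(dist_b[rows, cols])
    return float(np.corrcoef(ranks_a, ranks_b)[0, 1])

def permutation_p_value(dist_a, dist_b, n_permutations, rng):
    """One-sided p-value: shuffle item order in dist_b and count null rho >= observed."""
    observed = rdm_correlation(dist_a, dist_b)
    hits = 0
    for _ in range(n_permutations):
        order = rng.permutation(len(dist_b))
        if rdm_correlation(dist_a, dist_b[np.ix_(order, order)]) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_permutations + 1)

def random_orthogonal(dim, rng):
    """Random orthogonal matrix (rotation or reflection) from a QR decomposition."""
    q, r = np.linalg.qr(rng.normal(size=(dim, dim)))
    return q * np.sign(np.diag(r))

# Synthetic stand-in data (hypothetical, NOT the experiment's data).
rng = np.random.default_rng(0)
framework_space = rng.normal(size=(48, 6))
vlm_space = framework_space + 0.5 * rng.normal(size=(48, 6))

d_framework = distance_matrix(framework_space)
rho, p_value = permutation_p_value(d_framework, distance_matrix(vlm_space), 2000, rng)

# Rotate one space and recompute: pairwise distances, and hence rho, should not change.
rotated_vlm = vlm_space @ random_orthogonal(6, rng)
rho_rotated = rdm_correlation(d_framework, distance_matrix(rotated_vlm))
print(f"rho = {rho:.3f}, p = {p_value:.4f}, |delta| after rotation = {abs(rho - rho_rotated):.1e}")
```

An orthogonal transform preserves every pairwise Euclidean distance, so the raw dissimilarity matrix, and any statistic computed from it, agrees before and after rotation up to floating-point rounding. Z-scoring each axis first breaks this property, because the per-axis variances change when the axes are rotated.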

Theoretical significance: Two VLMs, trained independently on human data, with no exposure to the framework, produce affect labels that match framework geometric predictions for a system that has never encountered human affect concepts. The convergence happens because both are tapping the same underlying structure: affect geometry arises from the physics of viable self-maintenance, and human language about emotions encodes the same geometry the protocells produce.

Source code

Study record — canonical metadata, result path, status, seeds, and key finding.

— Full pipeline: vignette extraction, VLM prompting, RSA analysis
— Pre-registered experiment design