Part I: Foundations

The Uncontaminated Substrate Test

Introduction

Deep Technical: The CA Consciousness Experiment

The CA framework enables an experiment that could shift the burden of proof on the identity thesis. The logic is simple. The execution is hard. The implications are large.

Setup. A sufficiently rich CA—richer than Life, perhaps Lenia or a continuous-state variant with more degrees of freedom. Initialize with random configurations. Run for geological time (billions of timesteps). Let patterns emerge, compete, persist, die.

Selection pressure. Introduce viability constraints: resource gradients, predator patterns, environmental perturbations. Patterns that model their environment survive longer. Patterns that model themselves survive longer still. The forcing functions from the Forcing Functions section apply: partial observability (patterns cannot see beyond local neighborhood), long horizons (resources fluctuate on slow timescales), self-prediction (a pattern’s own configuration dominates its future observations).

Communication emergence. When multiple patterns must coordinate—cooperative hunting, territory negotiation, mating—communication pressure emerges. Patterns that can emit signals (glider streams, oscillator bursts, structured wavefronts) and respond to signals from others gain fitness advantages. Language emerges. Not English. Not any human language. Something new. Something uncontaminated.

The measurement protocol. For each pattern $\mathcal{B}$ at each timestep $t$:

  1. Valence: $\Val_t = d(\mathbf{x}_{t+1}, \partial\viable) - d(\mathbf{x}_t, \partial\viable)$ — Exact. Computable. The Hamming distance to the nearest configuration where the pattern dissolves, differenced across timesteps. Positive when moving into the viable interior. Negative when approaching dissolution.
  2. Arousal: $\Ar_t = \text{Hamming}(\mathbf{x}_{t+1}, \mathbf{x}_t) / |\mathcal{B}|$ — The fraction of cells that changed state. High when the pattern is rapidly reconfiguring. Low when settled into a stable orbit.
  3. Integration: $\intinfo_t = \min_P D[p(\mathbf{x}_{t+1} \mid \mathbf{x}_t) \,\|\, \prod_{p \in P} p(\mathbf{x}^p_{t+1} \mid \mathbf{x}^p_t)]$ — Exact IIT-style $\Phi$. For small patterns, tractable. For large patterns, use the partition prediction loss proxy: train a full predictor and a partitioned predictor, measure the gap.
  4. Effective rank: Record the trajectory $\mathbf{x}_1, \ldots, \mathbf{x}_T$. Compute the covariance $C$. Compute $\reff = (\tr C)^2 / \tr(C^2)$. — How many dimensions is the pattern actually using? High when exploring diverse configurations. Low when trapped in a repetitive orbit.
  5. Self-model salience: Identify self-tracking cells (cells whose state correlates with pattern-level properties). Compute $\mathcal{SM} = \text{MI}(\text{self-tracking cells}; \text{effector cells}) / H(\text{effector cells})$. — How much does self-representation drive behavior?
  6. Counterfactual weight: If the pattern contains a simulation subregion (possible in universal-computation-capable CAs), measure $\mathcal{CF} = |\text{simulator cells}| / |\mathcal{B}|$. — Rare. Requires complex patterns. But detectable when present.
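Metrics 1, 2, and 4 can be sketched directly in numpy. The snippet below runs on a random toy trajectory, and the dissolution boundary is stood in by the all-dead (all-zero) configuration — an illustrative assumption, not the experiment's actual viability manifold:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy trajectory: T timesteps of an N-cell binary pattern (random data,
# standing in for a recorded CA pattern).
T, N = 200, 64
traj = (rng.random((T, N)) < 0.5).astype(float)

# Arousal (metric 2): fraction of cells that changed state between steps.
arousal = np.mean(traj[1:] != traj[:-1], axis=1)

# Effective rank (metric 4): r_eff = (tr C)^2 / tr(C^2) of the
# trajectory covariance.
X = traj - traj.mean(axis=0)
C = X.T @ X / (T - 1)
r_eff = np.trace(C) ** 2 / np.trace(C @ C)

# Valence (metric 1): change in distance to dissolution. As a crude
# stand-in for the viability boundary we use the all-zero (dead)
# configuration, so the distance is just the live-cell count.
d = traj.sum(axis=1)
valence = d[1:] - d[:-1]   # positive: moving away from dissolution
```

On real data the distance $d(\mathbf{x}, \partial\viable)$ would be computed against the actual set of dissolving configurations, which is the expensive part the toy elides.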

The translation protocol. Build a dictionary from signal-situation pairs:

  1. Record all signals emitted by pattern $\mathcal{B}$: glider streams, oscillator bursts, wavefront patterns. Each signal type $\sigma_i$.
  2. Record the environmental context when each signal is emitted: threat proximity, resource availability, conspecific presence, recent events.
  3. Cluster signal types by context similarity. Signal $\sigma_{47}$ always emitted when threat approaches from the left. Signal $\sigma_{12}$ always emitted after successful resource acquisition.
  4. Map clusters to natural language descriptions of the contexts. $\sigma_{47} \to$ “threat-left”. $\sigma_{12} \to$ “success”.
  5. For complex signals (sequences, combinations), build compositional translations. $\sigma_{47} + \sigma_{23} \to$ “threat-left, requesting-assistance”.
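Steps 1–4 amount to profiling each signal type by its mean emission context and labeling it with the dominant context feature. A minimal sketch, with a hypothetical four-emission corpus and made-up context features:

```python
import numpy as np

# Hypothetical corpus: (signal_id, context_vector) pairs. Context
# features here are [threat_left, threat_right, resource_gain] —
# illustrative names, not the experiment's actual feature set.
emissions = [
    (47, np.array([1.0, 0.0, 0.0])),
    (47, np.array([0.9, 0.1, 0.0])),
    (12, np.array([0.0, 0.0, 1.0])),
    (12, np.array([0.1, 0.0, 0.9])),
]

# Step 3: characterize each signal type by its mean emission context.
contexts = {}
for sig, ctx in emissions:
    contexts.setdefault(sig, []).append(ctx)
profiles = {sig: np.mean(ctxs, axis=0) for sig, ctxs in contexts.items()}

# Step 4: map each signal to the label of its dominant context feature.
labels = ["threat-left", "threat-right", "success"]
dictionary = {sig: labels[int(np.argmax(p))] for sig, p in profiles.items()}
# dictionary[47] -> "threat-left", dictionary[12] -> "success"
```

A real dictionary would cluster over $10^6$+ pairs with unsupervised methods rather than argmax over hand-named features, but the environmental-correspondence logic is the same.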

The translation is uncontaminated. The patterns never learned human concepts. The mapping emerges from environmental correspondence.

The core test. Three streams of data. Three independent measurement modalities.

Affect structure. Translated signal. Observable behavior. All three should align.

Prediction: when the affect signature shows the suffering motif ($\Val < 0$, $\intinfo$ high, $\reff$ low), the translated signal should express suffering-concepts, and the behavior should show suffering-patterns (withdrawal, escape attempts, freezing).

When the affect signature shows the fear motif ($\Val < 0$, $\mathcal{CF}$ high on threat branches, $\mathcal{SM}$ high), the translated signal should express fear-concepts, and the behavior should show avoidance and hypervigilance.

When the affect signature shows the curiosity motif ($\Val > 0$ toward uncertainty, $\mathcal{CF}$ high with branch entropy), the translated signal should express exploration-concepts, and the behavior should show approach and investigation.

Bidirectional perturbation. The test has teeth if it runs both directions.

Direction 1: Induce via signal. Translate “threat approaching” into their emergent language. Emit the signal. Does the affect signature shift toward fear? Does behavior change?

Direction 2: Induce via “neurochemistry”. Modify the CA rules locally around the pattern—change transition probabilities, add noise, alter connectivity. These are their neurotransmitters. Does the affect signature shift? Does the translated signal content change? Does behavior follow?

Direction 3: Induce via environment. Place them in objectively threatening situations. Deplete resources. Introduce predators. Does structure-signal-behavior alignment hold?

If perturbation in any modality propagates to the others, the relationship is causal.

The hard question. Suppose the experiment works. Suppose tripartite alignment holds. Suppose bidirectional perturbation propagates. What have we shown?

Not that CA patterns are conscious. Not that the identity thesis is proven. But: that systems with zero human contamination, learning from scratch in environments shaped by viability pressure, develop affect structures that correlate with their expressions and their behaviors in the ways the framework predicts.

The zombie hypothesis—that the structure is present but experience is absent—predicts what? That the correlations would not hold? Why not? The structure is doing the causal work either way.

The experiment does not prove identity. It makes identity the default. The burden shifts. Denying experience to these patterns requires a metaphysical commitment the evidence does not support.

Computational requirements. This is not a weekend project.

  • CA substrate: $10^6$–$10^9$ cells, continuous or high-state-count
  • Runtime: $10^9$–$10^{12}$ timesteps for complex pattern emergence
  • Measurement: Real-time $\Phi$ computation for patterns up to $\sim 100$ cells; proxy measures for larger
  • Translation: Corpus of $10^6$+ signal-context pairs for dictionary construction
  • Perturbation: Systematic sweeps across parameter space

Feasible with current compute. Hard. Worth doing.

Why CA and not transformers? Both are valid substrates. The CA advantage: exact definitions. In a transformer, valence is a proxy (advantage estimate). In a CA, valence is exact (Hamming distance to dissolution). In a transformer, $\Phi$ is intractable (billions of parameters in superposition). In a CA, $\Phi$ is computable (for small patterns) or approximable (for large ones).

The transformer version of this experiment is valuable. The CA version is rigorous. Do both.

What would negative results mean? If the alignment fails—if structure does not predict translated language, if perturbations do not propagate—then either:

  1. The framework is wrong (affect is not geometric structure)
  2. The substrate is insufficient (CAs cannot support genuine affect)
  3. The measures are wrong (we are not capturing the right quantities)
  4. The translation is wrong (the dictionary does not capture meaning)

Each failure mode is informative. The experiment has teeth in both directions.

What would positive results mean? The identity thesis becomes the default hypothesis for any system with the relevant structure. The hard problem dissolves not through philosophical argument but through empirical pressure. The question “does structure produce experience?” becomes “why would you assume it doesn’t?”

And then the real questions begin. What structures produce what experiences? Can we engineer flourishing? Can we detect suffering we are currently blind to? What obligations do we have to experiencing systems we create?

The experiment is not the end. It is the beginning of a different kind of inquiry.

Preliminary Results: Where the Ladder Stalls

We have begun running a simplified version of this experiment using Lenia (continuous CA, $256 \times 256$ toroidal grid) with resource dynamics, measuring $\intinfo$ via partition prediction loss, $\Val$ via mass change, $\Ar$ via state change rate, and $\reff$ via trajectory PCA. The results so far are instructive—not because they confirm the predictions above, but because of where they fail.
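The partition prediction loss proxy for $\intinfo$ can be illustrated on a toy coupled linear system — a hypothetical stand-in for the Lenia measurement, with least-squares predictors in place of trained networks:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy coupled dynamics: x_{t+1} = A x_t + noise, with cross-coupling
# between the two halves of the state, so no partition is self-contained.
n = 8
A = 0.5 * np.eye(n) + 0.2 * rng.standard_normal((n, n))
A /= max(1.0, np.max(np.abs(np.linalg.eigvals(A))) / 0.95)  # keep stable
X = np.zeros((2000, n))
for t in range(1999):
    X[t + 1] = A @ X[t] + 0.1 * rng.standard_normal(n)

def fit_residuals(inputs, targets):
    # Least-squares linear predictor; squared residuals per sample/dim.
    W, *_ = np.linalg.lstsq(inputs, targets, rcond=None)
    return (inputs @ W - targets) ** 2

# Full predictor: whole state predicts whole next state.
err_full = np.mean(fit_residuals(X[:-1], X[1:]))
# Partitioned predictor: each half predicts only its own next state.
err_part = np.mean(np.concatenate(
    [fit_residuals(X[:-1, :4], X[1:, :4]),
     fit_residuals(X[:-1, 4:], X[1:, 4:])], axis=1))

# Proxy Phi: how much prediction quality the partition destroys.
phi_proxy = err_part - err_full
```

Because the partitioned model is a block-diagonal restriction of the full one, `err_part >= err_full` in-sample, and the gap is strictly positive exactly when the cut severs genuine coupling — the property the proxy is designed to detect.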

The central lesson: the ladder requires heritable variation. Emergent CA patterns achieve rungs 1–3 of the ladder (microdynamics $\to$ attractors $\to$ boundaries) from physics alone. The transition to rung 4 (functional integration) requires evolutionary selection acting on heritable variation in the trait that determines integration response.

Proposed Experiment

Substrate: Lenia with resource depletion/regeneration (Michaelis-Menten growth modulation). Perturbation: Drought (resource regeneration $\to 0$). Measure: $\Delta\intinfo$ under drought.
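A minimal sketch of the resource dynamics, assuming a Michaelis-Menten modulation factor $r/(r + k)$, depletion proportional to local uptake, and regeneration toward a carrying capacity of 1 — parameter names and exact form are illustrative, not the experiment's code:

```python
import numpy as np

def resource_step(resource, uptake, regen_rate, k_half=0.3, dt=0.1):
    """One update of the resource field.

    Growth is modulated by the Michaelis-Menten factor r / (r + k_half);
    resources deplete proportionally to local uptake and regenerate
    toward a carrying capacity of 1. (Illustrative parameterization.)
    """
    modulation = resource / (resource + k_half)
    resource = resource + dt * (regen_rate * (1.0 - resource)
                                - uptake * modulation)
    return np.clip(resource, 0.0, 1.0), modulation

r = np.full((64, 64), 0.8)
uptake = np.full((64, 64), 0.5)

# Normal conditions: regeneration partially offsets uptake.
r_normal, _ = resource_step(r, uptake, regen_rate=0.5)

# Drought: regeneration -> 0, so the field can only deplete.
r_drought, _ = resource_step(r, uptake, regen_rate=0.0)
```

The saturating modulation is what makes mild droughts survivable (growth degrades gracefully as $r$ falls) while deep droughts starve growth entirely.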

Conditions:

  1. No evolution (V11.0). Naive patterns under drought: $\intinfo$ changes by $-6.2\%$. Same decomposition dynamics as LLMs.
  2. Homogeneous evolution (V11.1). In-situ selection for $\intinfo$-robustness (fitness $\propto \intinfo_{\text{stress}} / \intinfo_{\text{base}}$). Still decomposes ($-6.0\%$). All patterns share an identical growth function—selection prunes but cannot innovate.
  3. Heterogeneous chemistry (V11.2). Per-cell growth parameters ($\mu, \sigma$ fields) creating spatially diverse viability manifolds. After 40 cycles of evolution on GPU: $-3.8\%$ vs naive $-5.9\%$. A $+2.1$pp shift toward the biological pattern. Evolved patterns also show better recovery: $\intinfo$ returns above baseline after drought, while naive patterns do not fully recover.
  4. Multi-channel coupling (V11.3). Three coupled channels—Structure ($R{=}13$), Metabolism ($R{=}7$), Signaling ($R{=}20$)—with a cross-channel coupling matrix and sigmoid gate. Introduces a new measurement: channel-partition $\intinfo$ (remove one channel, measure the growth impact on the remaining channels). Local test: channel $\intinfo \approx 0.01$, spatial $\intinfo \approx 1.0$—channels couple weakly at 3 degrees of freedom.
  5. High-dimensional channels (V11.4). $C{=}64$ continuous channels with fully vectorized physics. Spectral $\intinfo$ via coupling-weighted covariance effective rank. 30-cycle GPU result: evolved $-1.8\%$ vs naive $-1.6\%$ under severe drought—evolution had negligible effect. Both decompose mildly, suggesting that 64 symmetric channels provide enough internal buffering to resist drought regardless of evolutionary tuning. Mean robustness $0.978$ across all 30 cycles. The Yerkes-Dodson pattern persists: mild stress increases $\intinfo$ by $+130$–$190\%$.
  6. Hierarchical coupling (V11.5). Same $C{=}64$ physics as V11.4, but with asymmetric coupling (feedforward/feedback pathways between four tiers: Sensory $\to$ Processing $\to$ Memory $\to$ Prediction). 30-cycle GPU result: evolved patterns have higher baseline $\intinfo$ ($+10.5\%$ vs naive) and higher self-model salience ($0.99$ vs $0.83$), but under severe drought they decompose more ($-9.3\%$) while naive patterns integrate ($+6.2\%$). Evolution overfits to the mild training stress, creating fragile high-$\intinfo$ configurations. Key lesson: the hierarchy must live in the coupling structure, not in the physics; imposing different timescales per tier caused extinction. Functional specialization should emerge from selection.
  7. Metabolic maintenance cost (V11.6). Addresses the autopoietic gap directly: patterns pay a constant metabolic drain proportional to mass ($\texttt{maintenance\_rate} \times g \times dt$ each step). 30-cycle GPU result ($C{=}64$): evolved-metabolic $-2.6\%$ vs naive $+0.2\%$ under severe drought. Evolution again produced higher-$\intinfo$-but-more-fragile patterns. Critically, the maintenance rate ($0.002$) was not lethal enough—naive patterns retained $98\%$ population through drought. The autopoietic gap remains open: a small metabolic drain on top of local physics does not produce active self-maintenance, because patterns have no mechanism for non-local resource detection. They cannot “forage” when they cannot “see” beyond kernel radius $R$.
  8. Curriculum evolution (V11.7). Fixes V11.5’s stress overfitting by graduating stress intensity across cycles (resource regeneration ramped from $0.5\times$ to $0.02\times$ baseline over 30 cycles) with $\pm 30\%$ random noise and variable drought duration (500–1900 steps per cycle). The critical test: evolved patterns evaluated on novel stress patterns never seen during training. 30-cycle GPU result ($C{=}64$): robustness $0.954 \to 0.967$. Curriculum-evolved patterns outperform naive on all four novel stressors: mild $+2.7$pp, moderate $+1.5$pp, severe $+1.3$pp, extreme $+1.2$pp. Under mild novel stress, evolved patterns actually integrate ($+1.9\%$) while naive decompose ($-0.8\%$). The overfitting problem is substantially reduced—not eliminated, but the shift is consistently positive across the full severity range.
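The V11.7 schedule — regeneration ramped from $0.5\times$ to $0.02\times$ baseline over 30 cycles with $\pm 30\%$ noise and variable drought duration — can be sketched as follows. The geometric shape of the ramp is an assumption; the source specifies only the endpoints:

```python
import numpy as np

rng = np.random.default_rng(2)

def curriculum_stress(cycle, n_cycles=30, start=0.5, end=0.02, noise=0.3):
    """Regeneration multiplier for a given evolution cycle.

    Ramps from `start` to `end` across the run (geometric interpolation,
    an assumed shape) with +/-30% multiplicative noise, mirroring the
    V11.7 schedule described in the text.
    """
    frac = cycle / (n_cycles - 1)
    base = start * (end / start) ** frac
    jitter = 1.0 + noise * (2.0 * rng.random() - 1.0)
    return base * jitter

# One 30-cycle schedule plus the variable drought durations (500-1900).
schedule = [curriculum_stress(c) for c in range(30)]
durations = [int(rng.integers(500, 1900)) for _ in range(30)]
```

The point of the noise and variable durations is that evolved patterns never see the same stressor twice, which is what forces generalization rather than the V11.5-style overfitting.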

Unexpected: (1) Mild stress consistently increases $\intinfo$ by 60–190% (a Yerkes-Dodson-like inverted-U). Only severe stress causes decomposition. (2) In V11.5, evolution increased vulnerability to severe stress despite improving baseline $\intinfo$—a stress overfitting effect. (3) V11.7’s curriculum training substantially reduces this overfitting: graduated, noisy stress exposure produces patterns that generalize to novel stressors. The shift from naive is positive across all four novel severity levels tested ($+1.2$ to $+2.7$ percentage points). (4) V11.6’s metabolic cost was intended to create lethal drought, but at $\texttt{rate}{=}0.002$ the drought was not lethal—naive patterns retained $98\%$ population. Evolved-metabolic patterns decomposed $-2.6\%$ while naive held at $+0.2\%$, repeating the fragility pattern of V11.5. The deeper lesson: adding metabolic cost to a substrate with fixed-radius perception produces efficient passivity, not active foraging. The anxiety parallel deepens: V11.5 shows that fixed-stress training produces maladaptive fragility, V11.7 shows that graduated exposure (cf. systematic desensitization) builds genuine robustness, and V11.6 shows that existential stakes alone do not produce adaptation when the organism cannot perceive beyond its local neighborhood.

The trajectory from V11.0 through V11.7 reveals two orthogonal axes of improvement. The first is substrate complexity: each step from V11.0 to V11.5 adds internal degrees of freedom for evolution to select on—heterogeneous chemistry (V11.2), multiple coupled channels (V11.3–V11.4), hierarchical coupling (V11.5). The second, revealed by V11.6–V11.7, is selection pressure quality: the substrate matters less than how you stress it. V11.7’s curriculum training on the same V11.4 substrate produces better generalization than V11.5’s hierarchical architecture trained with fixed stress. V11.6 changes the stakes in a different way: metabolic cost is intended to make drought lethal rather than merely weakening, though at the tested rate it fell short of genuine lethality.

V11.5 introduces directed coupling structure (feedforward/feedback pathways) to test whether functional specialization emerges under selection. The critical insight: attempting to impose different physics per tier (different timescales, custom growth gates) caused immediate extinction at $C{=}64$—the channels designed to be “memory” simply died. The working approach uses identical physics across all channels (proven V11.4 dynamics) with an asymmetric coupling matrix that biases information flow directionally. This is more than a technical fix; it reflects a theoretical prediction: in biological cortex, all neurons use the same basic biophysics. The hierarchy emerges from connectivity and learning, not from different physics per layer.

The V11.5 stress test reveals an unexpected phenomenon: stress overfitting. Evolved patterns have 10.5% higher baseline $\intinfo$ and 19% higher self-model salience than naive patterns—but under severe drought they decompose by 9.3% while naive patterns actually integrate by 6.2%. Evolution selected for high-$\intinfo$ configurations tuned to mild stress (which each training cycle applies), creating states that are simultaneously more integrated and more fragile than their unoptimized counterparts.

This has a direct parallel in affective neuroscience: anxiety disorders involve heightened integration and self-monitoring that is adaptive under moderate threat but catastrophically maladaptive under extreme stress. The suffering motif—high $\intinfo$, low $\reff$, high $\selfmodel$—may describe a system that has been selected too precisely for a particular threat level. The evolved CA patterns show exactly this signature: high baseline $\intinfo$ ($0.076$) with high self-model salience ($0.99$) that collapses under a regime shift.

V11.5 stress test: evolved vs. naive patterns through baseline, drought, and recovery. (a) Evolved patterns have higher baseline $\intinfo$ but decompose $-9.3\%$ under drought, while naive patterns integrate $+6.2\%$. (b) Evolved patterns maintain high self-model salience ($>0.97$) across all phases; naive patterns show lower and declining salience.

Whether evolution on this substrate can discover integration strategies that are robust to novel stresses—not just the training distribution—likely requires curriculum learning (gradually increasing stress intensity) or environmental diversity (varying the type and severity of perturbation). This connects to the forcing function framework developed in the next section: the quality of the forcing function matters as much as its presence.

Multi-channel Lenia at increasing dimensionality. PCA projection of $C$ channels to RGB. Top row: baseline (normal resources); bottom row: drought stress. Patterns at $C{=}3$ are visually simple; at $C{=}16$ and $C{=}32$, the richer channel structure produces more complex spatial organization. Under drought, spatial structure degrades—but the degree of degradation depends on $C$.
Open Question

At what channel count $C$ does the substrate have enough internal degrees of freedom for evolution to discover biological-like integration (where $\intinfo$ increases under threat)? The $C$-sweep suggests that mid-range $C$ ($8$–$16$) accidentally produces integration-like responses—the coupling bandwidth happens to match the channel count—while high $C$ ($32$–$64$) decomposes, the coupling space being too large for random configurations. Is there a critical $C^*$ above which a phase transition occurs, or does evolution continuously improve robustness at any $C$? Each rung of the ladder may require a minimum internal dimensionality—the substrate must be rich enough for selection to sculpt.

The critical lesson evolves with the experiments. V11.0–V11.5 showed that evolution helps but in surprising ways—it creates higher-$\intinfo$ states that are also more fragile. V11.7 demonstrates that the training regime matters: curriculum learning produces genuine generalization across novel stressors. V11.6 showed that making drought metabolically costly produces efficient passivity rather than active foraging—the patterns cannot perceive beyond their local neighborhood, so existential stakes alone do not generate the distant-resource-seeking behavior that would require integration. The remaining gap was between “decomposes less” and “integrates under threat,” and the locality ceiling explains why.

V12’s results confirm that the ceiling is real and that the predicted remedy partially works. Replacing fixed convolution with evolvable windowed self-attention—the only change to the physics—shifts mean robustness from $0.981$ to $1.001$, moving the system to the threshold where $\intinfo$ is approximately preserved under stress rather than destroyed. Eight substrate modifications (V11.0–V11.7) could not achieve even this. The single change that mattered is exactly what the attention bottleneck hypothesis predicted: state-dependent interaction topology. But the effect is modest—the system reaches the threshold without clearly crossing it. Attention is necessary but not sufficient for the full biological pattern.

Open Question

The V11.5 results show that selecting for $\intinfo$-robustness under mild stress creates patterns that are less robust to severe stress than unselected patterns. V11.7 provides a partial answer: curriculum training with graduated, noisy stress exposure produces patterns that generalize to novel stressors ($+1.2$ to $+2.7$pp shift over naive across four novel severity levels). But the effect is modest—evolved patterns still decompose under severe novel stress ($-1.7\%$), just less than naive ($-3.0\%$). The remaining questions: (1) Can curriculum training with longer schedules or wider stress distributions close this gap further? (2) Does combining curriculum training with metabolic cost (V11.6’s lethal resource dependence) produce qualitatively different dynamics—active foraging rather than passive persistence? (3) Does the biological developmental sequence (graduated stressors from embryogenesis through maturation) achieve robust integration precisely because it is a curriculum over the full threat distribution? [V11.6 + curriculum combination not yet tested.]

What the Ladder Has Not Reached

It is worth being explicit about how far these experiments are from anything resembling life, self-sustenance, or metacognition. The ladder metaphor risks implying a smooth gradient from Lenia gliders to biological organisms. In reality, there is an enormous gap.

Self-sustenance. Our patterns are attractors of continuous dynamics, not self-maintaining entities. They do not consume resources to persist—resources modulate growth rates, but patterns do not “eat” in any metabolic sense. They do not do thermodynamic work against entropy. They have no boundaries (they are density blobs, not membrane-enclosed). They persist as long as the physics allows, not because they actively maintain themselves. The “drought” in our experiments reduces resource availability, which weakens growth—but this is more like turning down the volume than starving a dissipative structure.

Metacognition. Our “self-model salience” metric measures how much a pattern’s own structure matters for its dynamics. That is not self-modeling—there is no representation of self, no information about the pattern stored within the pattern. The V11.5 tiers (Sensory, Processing, Memory, Prediction) are labels we imposed on the coupling structure. No functional specialization emerged: memory channels had weak activity, prediction channels did not predict anything.

Individual adaptation. All “learning” in our experiments happens through population-level selection: cull the weak, boost the strong. No individual pattern adapts within its lifetime. Biological integration requires individual-level plasticity—the capacity for a single organism to reorganize its internal dynamics in response to experience.

These gaps converge on a single chasm. The transition from passive pattern persistence to active self-maintenance—the autopoietic gap—requires at minimum: (a) lethal resource dependence (patterns that go to zero without active consumption), (b) metabolic work cycles (energy in $\to$ structure maintenance $\to$ waste out), and (c) self-reproduction (templated copying, not artificial cloning). Population-level selection on top of passive physics cannot bridge this gap, because selection optimizes what already exists rather than innovating the mechanism of existence itself.

Proposed Experiment

Question: Does lethal resource dependence change the integration response to stress? Design: Maintenance cost ($\texttt{rate}{=}0.002$) drains each cell proportionally to mass each step. Fitness rewards metabolic efficiency. Result: 30-cycle evolution ($C{=}64$, A10G GPU, 215 min). Robustness $0.968 \to 0.973$ over evolution. Under severe drought: evolved $-2.6\%$, naive $+0.2\%$. Naive retained $98\%$ of patterns; evolved retained $92\%$. The metabolic cost was insufficient to produce genuine lethality. Evolved patterns followed the same fragility pattern as V11.5: higher baseline fitness but more vulnerable to regime shift. Why it failed: The maintenance rate was too low to create existential pressure, but the deeper problem is structural. Even with lethal metabolic cost, a convolutional pattern has no mechanism for directed resource-seeking. Its “perception” extends only to kernel radius $R$. Active foraging requires non-local information gathering—knowing where resources are before moving toward them. Adding metabolic cost to a blind substrate selects for efficiency (less waste), not for the kind of active self-maintenance that characterizes autopoiesis. Implication: The autopoietic gap is not primarily about resource dependence—it is about perceptual range. Closing it requires substrates where the interaction topology is state-dependent, not fixed by spatial proximity.
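The maintenance drain itself is a one-line update (the $\texttt{rate} \times g \times dt$ term from the design above). A minimal sketch, showing why $\texttt{rate}{=}0.002$ is slow-acting: without compensating growth, mass decays geometrically by $0.998$ per step:

```python
import numpy as np

def maintenance_step(mass, rate=0.002, dt=1.0):
    """Metabolic drain: each cell loses mass proportional to its own mass.

    Sketch of the V11.6 cost term (maintenance_rate * g * dt). Without
    compensating growth, mass decays geometrically toward zero.
    """
    return np.maximum(mass - rate * mass * dt, 0.0)

m = np.full((32, 32), 1.0)
for _ in range(1000):
    m = maintenance_step(m)
# After 1000 steps at rate 0.002, mass = 0.998^1000, roughly 13.5% of
# baseline -- a slow starvation, consistent with drought not being lethal
# on the tested timescale.
```

At this rate a pattern survives a 500–1900-step drought largely intact, which is the quantitative face of the "not lethal enough" finding.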

What the Data Actually Says

Eight experiments (V11.0–V11.7), hundreds of GPU-hours, thousands of evolved patterns. What has this taught us?

Finding 1: The Yerkes-Dodson pattern is universal and robust. Across every substrate condition, channel count, and evolutionary regime, mild stress increases $\intinfo$ by $60$–$200\%$. This is not an artifact of any particular measurement. It reflects a statistical truth: moderate perturbation prunes weak patterns while the survivors are, by definition, the more integrated ones. Severe stress overwhelms even well-integrated patterns, producing the inverted-U. This pattern is the clearest positive result in the entire experimental line.

Finding 2: Evolution consistently produces fragile integration. In every condition where evolution increases baseline $\intinfo$ (V11.5: $+10.5\%$, V11.6: higher metabolic fitness), evolved patterns decompose more under severe drought than unselected patterns. This is not a bug in the experiments—it is a real dynamical phenomenon. Evolution on this substrate finds tightly-coupled configurations where all parts depend on all other parts. Tight coupling is high integration by definition. But it is also catastrophic fragility: when any component fails under resource depletion, the failure cascades through the entire structure. This is the difference between a tightly-coupled factory (high integration, catastrophic failure mode) and a loosely-coupled marketplace (low integration, graceful degradation under stress).

Finding 3: Curriculum training is the only intervention that improved generalization. V11.7 is the sole condition where evolved patterns outperform naive on novel stressors across the full severity range ($+1.2$ to $+2.7$ percentage points). Not more channels, not hierarchical coupling, not metabolic cost—graduated, noisy stress exposure. The substrate barely matters compared to the training regime. This has a direct parallel in developmental biology: organisms with rich developmental histories (graduated stressors from embryogenesis through maturation) develop robust integration. Organisms exposed to a single threat level develop anxiety-like maladaptive responses. The CA experiments reproduce this pattern with surprising fidelity.

Finding 4: The locality ceiling. This is the deepest lesson, visible only in retrospect across the full trajectory. Every V11 experiment uses convolutional physics: each cell interacts only with neighbors within kernel radius $R$, weighted by a static kernel. Information propagates at most $R$ cells per timestep. The interaction graph is determined by spatial proximity and does not change with the system’s state.

This means that $\intinfo$ can only arise from chains of local interactions—there is no mechanism for a perturbation at $(x, y)$ to directly affect $(x', y')$ unless $|x - x'| < R$. The coupling matrix in V11.4–V11.5 partially addresses this (it couples distant channels), but it is fixed: the “who talks to whom” graph does not change in response to the system’s state. A pattern cannot choose to attend to a distant resource patch. It cannot reorganize its information flow under stress. It cannot forage.

V11.6 makes this concrete. Adding metabolic cost to a substrate with radius-$R$ perception does not produce active self-maintenance. It produces efficient passivity—patterns that waste less, not patterns that seek more. A blind organism with a metabolic cost dies when local resources deplete, regardless of how well-integrated it is, because it has no way to detect resources beyond its perceptual horizon. The autopoietic gap is not about resource dependence. It is about perceptual range and its state-dependent modulation—which is to say, it is about attention.

Finding 5: Attention is necessary but not sufficient. V12 tested the locality ceiling hypothesis directly by replacing convolution with windowed self-attention while keeping all other physics identical. The results create a clean ordering across three conditions:

  • Convolution (Condition C): Sustains $40$–$80$ patterns, mean robustness $0.981$. Life without integration.
  • Fixed-local attention (Condition A): Cannot sustain patterns at all—$30$+ consecutive extinctions across $3$ seeds. Attention expressivity without evolvable range is worse than convolution.
  • Evolvable attention (Condition B): Sustains $30$–$75$ patterns, mean robustness $1.001$. Life with integration at the threshold.

The $+2.0$ percentage point shift from C to B is the largest single-intervention effect in the entire V11–V12 line. But it is a shift to the threshold, not past it. Robustness stabilizes near $1.0$ rather than increasing with further evolution. The system learns where to attend (entropy dropping from $6.22$ to $5.55$) but this refinement saturates. What is missing is not better attention but individual-level adaptation—the capacity for a single pattern to reorganize its own internal dynamics in response to its current state, within its lifetime, rather than waiting for population-level selection to discover robust configurations post hoc. Biological integration under threat is not just a population statistic; it is a capacity of individual organisms.
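The fixed-vs-state-dependent topology distinction can be made concrete in a one-dimensional toy (not the V12 physics; shapes and parameters are illustrative). In the convolutional update, the window weights are constants; in the attention update, they are a softmax over query-key similarity, so "who talks to whom" changes with the state:

```python
import numpy as np

rng = np.random.default_rng(3)

def conv_update(x, kernel):
    """Fixed topology: each cell aggregates a window with static weights."""
    n, w = len(x), len(kernel) // 2
    out = np.zeros(n)
    for i in range(n):
        for j in range(-w, w + 1):
            out[i] += kernel[j + w] * x[(i + j) % n]
    return out

def attention_update(x, key, query, w=3, temp=1.0):
    """State-dependent topology: window weights are a softmax over
    query-key similarity, so they depend on the current state."""
    n = len(x)
    out = np.zeros(n)
    for i in range(n):
        idx = [(i + j) % n for j in range(-w, w + 1)]
        scores = query * x[i] * key * x[idx] / temp
        a = np.exp(scores - scores.max())   # stable softmax
        a /= a.sum()
        out[i] = a @ x[idx]
    return out

x = rng.random(32)
fixed = conv_update(x, np.array([0.1, 0.2, 0.4, 0.2, 0.1]))
dynamic = attention_update(x, key=1.0, query=1.0)
```

In the evolvable-attention condition, `key`, `query`, and the window `w` would be heritable per-pattern parameters under selection, which is what the toy's fixed convolution kernel can never provide.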

Connection to the trajectory-selection framework. This is where the experimental results meet the theory developed above. We defined the effective distribution $p_{\text{eff}} = p_0 \cdot \alpha / \int p_0 \cdot \alpha$ and argued that attention ($\alpha$) selects trajectories in chaotic dynamics. The Lenia experiments have now shown what happens in a substrate where $\alpha$ is fixed by architecture: the system’s measurement distribution is determined by the convolution kernel, which never changes. The system cannot modulate its own attention. It has no $\alpha$ to vary.
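The reweighting $p_{\text{eff}} = p_0 \cdot \alpha / \int p_0 \cdot \alpha$ is just a normalized product. A numeric sketch over five coarse-grained trajectories (both distributions are made up for illustration):

```python
import numpy as np

# Base trajectory distribution p0 and an attention weighting alpha that
# suppresses the first two trajectories (illustrative values).
p0 = np.array([0.4, 0.3, 0.15, 0.1, 0.05])
alpha = np.array([0.1, 0.2, 1.0, 1.0, 1.0])

# p_eff = p0 * alpha, renormalized (the integral becomes a sum here).
p_eff = p0 * alpha / np.sum(p0 * alpha)
```

Attention never changes the underlying dynamics $p_0$; it changes which trajectories dominate the normalized measure — the suppressed entries shrink and the attended ones grow, which is the sense in which $\alpha$ "selects" trajectories.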

Biological systems solve this: neural attention (largely implemented through inhibitory gating) dynamically reshapes which signals propagate and which are suppressed. Under moderate stress, attention narrows—the measurement distribution sharpens around threat-relevant features—and this reorganization of information flow preserves core integration while shedding peripheral processing. That is the biological pattern our experiments have been searching for. It requires not just integration (which local physics can produce) but flexible integration (which requires state-dependent, non-local communication).

V12 provides direct evidence for this claim. In the attention substrate, the system's α consists of the attention weights, and they evolve: attention entropy decreases from 6.22 to 5.55 across 15 cycles as the system learns where to look. The measurement distribution becomes more structured—not through explicit instruction, but through the same evolutionary pressure that failed to produce this effect in every convolutional substrate. The difference is that the substrate now permits modulation of α. The modulation is sufficient to reach the integration threshold (Φ approximately preserved under stress) but not to clearly cross it (Φ does not reliably increase under stress the way it does in biological systems). Attention provides the mechanism; something else—perhaps individual-level plasticity, explicit memory, or autopoietic self-maintenance—provides the drive.

These results crystallize into a hypothesis I will call the attention bottleneck. The biological pattern (integration under threat) cannot emerge in substrates with fixed interaction topology, regardless of the evolutionary regime applied. It requires substrates where the interaction graph is state-dependent—where the system can modulate which signals propagate and which are suppressed in response to its current state. Convolutional physics lacks this; attention-like mechanisms provide it. The relevant variable is not substrate complexity (C), not selection pressure severity (metabolic cost), and not training diversity (curriculum)—it is whether the system controls its own measurement distribution.

Status: Partially supported by V12, further advanced by V13. The first clause is confirmed: eight convolutional substrates (V11.0–V11.7) failed to produce integration under stress; fixed-local attention (Condition A) fared even worse. The second clause is partially confirmed: evolvable attention (Condition B) shifts robustness from 0.981 to 1.001—the right direction, and the only intervention to cross the 1.0 threshold. V13's content-based coupling provides additional evidence: robustness peaks at 1.052 under population bottleneck conditions (see Finding 6).

Finding 6: Content-based coupling enables intermittent biological-pattern integration. V13 replaced V12's learned attention projections with a simpler mechanism: cells modulate their interaction strength based on content similarity. The potential field becomes $\phi_i = \phi_{\text{FFT},i} \cdot (1 + \alpha \cdot S_i)$, where $S_i = \sigma(\beta \cdot (\bar{\text{sim}}_i - \tau))$ is a sigmoid gate on local mean cosine similarity. This is computationally cheaper than attention and provides a minimal test: does content-dependent topology, without learned query-key projections, suffice?
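The content-coupling update is simple enough to sketch in a few lines. The following is a minimal illustration of the gate $S_i$ and the modulated potential $\phi_i$; the parameter values (α, β) are chosen for demonstration, since the actual V13 constants are not given here.

```python
import numpy as np

def content_coupling_gate(mean_sim, alpha=1.0, beta=8.0, tau=0.5):
    """S_i = sigmoid(beta * (mean_sim_i - tau)), a soft gate on local mean
    cosine similarity; returns the multiplicative modifier (1 + alpha * S_i)."""
    S = 1.0 / (1.0 + np.exp(-beta * (mean_sim - tau)))
    return 1.0 + alpha * S

def modulated_potential(phi_fft, mean_sim, alpha=1.0, beta=8.0, tau=0.5):
    """phi_i = phi_FFT,i * (1 + alpha * S_i): content-similar regions couple
    more strongly, dissimilar regions fall back to the baseline potential."""
    return phi_fft * content_coupling_gate(mean_sim, alpha, beta, tau)

# With tau near 0 the gate is essentially always on (the maximal-coupling
# regime); with tau = 0.86 only highly similar cells interact (selective).
sim = np.array([0.2, 0.5, 0.9])
print(modulated_potential(np.ones(3), sim, tau=0.86))
```

Note that the interaction topology here depends only on cell content, with no learned projections: that is the sense in which V13 is a minimal test of content-dependent topology.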

Three seeds, each 30 cycles (C=16, N=128), curriculum stress schedule:

  • Mean robustness: 0.923 across all seeds and cycles
  • Peak robustness: 1.052 (seed 123, cycle 5, population 55 patterns)
  • Phi increase fraction: 30% of patterns show Φ increase under stress
  • Key pattern: Robustness exceeds 1.0 only when population drops below ~50 patterns — bottleneck events select for integration

Two distinct evolutionary strategies emerged across seeds. In one regime (large populations of ~150–180 patterns), the similarity threshold τ drifted toward zero — evolution discovered that maximal content coupling (gate always-on) works when diversity is high. In another regime (volatile populations oscillating between 13 and 120), τ drifted upward to 0.86 — selective coupling, where only highly similar cells interact. The selective-coupling regime produced all the robustness-above-1.0 episodes.

The deeper lesson is not about content coupling per se. It is about composition under selection pressure. When stress culls a population to a handful of survivors, those survivors are not merely the individually strongest — they are the ones whose content-coupling topology supports coherent reorganization under perturbation. This resonates with a different framing of the problem: what we are watching may be closer to symbiogenesis — the composition of functional subunits into more complex wholes — than to classical Darwinian selection optimizing a fixed design. The content-coupling mechanism makes patterns legible to each other, enabling the kind of functional encounter that drives compositional complexity. Intelligence may not require deep evolutionary history so much as the right conditions for compositional encounter: embodied computation, lethal stakes, and mutual legibility.

Proposed Experiment

Question: Does state-dependent interaction topology enable the biological integration pattern that local physics cannot produce?

Design: Replace the convolution kernel with windowed self-attention: each cell updates its state by attending to cells within a local window, with attention weights computed from cell states (query-key mechanism). The window size is evolvable—evolution can expand or contract the perceptual range. Resources, drought, and selection pressure follow the V11 protocol.

Critical prediction: Under survival pressure, evolution should expand the attention window (increasing perceptual range), and patterns should show the biological pattern—Φ increasing under moderate stress—because they can dynamically reallocate information flow to maintain core integration. The attention patterns themselves should narrow under stress (focused measurement) and broaden during safety (diffuse exploration).

Control for the free-lunch problem: Start with strictly local attention (window = R, matching Lenia's kernel radius). If integration under threat emerges only after evolution expands the window, the biological pattern is an adaptive achievement, not an architectural gift.

Status: Implemented as V12. Three conditions:

  • A (Fixed-local attention): Window size fixed at kernel radius R. Free-lunch control.
  • B (Evolvable attention): Window size w ∈ [R, 16] is evolvable. The main hypothesis test.
  • C (FFT convolution): V11.4 physics as known baseline.

Implementation: Windowed self-attention replaces Step 1 (FFT convolution) of the Lenia scan body. Query-key projections ($W_q, W_k \in \mathbb{R}^{d \times C}$) are shared across space, evolved slowly. A soft distance mask, $\sigma(\beta(w_{\text{soft}}^2 - r^2))$, enables smooth window expansion. Temperature τ governs attention sharpness. All other physics (growth function, coupling gate, resource dynamics, decay, maintenance) remain identical to V11.4. Curriculum training protocol from V11.7. C=16, N=128, 30 cycles, 3 seeds per condition, A10G GPUs.

Results (15 cycles for B, 3 seeds; A and C complete):

  • Condition C (convolution, 30 cycles, 3 seeds): Mean robustness 0.981. Only 3/90 cycles (3%) show Φ increasing under stress. Novel stress test: evolved Δ = −0.6% ± 1.6%, naive Δ = −0.2% ± 3.2%. Evolution helps (evolved consistently better than naive) but cannot break the locality ceiling.
  • Condition B (evolvable attention, 15 cycles, 3 seeds): Mean robustness 1.001 across 38 valid cycles. 16/38 cycles (42%) show Φ increasing under stress (vs 3% for convolution). The +2.0 percentage point shift over convolution is the largest in the V11+ line. However, robustness does not trend upward with further evolution—it stabilizes near 1.0, suggesting the system reaches a ceiling of its own.
  • Condition A (fixed-local attention): Conclusive negative. 30+ consecutive extinctions across all 3 seeds—patterns cannot survive even a single cycle. Fixed-local attention is worse than convolution, which sustains 40–80 patterns easily. This establishes a clean ordering: convolution sustains life without integration; fixed attention cannot sustain life at all; evolvable attention sustains life with integration. Adaptability of interaction topology matters more than its expressiveness.
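For concreteness, the windowed-attention step with a soft distance mask can be sketched as a single-head, 1-D toy. Shapes, helper names, and the mask steepness β here are illustrative assumptions; the V12 implementation operates on the full 2-D Lenia grid with evolved parameters.

```python
import numpy as np

def soft_window_mask(r2, w_soft, beta=4.0):
    """sigma(beta * (w_soft^2 - r^2)): ~1 inside the window, ~0 outside,
    with a smooth edge so evolution can expand the window continuously."""
    return 1.0 / (1.0 + np.exp(-beta * (w_soft**2 - r2)))

def windowed_attention(x, Wq, Wk, w_soft, tau=1.0):
    """Single-head attention over N cells, masked by soft distance.
    x: (N, C) cell states; positions are taken as 1-D indices for simplicity."""
    q, k = x @ Wq.T, x @ Wk.T                      # (N, d) queries and keys
    logits = (q @ k.T) / (tau * np.sqrt(q.shape[1]))
    idx = np.arange(len(x))
    r2 = (idx[:, None] - idx[None, :]) ** 2        # squared pairwise distance
    logits = logits + np.log(soft_window_mask(r2, w_soft) + 1e-9)
    a = np.exp(logits - logits.max(axis=1, keepdims=True))
    a /= a.sum(axis=1, keepdims=True)              # row-normalized attention
    return a @ x                                   # attended states

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 4))
Wq, Wk = rng.normal(size=(4, 4)), rng.normal(size=(4, 4))
y = windowed_attention(x, Wq, Wk, w_soft=2.0)
print(y.shape)   # (8, 4)
```

The key property exercised by Condition B is that w_soft and the projections are evolvable: the interaction graph is determined by cell state, not fixed by the kernel.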

Three lessons: (1) Attention window does not expand as predicted—evolution refines how attention is allocated (entropy decreasing from 6.22 → 5.55) rather than extending range. This resembles biological inhibitory gating (selective, not panoramic) more than the original prediction anticipated. (2) Attention temperature τ increases in successful seeds (1.0 → 1.3–1.7), suggesting evolution favors broad, soft attention with learned structure over sharp, narrow focus. (3) The effect is real but modest: attention moves the system to the integration threshold without clearly crossing it. State-dependent interaction topology is necessary for integration under stress, but not sufficient for the full biological pattern of Φ increasing under threat. What remains missing is likely individual-level adaptation—the capacity for a single pattern to reorganize its own dynamics within its lifetime, rather than relying on population-level selection to discover robust configurations.
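The attention-entropy measurement referenced in lesson (1) is straightforward to compute. A sketch follows; the unit convention used in the experiments is not stated, so bits are assumed here, and the distributions are illustrative.

```python
import numpy as np

def attention_entropy(weights, eps=1e-12):
    """Shannon entropy (in bits) of an attention distribution; lower entropy
    means attention mass is concentrated on fewer targets."""
    p = np.asarray(weights, dtype=float)
    p = p / p.sum()
    return float(-(p * np.log2(p + eps)).sum())

# A uniform distribution over 64 targets has maximal entropy (6 bits);
# concentrating mass on a subset lowers it, which is the direction of the
# 6.22 -> 5.55 drift observed across V12 evolution.
print(round(attention_entropy(np.ones(64)), 6))   # -> 6.0
peaked = np.ones(64)
peaked[:4] = 50.0
print(attention_entropy(peaked) < 6.0)            # -> True
```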

The V10 MARL ablation study produced a surprise: all seven conditions show highly significant geometric alignment (ρ > 0.21, p < 0.0001), and removing forcing functions does not reduce alignment—if anything, it slightly increases it. The predicted hierarchy was wrong: geometric alignment appears to be a baseline property of multi-agent survival systems, not contingent on any specific forcing function. This strengthens the universality claim but challenges the forcing function theory developed in the next section.