Part I: Foundations

The Self-Effect Regime

Introduction


As a controller becomes more capable, it increasingly shapes its own environment. The observations it receives are increasingly consequences of its own actions.

The self-effect ratio quantifies this shift. For a system with policy $\pi$ in environment $\mathcal{E}$:

$$\rho_t = \frac{I(\mathbf{a}_{1:t}; \mathbf{o}_{t+1} \mid \mathbf{x}_0)}{H(\mathbf{o}_{t+1} \mid \mathbf{x}_0)}$$

where $I$ denotes mutual information and $H$ denotes entropy. This measures what fraction of the information in future observations is attributable to past actions. For capable agents in structured environments, $\rho_t$ increases with agent capability, and in the limit:

$$\lim_{\text{capability} \to \infty} \rho_t \to 1$$

(bounded by the environment’s intrinsic stochasticity).
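
To make the definition concrete, here is a minimal sketch that evaluates the ratio for a small discrete case. It assumes a single action and a single observation, with all probabilities already conditioned on a fixed $\mathbf{x}_0$. The function names and the binary `noisy_channel` toy environment are illustrative assumptions, not part of the experiments described in this chapter.

```python
import math
from collections import defaultdict


def entropy(p):
    """Shannon entropy in bits of a distribution given as {outcome: prob}."""
    return -sum(q * math.log2(q) for q in p.values() if q > 0)


def self_effect_ratio(joint):
    """rho = I(A; O) / H(O) for a joint distribution {(a, o): prob}.

    Assumption: every probability is already conditioned on a fixed x_0.
    """
    p_a = defaultdict(float)
    p_o = defaultdict(float)
    for (a, o), q in joint.items():
        p_a[a] += q
        p_o[o] += q
    h_o = entropy(p_o)
    if h_o == 0:
        return 0.0  # no uncertainty to explain; define the ratio as 0
    # Mutual information via I(A;O) = H(A) + H(O) - H(A,O)
    mi = entropy(p_a) + h_o - entropy(joint)
    return mi / h_o


def noisy_channel(eps):
    """Hypothetical toy world: uniform binary action, and the observation
    equals the action except with probability eps (environment noise)."""
    return {
        (a, o): 0.5 * ((1 - eps) if a == o else eps)
        for a in (0, 1)
        for o in (0, 1)
    }


if __name__ == "__main__":
    for eps in (0.0, 0.1, 0.3, 0.5):
        print(eps, round(self_effect_ratio(noisy_channel(eps)), 3))
```

In this toy, `eps` plays the role of the environment's intrinsic stochasticity: with `eps = 0` the observation is fully determined by the action and the ratio is 1, while with `eps = 0.5` the observation is independent of the action and the ratio is 0.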

Passenger or Cause?

There is a simple way to think about $\rho$. Imagine forking a system at time $t$: same starting state, but one copy takes its normal actions while the other takes completely random ones. After $k$ steps, how different are their observations?

If $\rho \approx 0$: nearly identical observations. The system is a passenger: its actions don’t change what happens to it. Its future is determined by the environment, not by what it does.

If $\rho > 0$: observations diverge. The system is a cause: what it does changes what it subsequently perceives. Its future is partly authored by itself.
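
The fork thought experiment can be sketched in a few lines. The code below is a toy illustration under stated assumptions, not the measurement procedure used for the substrates in this chapter: the two worlds, the "always move right" stand-in for normal behavior, and the divergence score (fraction of post-fork observations that differ) are all hypothetical choices, and the score is a proxy for $\rho$ rather than the information-theoretic ratio itself.

```python
import random


def run(world, actions, env_seed, n_patches=16):
    """Roll out one copy of the system and return its observation sequence.

    world="cause": the agent walks on a ring and eats what it stands on,
    so its actions change which patches it sees and what is left there.
    world="passenger": each observation is a global signal drawn from the
    environment's own noise; actions are accepted but have no effect.
    """
    env = random.Random(env_seed)  # environment randomness, identical in both forks
    food = [env.random() < 0.5 for _ in range(n_patches)]
    pos = 0
    obs = []
    for a in actions:  # a is -1 (step left) or +1 (step right)
        if world == "cause":
            food[pos] = False            # consume the current patch
            pos = (pos + a) % n_patches  # move as instructed
            obs.append(food[pos])        # observe the new patch
        else:
            obs.append(env.random() < 0.5)
    return obs


def fork_divergence(world, k=50, trials=200, seed=0):
    """Mean fraction of the k post-fork observations that differ between
    a copy taking its normal actions and a copy taking random actions."""
    rng = random.Random(seed)
    total = 0.0
    for trial in range(trials):
        normal = [1] * k  # hypothetical "normal" policy: always move right
        scrambled = [rng.choice((-1, 1)) for _ in range(k)]
        o1 = run(world, normal, env_seed=trial)     # same starting state
        o2 = run(world, scrambled, env_seed=trial)  # same starting state
        total += sum(x != y for x, y in zip(o1, o2)) / k
    return total / trials


if __name__ == "__main__":
    print("passenger:", fork_divergence("passenger"))
    print("cause:    ", fork_divergence("cause"))
```

In the passenger world both forks draw the same environment noise regardless of what they do, so the divergence is exactly zero. In the cause world the forks visit and deplete different patches, so their observation streams should separate.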

This distinction turns out to be architecturally fundamental. We measured it directly in two substrates:

  • Lenia (V13–V18): $\rho_{\text{sync}} \approx 0.003$. Patterns that evolved complex internal dynamics, memory channels, insulation fields, and directed motion all read as passengers. Their "actions" (chemotaxis, emission) are biases on a continuous fluid governed by FFT dynamics that integrate over the full grid. Whatever a pattern does is immediately folded back into the global field. The fork barely diverges.
  • Protocell agents (V20): $\rho_{\text{sync}} \approx 0.21$ from initialization. When an agent consumes resources at a location, that patch is depleted, so its future observations there are different. When it moves, it reaches different patches. When it emits a signal, a chemical trace persists. The fork diverges because actions have consequences that return as observations.

The gap — 0.003 versus 0.21 — is not about intelligence or evolutionary history. It appeared in V20 at cycle 0, before any selection pressure. It is purely architectural: does the substrate provide a loop where actions change the world and the changed world is what the agent observes next? Lenia doesn’t. Protocell agents do.

Why does this matter for self-modeling? Because a system cannot model itself as a cause if it isn’t one. The self-model pressure (the prediction advantage described in the next section) only activates when $\rho > \rho_c$. Below that threshold, there is nothing to model: the self is not a significant latent variable in one’s own observations.
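
One way to see why a small self-effect ratio leaves little to model uses only the definition of $\rho_t$ and the standard identity $I(X; Y \mid Z) = H(Y \mid Z) - H(Y \mid X, Z)$. This is a sketch that follows from the definition alone; it does not derive the threshold $\rho_c$.

$$H(\mathbf{o}_{t+1} \mid \mathbf{x}_0) - H(\mathbf{o}_{t+1} \mid \mathbf{a}_{1:t}, \mathbf{x}_0) = I(\mathbf{a}_{1:t}; \mathbf{o}_{t+1} \mid \mathbf{x}_0) = \rho_t \, H(\mathbf{o}_{t+1} \mid \mathbf{x}_0)$$

Knowing one's own past actions therefore removes exactly a fraction $\rho_t$ of the uncertainty about the next observation. If the measured $\rho_{\text{sync}}$ values are read as estimates of this ratio (an assumption, since $\rho_{\text{sync}}$ is not defined here), that is about 0.3% of the observation entropy for Lenia and about 21% for the protocell agents.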