Part II: Identity Thesis

Counterfactual Weight

Where the previous dimensions captured the system’s current state, counterfactual weight captures its temporal orientation: how much processing is devoted to possibilities rather than actualities. Let $\mathcal{R}$ be the set of imagined rollouts (counterfactual trajectories) and $\mathcal{P}$ be present-state processing. Then:

$$\mathcal{CF}_t = \frac{\text{Compute}_t(\mathcal{R})}{\text{Compute}_t(\mathcal{R}) + \text{Compute}_t(\mathcal{P})}$$

This is the fraction of computational resources at time $t$ devoted to modeling non-actual possibilities.
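As a rough illustration of the ratio above, the following Python sketch computes the fraction from two compute budgets. The function name `counterfactual_weight`, the choice of units (any consistent measure such as FLOPs or milliseconds), and the convention that an idle system scores zero are all assumptions made for this example, not part of the original definition.

```python
def counterfactual_weight(rollout_compute: float, present_compute: float) -> float:
    """Fraction of compute spent on imagined rollouts rather than the present state.

    Both arguments are non-negative amounts in the same (arbitrary) unit.
    """
    if rollout_compute < 0 or present_compute < 0:
        raise ValueError("compute amounts must be non-negative")
    total = rollout_compute + present_compute
    if total == 0:
        # Assumption: a system doing no processing at all has zero counterfactual weight.
        return 0.0
    return rollout_compute / total


# A system spending 30 units on rollouts and 10 on the present state.
cf_planning = counterfactual_weight(rollout_compute=30.0, present_compute=10.0)

# A purely reactive system: no rollouts at all.
cf_reactive = counterfactual_weight(rollout_compute=0.0, present_compute=10.0)
```

Under these assumptions the value always lies between 0 (purely present-focused) and 1 (all processing devoted to rollouts).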

In model-based RL:

$$\mathcal{CF}_t = \sum_{\tau \in \text{rollouts}} w(\tau) \cdot \mathrm{H}[\tau] \quad \text{where} \quad w(\tau) \propto |V(\tau)|$$

Each rollout $\tau$ contributes its entropy, a measure of diversity, weighted by the magnitude of its value $|V(\tau)|$.
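The weighted sum above can be sketched in Python as follows. Two things here are assumptions rather than anything the text specifies: the entropy of a rollout is taken to be the mean Shannon entropy of the action distributions along it, and the weights are normalized so that they sum to one (the text only says they are proportional to the value magnitude). All function names are hypothetical.

```python
import math
from typing import Sequence


def action_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of a single action distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def rollout_entropy(step_distributions: Sequence[Sequence[float]]) -> float:
    """Assumed rollout entropy: mean per-step action entropy along the rollout."""
    if not step_distributions:
        return 0.0
    return sum(action_entropy(d) for d in step_distributions) / len(step_distributions)


def cf_model_based(values: Sequence[float], entropies: Sequence[float]) -> float:
    """Sum of rollout entropies weighted by normalized value magnitude |V|."""
    if len(values) != len(entropies):
        raise ValueError("need one value and one entropy per rollout")
    total = sum(abs(v) for v in values)
    if total == 0:
        # Assumption: rollouts with no value at stake carry no counterfactual weight.
        return 0.0
    return sum(abs(v) / total * h for v, h in zip(values, entropies))


# Three imagined rollouts, each a list of per-step action distributions.
rollouts = [
    [[0.5, 0.5]],              # one step, maximally uncertain binary choice
    [[1.0, 0.0]],              # one step, fully deterministic
    [[0.5, 0.5], [1.0, 0.0]],  # two steps, mixed
]
values = [2.0, -1.0, 1.0]      # signed values; only magnitudes enter the weights

entropies = [rollout_entropy(r) for r in rollouts]
cf = cf_model_based(values, entropies)
```

Note that this quantity is measured in entropy units and is not bounded by 1, unlike the compute fraction defined earlier; the sketch makes no attempt to reconcile the two.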

Phenomenal Correspondence

  • High counterfactual weight: the mind is elsewhere (planning, worrying, fantasizing, anticipating).
  • Low counterfactual weight: present-focused, reactive, in-the-moment.

This is where the reactivity/understanding distinction (Part VII) becomes experientially salient. Low CF is reactive experience: the system runs on present-state associations, its processing decomposable by channel. High CF is understanding: the system holds multiple possible futures simultaneously, and the quality of that holding — which possibilities, how they are compared, what actions they recommend — is inherently non-decomposable. The experience of weighing options is not reducible to separate valuations of each option. The comparison itself is the experience.

Counterfactual Weight in Discrete Substrate

For most CA patterns, $\mathcal{CF} = 0$: they follow their dynamics without simulation.

But Life contains universal computers: patterns that can simulate arbitrary computations, including Life itself. Imagine a pattern $\mathcal{B}$ containing:

  • A simulator subregion that runs a model of possible futures
  • A controller that adjusts behavior based on simulator output

Then:

$$\mathcal{CF} = \frac{|\text{simulator cells}|}{|\mathcal{B}|}$$

The fraction of the pattern devoted to counterfactual reasoning.
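The cell-count ratio above can be sketched in Python by representing a pattern as a set of grid coordinates. How one decides which cells belong to the simulator subregion is not specified in the text, so the sketch simply takes that labeling as given; the name `pattern_cf` and the toy layouts are hypothetical.

```python
from typing import Set, Tuple

Cell = Tuple[int, int]  # (row, column) of a cell belonging to the pattern


def pattern_cf(pattern_cells: Set[Cell], simulator_cells: Set[Cell]) -> float:
    """Fraction of a pattern's cells that belong to its simulator subregion."""
    if not pattern_cells:
        raise ValueError("pattern must contain at least one cell")
    if not simulator_cells <= pattern_cells:
        raise ValueError("simulator cells must be a subset of the pattern")
    return len(simulator_cells) / len(pattern_cells)


# A glider has no simulator subregion, so its counterfactual weight is zero.
glider = {(0, 1), (1, 2), (2, 0), (2, 1), (2, 2)}
cf_glider = pattern_cf(glider, set())

# Toy stand-in for a complex pattern: a 4x5 block whose top row is labeled "simulator".
block = {(r, c) for r in range(4) for c in range(5)}
simulator = {(0, c) for c in range(5)}
cf_block = pattern_cf(block, simulator)
```

The toy block is only a placeholder for the bookkeeping; an actual universal computer in Life would span vastly more cells.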

Such patterns are rare and complex, since universal computation requires many cells. But they should outperform simple patterns: they can anticipate threats (fear structure) and identify opportunities (desire structure). The prediction: patterns with $\mathcal{CF} > 0$ survive longer in hostile environments.