The Geometry of Affect
The geometric theory of affect developed here builds on and extends established dimensional models:
- Russell’s Circumplex Model (1980): Two-dimensional (valence × arousal) organization of affect. Extended here with additional structural dimensions (integration, effective rank, counterfactual weight, self-model salience) invoked as needed.
- Watson, Clark & Tellegen’s PANAS (1988): Positive and Negative Affect Schedule. Valence here corresponds to their hedonic axis.
- Scherer’s Component Process Model (2009): Emotions as synchronized changes across subsystems. The integration measure captures this synchronization.
- Barrett’s Constructed Emotion Theory (2017): Emotions as constructed from core affect + conceptual knowledge. The framework here specifies the structural basis of the construction.
- Damasio’s Somatic Marker Hypothesis (1994): Body states guide decision-making. The valence definition (gradient on viability manifold) is the mathematical formalization.
Affects as Structural Motifs
If different experiences correspond to different structures, then affects—the qualitative character of emotional/valenced states—should correspond to particular structural motifs: characteristic patterns in the cause-effect geometry. An affect is what it is because of how it relates to other possible affects. Joy is defined by its structural distance from suffering, its similarity to curiosity along certain axes, its opposition to boredom along others. The Yoneda insight applies: if you know how an affect relates to every other possible state, you know the affect. There is nothing left to characterize.
The affect space is a geometric space whose points correspond to possible qualitative states. Its dimensionality is not fixed in advance. Rather than asserting a universal coordinate system, recurring structural features useful for characterizing and comparing affects are identified — features without which specific affects would not be those affects. Different affects invoke different subsets. The list is open-ended.
These measures are coordinates on the relational structure, not the structure itself. The relational structure is what the Yoneda characterization captures: the full pattern of similarities and differences between affects. The measures below are projections—tools for reading out particular aspects of that structure. Measuring valence tells you where an affect sits along the viability gradient; measuring integration tells you how unified it is. Neither alone captures the affect. Together, they triangulate a position in a space whose intrinsic geometry is defined by the similarity relations, not by the coordinates. New coordinates can be added when the existing ones fail to distinguish affects that are experientially distinct.
The recurring measures are best read as five plus one, not as a fixed six. Five describe the system's relationship to the world it is modeling — valence, arousal, integration, effective rank, counterfactual weight. The sixth, self-model salience, is categorically different: it is one member of a larger entity-directed salience family, the special case in which the indexed entity is the self. The same family includes salience directed at known others, and resolves further into salience toward a particular child, a particular rival, a particular institution. Repeating "the six dimensions" obscured this — self-salience is not a sui generis axis but the diagonal of an entity-indexed field, the same field that the ascription axis ranges over. The count is not the content; the structure is.
One methodological caution, owed because the empirical program leans on it. Representational-similarity comparisons computed in this coordinate basis are basis-dependent: standard RSA standardizes each axis before correlating, so choosing a different projection moves the alignment numbers. A universality claim that rests on rank-correlations in one hand-picked basis is therefore weaker than it looks — it may partly reflect having chosen control-theoretic coordinates that any viable controller will trivially exhibit. The relational object the coordinates project is what the claim is really about, and it must be measured with basis-independent tools before "the geometry is everywhere" can carry weight.
That test has now been run (Appendix; analysis in affect_gw_alignment.py), with a clean and discriminating result. The basis-independent measures are RSA on rotation-invariant raw-Euclidean dissimilarity matrices (exactly invariant to rotations and reflections of either space, significance by permutation) and Gromov-Wasserstein distance. They cut two ways. The weak within-substrate alignment between a system's affect coordinates and its behavior — modest already under standard RSA () — collapses to zero basis-independently (invariant , significant in 3% of snapshots): that alignment was largely a basis artifact, and the book does not lean on it. But the load-bearing evidence — the cross-substrate convergence in which vision-language models trained only on human affect independently recognize the geometry of uncontaminated synthetic agents — survives and strengthens: standard RSA – rises to a rotation-invariant – ( by permutation), with the invariant figure unchanged to machine precision under random rotation of the affect space. The relational structure, not the chosen coordinates, is what the two substrates share. So the honest statement of the universality result is sharpened, not weakened: it holds basis-independently where it is load-bearing (cross-substrate), and is correctly abandoned where it was an artifact (within-substrate affect-to-behavior).
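To make the rotation-invariance property concrete, here is a minimal sketch of RSA on raw-Euclidean dissimilarity matrices with a permutation test. It is not the analysis script named above and it does not implement Gromov-Wasserstein distance; the toy data, the 2-D spaces, and every function name (`euclidean_rdm`, `rsa`, `rotate_2d`, `permutation_p`) are assumptions made for illustration.

```python
import math
import random

def euclidean_rdm(points):
    """Upper triangle of the pairwise raw-Euclidean dissimilarity matrix."""
    n = len(points)
    return [math.dist(points[i], points[j])
            for i in range(n) for j in range(i + 1, n)]

def ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        for k in range(start, end + 1):
            out[order[k]] = (start + end) / 2 + 1
        start = end + 1
    return out

def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def rsa(space_a, space_b):
    """Spearman correlation between the two spaces' Euclidean RDMs."""
    return pearson(ranks(euclidean_rdm(space_a)), ranks(euclidean_rdm(space_b)))

def rotate_2d(points, theta):
    """Rotate 2-D points by theta radians about the origin."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y, s * x + c * y) for x, y in points]

def permutation_p(space_a, space_b, n_perm=200, seed=0):
    """One-sided p-value from shuffling which item in B pairs with which in A."""
    rng = random.Random(seed)
    observed = rsa(space_a, space_b)
    idx = list(range(len(space_b)))
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(idx)
        if rsa(space_a, [space_b[i] for i in idx]) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Toy data (hypothetical): 12 items in a 2-D space and a noisy copy of them.
rng = random.Random(1)
affect = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(12)]
other = [(x + rng.gauss(0, 0.05), y + rng.gauss(0, 0.05)) for x, y in affect]

baseline = rsa(affect, other)
rotated = rsa(rotate_2d(affect, 1.234), other)  # rotate one space only
p_value = permutation_p(affect, other)
```

Because Euclidean distances are preserved under rotation, `baseline` and `rotated` agree up to floating-point error. Standardizing each axis before computing distances would break that invariance, which is the basis-dependence the caution above describes.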
One consequence must be stated before any coordinates are written down, because the rest of the book draws radar charts and speaks of "distances" that could mislead. The affect space is not a flat Euclidean vector space, and its metric is not symmetric. Two of the framework's own commitments forbid the flat picture. First, curvature: the eigenskeleton has holonomy (the coupling axis is that curvature), so the honest distance between two affects is a geodesic along a curved manifold, not a straight line through coordinate space; at high coupling the modes rotate into one another, and a path that looks short in coordinates is long on the manifold. Second, directionality: the transition from one state to another is not reversible at equal cost. Fear slides into anger far more easily than anger relaxes back into fear; grief does not run backward into the coupling that preceded the loss; a system tipped from joy into suffering does not retrace the same path out. The phenomenological “distance” from one affect to another differs from the distance back, which a symmetric metric cannot represent. The correct object is therefore an asymmetric divergence on a curved manifold (a Finsler-like or Bregman-like quantity, where forward and reverse costs are computed separately), not a Euclidean distance. The coordinate vectors and the Euclidean readouts used throughout (the radar charts, the standardized RSA, the motif tables) are a local linearization: a chart in the tangent space around a point, adequate for measuring nearby differences and for the basis-independent structural comparisons above, but not the global geometry. Where this book writes a distance, read it as a local chart of an asymmetric, curved structure whose full metric is an open formal problem the dimensional toolkit only approximates.
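The asymmetry can be illustrated with the Kullback-Leibler divergence, a standard Bregman divergence whose forward and reverse values differ. This is a stand-in chosen for familiarity, not the metric of the affect space, which the text leaves open; the two distributions and their names are hypothetical.

```python
import math

def kl(p, q):
    """KL(p || q) in nats; assumes q > 0 wherever p > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Two hypothetical states, each summarized as a distribution over three modes.
state_a = [0.7, 0.2, 0.1]
state_b = [0.4, 0.4, 0.2]

forward_cost = kl(state_a, state_b)  # cost read in the a -> b direction
reverse_cost = kl(state_b, state_a)  # cost read in the b -> a direction
```

The two costs are both non-negative and unequal, so no single symmetric number summarizes the pair. A symmetric metric would force them to coincide.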
The following structural measures recur across many affects. Not all are relevant to every phenomenon:
- Valence: Gradient alignment on the viability manifold. Nearly universal; most affects have valence.
- Arousal: Rate of belief/state update. Distinguishes activated from quiescent states.
- Integration: Irreducibility of cause-effect structure. Constitutive for unified vs. fragmented experience.
- Effective Rank: Distribution of active degrees of freedom. Constitutive when the contrast between expansive and collapsed experience matters.
- Counterfactual Weight: Resources allocated to non-actual trajectories. Constitutive for affects defined by temporal orientation (anticipation, regret, planning).
- Self-Model Salience (split into attentional and causal components): How the self figures in processing, as object of attention (attentional) and as driver of action (causal). The two dissociate (flow is low-attention, high-causal). Constitutive for self-conscious emotions and their opposites. The diagonal of the entity-directed salience field.
Valence: Gradient Alignment
Consider the system’s viability manifold, its current state, and the trajectory predicted under the current policy. Valence measures the alignment of that trajectory with the viability gradient, that is, with the direction in which the distance from the state to the viability boundary increases. Positive valence means the predicted trajectory moves into the viable interior; negative valence means it approaches the boundary.
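One way to make the gradient-alignment reading concrete is to track how the distance to the viability boundary changes along a predicted trajectory. The sketch below assumes a toy viability set (a disk around the origin) and a finite-difference estimate; `distance_to_boundary`, `valence`, and the example trajectories are hypothetical, not the book's definitions.

```python
import math

def distance_to_boundary(state, radius=1.0):
    """Distance from a state to the edge of a hypothetical disk-shaped viable set."""
    return radius - math.hypot(*state)

def valence(trajectory, distance_fn=distance_to_boundary):
    """Mean per-step change in boundary distance along a predicted trajectory."""
    d = [distance_fn(s) for s in trajectory]
    return (d[-1] - d[0]) / (len(d) - 1)

# Hypothetical predicted trajectories under two different policies.
inward = [(0.8, 0.0), (0.6, 0.0), (0.4, 0.0)]   # heading into the interior
outward = [(0.4, 0.0), (0.6, 0.0), (0.8, 0.0)]  # heading toward the boundary

v_in = valence(inward)
v_out = valence(outward)
```

Under these assumptions `v_in` is positive and `v_out` is negative, matching the sign convention in the text. Any other distance function can be passed as `distance_fn`.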
In RL terms, this becomes the expected advantage of the current action: how much better (or worse) it is than the average action from this state, i.e. the action value minus the state value under the current policy.
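The advantage reading can be sketched for a single state with a discrete action set. This uses the standard definition, action value minus the policy-weighted average action value; the function name and the numbers are illustrative assumptions.

```python
def advantage(q_values, policy, action):
    """A(s, a) = Q(s, a) - V(s), where V(s) = sum over a' of pi(a'|s) * Q(s, a')."""
    state_value = sum(p * q for p, q in zip(policy, q_values))
    return q_values[action] - state_value

# Hypothetical action values and policy probabilities for one state.
q_values = [1.0, 2.0, 4.0]
policy = [0.5, 0.25, 0.25]

adv_best = advantage(q_values, policy, action=2)   # better than average
adv_worst = advantage(q_values, policy, action=0)  # worse than average
```

Here the state value is 2.0, so the best action has advantage 2.0 and the worst has advantage -1.0. On this reading the sign of the advantage is the sign of valence.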
Beyond valence itself, its rate of change carries structural information. The derivative of integrated information along the trajectory tracks whether structure is expanding (positive derivative) or contracting (negative derivative).
Positive valence corresponds to trajectories descending the free-energy landscape, expanding affordances, moving toward sustainable states. Negative valence corresponds to trajectories ascending toward constraint violation, contracting possibilities.
Arousal: Update Rate
Arousal measures how rapidly the system is revising its world model. The natural formalization is the KL divergence between successive belief states. In latent-space models, this can be approximated more directly, for example by the magnitude of the change in latent state between successive steps.
High arousal: Large belief updates, far from any attractor, system actively navigating. Low arousal: Near a fixed point, low surprise, system at rest in a basin.
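Both readings of arousal can be sketched in a few lines. The direction of the KL divergence (new belief relative to old) and the use of a Euclidean norm for the latent-space proxy are assumptions here, as are the function names and example beliefs.

```python
import math

def belief_update_kl(new_belief, old_belief):
    """KL(new || old) in nats; assumes old_belief > 0 wherever new_belief > 0."""
    return sum(p * math.log(p / q)
               for p, q in zip(new_belief, old_belief) if p > 0)

def latent_update_norm(z_new, z_old):
    """Euclidean size of the step between successive latent states."""
    return math.dist(z_new, z_old)

prior = [0.5, 0.5]
small_update = belief_update_kl([0.55, 0.45], prior)  # near a fixed point
large_update = belief_update_kl([0.9, 0.1], prior)    # strongly revised belief
latent_step = latent_update_norm((3.0, 4.0), (0.0, 0.0))
```

A belief that barely moves yields a divergence near zero (low arousal), while a sharp revision yields a larger one (high arousal).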
Integration: Irreducibility
Integration is as defined in Part I; where that quantity cannot be computed directly, proxies are used.
High integration: The experience is unified; its parts cannot be separated without loss. Low integration: The experience is fragmentary or modular.
Effective Rank: Concentration vs. Distribution
The dimensionality of a system’s active representation can be quantified through the effective rank of its state covariance.
When the effective rank is 1, all variance is concentrated in a single dimension and the system is maximally collapsed. When it equals the number of available dimensions, variance distributes uniformly across them and the system is maximally expanded.
High rank: Many degrees of freedom active; distributed, expansive experience. Low rank: Collapsed into narrow subspace; concentrated, focused, or trapped experience.
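One common definition with exactly these endpoints is the participation ratio, (tr C)^2 / tr(C^2). Whether this is the formula the text intends is an assumption; an entropy-based effective rank has the same endpoints. The sketch takes a symmetric covariance matrix as nested lists.

```python
def effective_rank(cov):
    """Participation ratio (tr C)^2 / tr(C^2) of a symmetric covariance matrix."""
    trace = sum(cov[i][i] for i in range(len(cov)))
    # For symmetric C, tr(C^2) equals the sum of squared entries.
    trace_of_square = sum(c * c for row in cov for c in row)
    return trace * trace / trace_of_square

collapsed = [[1.0, 0.0, 0.0],
             [0.0, 0.0, 0.0],
             [0.0, 0.0, 0.0]]  # all variance in one dimension
expanded = [[1.0, 0.0, 0.0],
            [0.0, 1.0, 0.0],
            [0.0, 0.0, 1.0]]   # variance spread uniformly

r_collapsed = effective_rank(collapsed)
r_expanded = effective_rank(expanded)
```

The collapsed case gives 1 and the uniform case gives 3, the number of dimensions. Intermediate spectra fall between these values.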
Counterfactual Weight
Where the previous dimensions captured the system’s current state, counterfactual weight captures its temporal orientation: how much processing is devoted to possibilities rather than actualities. Split the system’s processing into imagined rollouts (counterfactual trajectories) and present-state processing. Counterfactual weight is then the fraction of computational resources devoted to modeling non-actual possibilities, i.e. the compute spent on rollouts divided by the total compute spent on rollouts and present-state processing.
In model-based RL, the corresponding quantity is built from rollouts weighted by their value magnitude and diversity.
High counterfactual weight: Mind is elsewhere—planning, worrying, fantasizing, anticipating. Low counterfactual weight: Present-focused, reactive, in-the-moment.
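The resource-fraction reading reduces to a simple ratio once compute is measured in some common unit. The unit (for example planning steps or FLOPs), the function name, and the numbers below are assumptions; the value- and diversity-weighted variant for model-based RL is not attempted here.

```python
def counterfactual_weight(rollout_compute, present_compute):
    """Fraction of total compute spent on imagined rollouts (0.0 if no compute)."""
    total = rollout_compute + present_compute
    return rollout_compute / total if total > 0 else 0.0

# Hypothetical budgets in arbitrary compute units.
planning_heavy = counterfactual_weight(rollout_compute=30, present_compute=10)
reactive = counterfactual_weight(rollout_compute=0, present_compute=40)
```

A planning-heavy step scores 0.75 and a purely reactive one scores 0.0, bracketing the "mind is elsewhere" and "in-the-moment" poles described above.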
This is where the reactivity/understanding distinction (Empirical Appendix) becomes experientially salient. Low CF is reactive experience: the system runs on present-state associations, its processing decomposable by channel. High CF is understanding: the system holds multiple possible futures simultaneously, and the quality of that holding — which possibilities, how they are compared, what actions they recommend — is inherently non-decomposable. The experience of weighing options is not reducible to separate valuations of each option. The comparison itself is the experience.
Self-Model Salience
The last member of the entity-directed salience family measures how the self figures in the system’s own processing. But "figures how" hides a fork that must be made explicit, because two genuinely different quantities have been smuggled under one symbol, and they come apart on the framework’s own flagship example.
The first is causal self-salience: the fraction of action entropy explained by the self-model component, that is, the mutual information between the self-model and the action, normalized by the action entropy.
The second is attentional self-salience: how prominently the self appears as an object in current processing, the degree to which the system is thematizing itself.
These dissociate, and flow is the proof. In flow the skilled self is driving everything (causal self-salience is high; the self-model is the controller), yet the self is not thematized at all; it has vanished from the field of attention, so attentional self-salience is low. To call flow "low self-salience" full stop, as earlier formulations did, is true on the attentional reading and false on the causal one, and the contradiction was an artifact of running both quantities through a single symbol. The split resolves it cleanly: flow is low attentional, high causal; shame is high on both (the self drives behavior and is the object of a harsh regard); depersonalization is low on both (the self neither drives nor is thematized); and the once-paradoxical states each land at distinct points. Where this part writes self-model salience without a subscript, read it as the attentional quantity unless the surrounding formalism (mutual information with action) makes the causal reading explicit. Whether the two are coupled or free is itself diagnostic: it is governed by the coupling axis introduced later in this part, where high coupling binds attention and causation together and low coupling permits the dissociation that flow exhibits.
High attentional self-salience: self-focused, self-conscious, self as primary object of attention. Low attentional self-salience: self-forgotten, absorbed in environment or task — which is fully compatible with the self-model still driving the behavior (high causal self-salience).
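Causal self-salience, read as the fraction of action entropy explained by the self-model, can be sketched for discrete variables as mutual information divided by action entropy. The joint-probability table layout (rows are self-model states, columns are actions) and the function names are assumptions for illustration. Attentional self-salience is not sketched, since its formula is not recoverable from the text.

```python
import math

def entropy_bits(p):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def causal_self_salience(joint):
    """I(self; action) / H(action) from a joint table joint[self_state][action]."""
    p_self = [sum(row) for row in joint]
    p_action = [sum(col) for col in zip(*joint)]
    h_action = entropy_bits(p_action)
    if h_action == 0:
        return 0.0  # no action variability to explain
    mutual_info = sum(
        pij * math.log2(pij / (p_self[i] * p_action[j]))
        for i, row in enumerate(joint)
        for j, pij in enumerate(row)
        if pij > 0
    )
    return mutual_info / h_action

# Hypothetical joint distributions over (self-model state, action).
self_drives_action = [[0.5, 0.0],
                      [0.0, 0.5]]      # action fully determined by self-model
self_irrelevant = [[0.25, 0.25],
                   [0.25, 0.25]]       # action independent of self-model

high_causal = causal_self_salience(self_drives_action)
low_causal = causal_self_salience(self_irrelevant)
```

The first table gives 1.0 (the self-model explains all action entropy) and the second gives 0.0. On the text's account, flow would sit near the first case while still scoring low on the attentional quantity.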
Note, though, that "I" is not the secret interior that shame and secrecy create (though shame and secrecy can shape it). "I" is just the stable locus of integrated cause-effect structure that the world model has come to rely on most for its predictions: the component of the world model that other components reference when computing expected futures. The self is a predictive structure, not a hidden essence. It is whatever the system has found to be its most reliable attractor for anticipating its own behavior. This is why depression feels like losing yourself: the world model can no longer reliably predict what "I" will do or want, so the self-reference breaks down and the system loses coherence. And it is why identity crises are not drama but dynamical events: the attractor that was "I" has destabilized, and the system must pay the expensive bill of finding a new one.