Part I: Foundations

The World Model


In practice, maintaining the full belief state is computationally intractable for complex environments. Real systems maintain compressed representations.

A world model is a parameterized family of distributions $\mathcal{W}_\theta = \{p_\theta(\mathbf{o}_{t+1:t+H} \mid \mathbf{h}_t, \mathbf{a}_{t:t+H-1})\}$ that predicts future observations given history $\mathbf{h}_t$ and planned actions, for some horizon $H$.

Modern implementations in machine learning typically use recurrent latent state-space models:

$$
\begin{aligned}
\text{Latent dynamics:} \quad & p_\theta(\mathbf{z}_{t+1} \mid \mathbf{z}_t, \mathbf{a}_t) \\
\text{Observation model:} \quad & p_\theta(\mathbf{o}_t \mid \mathbf{z}_t) \\
\text{Inference:} \quad & q_\phi(\mathbf{z}_t \mid \mathbf{z}_{t-1}, \mathbf{a}_{t-1}, \mathbf{o}_t)
\end{aligned}
$$
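As a minimal sketch of how these three components fit together, the following uses a hypothetical one-dimensional linear-Gaussian model. The coefficients `A`, `B`, `C`, the noise variances `Q`, `R`, and all function names are illustrative assumptions rather than part of any particular system; a learned model would replace them with neural networks. The inference step here is the exact Kalman update, which is available only because the toy model is linear-Gaussian. A learned $q_\phi$ would approximate it.

```python
# Hypothetical 1-D linear-Gaussian world model (all constants are illustrative).
A, B, C = 0.9, 0.5, 1.0   # dynamics, control, and observation coefficients
Q, R = 0.05, 0.1          # process and observation noise variances


def dynamics(mean, var, action):
    """Latent dynamics p(z_{t+1} | z_t, a_t), propagated over a Gaussian belief."""
    return A * mean + B * action, A * A * var + Q


def observe(mean, var):
    """Observation model p(o_t | z_t), marginalized over the Gaussian belief."""
    return C * mean, C * C * var + R


def infer(prior_mean, prior_var, obs):
    """Inference q(z_t | z_{t-1}, a_{t-1}, o_t): correct the predicted belief with o_t."""
    gain = prior_var * C / (C * C * prior_var + R)  # Kalman gain
    post_mean = prior_mean + gain * (obs - C * prior_mean)
    post_var = (1.0 - gain * C) * prior_var
    return post_mean, post_var


def imagine(mean, var, actions):
    """Predict (mean, variance) of observations over a horizon H = len(actions)."""
    predictions = []
    for a in actions:
        mean, var = dynamics(mean, var, a)      # roll the latent state forward
        predictions.append(observe(mean, var))  # decode without new observations
    return predictions


# Usage: one filtering step, then a three-step rollout under planned actions.
belief = infer(*dynamics(0.0, 1.0, action=1.0), obs=0.7)
for step, (o_mean, o_var) in enumerate(imagine(*belief, actions=[1.0, 1.0, 0.0]), 1):
    print(f"t+{step}: predicted observation mean={o_mean:.3f}, variance={o_var:.3f}")
```

In this example the predicted variance grows with each imagined step, since no observations arrive to correct the latent state during the rollout.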

The latent state $\mathbf{z}_t$ serves as a compressed belief state, and the model is trained to minimize prediction error:

$$
\mathcal{L}_{\text{world}} = \mathbb{E}\left[ -\log p_\theta(\mathbf{o}_t \mid \mathbf{z}_t) + \beta \cdot \mathrm{KL}\left[ q_\phi(\mathbf{z}_t \mid \cdot) \,\|\, p_\theta(\mathbf{z}_t \mid \mathbf{z}_{t-1}, \mathbf{a}_{t-1}) \right] \right]
$$
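The two terms of this loss can be made concrete with a small sketch, assuming diagonal Gaussian forms for the observation model, the posterior $q_\phi$, and the dynamics prior (a common choice, but an assumption here). The function names and argument layout are hypothetical. The KL term uses the standard closed form for univariate Gaussians, summed over dimensions, and the result is a single-sample, single-step estimate of the expectation.

```python
import math


def gaussian_nll(obs, mean, std):
    """Negative log-likelihood -log p(o | z) under a diagonal Gaussian."""
    return sum(
        0.5 * math.log(2.0 * math.pi * s * s) + (o - m) ** 2 / (2.0 * s * s)
        for o, m, s in zip(obs, mean, std)
    )


def gaussian_kl(q_mean, q_std, p_mean, p_std):
    """KL[q || p] between diagonal Gaussians, summed over dimensions."""
    return sum(
        math.log(ps / qs) + (qs * qs + (qm - pm) ** 2) / (2.0 * ps * ps) - 0.5
        for qm, qs, pm, ps in zip(q_mean, q_std, p_mean, p_std)
    )


def world_model_loss(obs, recon_mean, recon_std, q, p, beta=1.0):
    """Reconstruction term plus beta-weighted KL between posterior q and prior p.

    q and p are (mean, std) pairs of equal-length lists (hypothetical layout).
    """
    return gaussian_nll(obs, recon_mean, recon_std) + beta * gaussian_kl(*q, *p)


# Usage: a 2-D latent whose posterior has moved away from the dynamics prior.
loss = world_model_loss(
    obs=[0.7], recon_mean=[0.6], recon_std=[0.3],
    q=([0.5, -0.1], [0.4, 0.9]),   # posterior after seeing the observation
    p=([0.3, 0.0], [1.0, 1.0]),    # prior predicted by the latent dynamics
    beta=1.0,
)
print(f"loss = {loss:.4f}")
```

Setting `beta=0` reduces the loss to pure reconstruction, while larger values pull the inferred posterior toward what the latent dynamics predicted before the observation arrived.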

The world model is not an optional add-on. It is the minimal object that makes coherent control possible under uncertainty. Any system that regulates effectively under partial observability has a world model, whether explicit or implicit.

World Models in AI

The theoretical necessity of world models is now being realized in artificial systems:

  • Dreamer (Hafner et al., 2020): Learns a latent dynamics model and plans in imagination
  • MuZero (Schrittwieser et al., 2020): Learns abstract dynamics without reconstructing observations
  • JEPA (LeCun, 2022): Joint embedding predictive architecture for representation learning

These systems demonstrate that the world model structure I derive theoretically is also what emerges when building capable artificial agents. The convergence is not coincidental; it reflects the mathematical structure of the control-under-uncertainty problem.