The World Model
In practice, maintaining the full belief state is computationally intractable in complex environments; real systems work with compressed representations instead.
A world model is a parameterized family of distributions $W_\theta = p_\theta(o_{t+1:t+H} \mid h_t, a_{t:t+H-1})$ that predicts future observations given the history $h_t$ and planned actions $a_{t:t+H-1}$, for some horizon $H$.
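In code, such an object reduces to two operations on the conditional distribution: scoring a candidate future and sampling one. A minimal interface sketch, with assumed array types and method names (nothing here is prescribed by the definition itself):

```python
# Sketch of the interface W_theta exposes: evaluating and sampling
# p_theta(o_{t+1:t+H} | h_t, a_{t:t+H-1}). Types and names are illustrative.
from typing import Protocol, Sequence

import numpy as np


class WorldModel(Protocol):
    def log_prob(
        self,
        future_obs: Sequence[np.ndarray],       # o_{t+1}, ..., o_{t+H}
        history: Sequence[np.ndarray],          # h_t: past observations and actions
        planned_actions: Sequence[np.ndarray],  # a_t, ..., a_{t+H-1}
    ) -> float:
        """Score a candidate future under the model."""
        ...

    def sample(
        self,
        history: Sequence[np.ndarray],
        planned_actions: Sequence[np.ndarray],
    ) -> list[np.ndarray]:
        """Roll out H predicted observations for a planned action sequence."""
        ...
```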
Modern implementations in machine learning typically use recurrent latent state-space models:
$$
\begin{aligned}
\text{Latent dynamics:} &\quad p_\theta(z_{t+1} \mid z_t, a_t) \\
\text{Observation model:} &\quad p_\theta(o_t \mid z_t) \\
\text{Inference:} &\quad q_\phi(z_t \mid z_{t-1}, a_{t-1}, o_t)
\end{aligned}
$$

The latent state $z_t$ serves as a compressed belief state, and the model is trained by minimizing a prediction loss together with a KL regularizer:
$$
\mathcal{L}_{\text{world}} = \mathbb{E}\!\left[ -\log p_\theta(o_t \mid z_t) + \beta \cdot \mathrm{KL}\!\left[ q_\phi(z_t \mid \cdot) \,\big\|\, p_\theta(z_t \mid z_{t-1}, a_{t-1}) \right] \right]
$$

The world model is not an optional add-on. It is the minimal object that makes coherent control possible under uncertainty. Any system that regulates effectively under partial observability has a world model, whether explicit or implicit.
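As a concrete sketch of the recurrent latent state-space model and the objective $\mathcal{L}_{\text{world}}$ above, the snippet below wires the three components together and accumulates the per-step reconstruction and KL terms over a batch of sequences. The layer sizes, the diagonal-Gaussian parameterizations, and the unit-variance decoder are assumptions made for illustration, not a reference implementation.

```python
# Minimal recurrent latent state-space world model in PyTorch.
# Shapes and parameterizations are illustrative assumptions.
import torch
import torch.nn as nn
import torch.distributions as D


class LatentWorldModel(nn.Module):
    def __init__(self, obs_dim: int, action_dim: int, latent_dim: int, hidden_dim: int = 128):
        super().__init__()
        self.latent_dim = latent_dim
        # Latent dynamics (prior): p_theta(z_{t+1} | z_t, a_t)
        self.dynamics = nn.Sequential(
            nn.Linear(latent_dim + action_dim, hidden_dim), nn.ELU(),
            nn.Linear(hidden_dim, 2 * latent_dim),
        )
        # Observation model: p_theta(o_t | z_t)
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, hidden_dim), nn.ELU(),
            nn.Linear(hidden_dim, obs_dim),
        )
        # Inference model (posterior): q_phi(z_t | z_{t-1}, a_{t-1}, o_t)
        self.encoder = nn.Sequential(
            nn.Linear(latent_dim + action_dim + obs_dim, hidden_dim), nn.ELU(),
            nn.Linear(hidden_dim, 2 * latent_dim),
        )

    @staticmethod
    def _gaussian(params: torch.Tensor) -> D.Normal:
        # Split network output into mean and log-std of a diagonal Gaussian.
        mean, log_std = params.chunk(2, dim=-1)
        return D.Normal(mean, log_std.exp())

    def loss(self, obs: torch.Tensor, actions: torch.Tensor, beta: float = 1.0) -> torch.Tensor:
        """L_world for a batch: obs is (B, T, obs_dim), actions is (B, T, action_dim)."""
        B, T, _ = obs.shape
        z = obs.new_zeros(B, self.latent_dim)                 # z_0: initial latent state
        prev_action = actions.new_zeros(B, actions.shape[-1])  # a_0: null action
        recon, kl = obs.new_zeros(()), obs.new_zeros(())
        for t in range(T):
            # Prior from the learned dynamics, posterior from the inference model.
            prior = self._gaussian(self.dynamics(torch.cat([z, prev_action], dim=-1)))
            posterior = self._gaussian(
                self.encoder(torch.cat([z, prev_action, obs[:, t]], dim=-1))
            )
            z = posterior.rsample()                    # reparameterized sample of z_t
            obs_dist = D.Normal(self.decoder(z), 1.0)  # p_theta(o_t | z_t), unit variance
            recon = recon - obs_dist.log_prob(obs[:, t]).sum(-1).mean()
            kl = kl + D.kl_divergence(posterior, prior).sum(-1).mean()
            prev_action = actions[:, t]
        return (recon + beta * kl) / T
```

Sampling the posterior with `rsample` keeps the objective differentiable, and the $\beta$-weighted KL term is what forces the inferred $z_t$ to stay consistent with the learned dynamics prior, mirroring the two terms of $\mathcal{L}_{\text{world}}$.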