Causal State-Space Models for Time Series
The Problem
A hospital trains a model to predict patient deterioration from vital signs. The model performs well — until the hospital introduces a new monitoring protocol. Predictions degrade. The model learned what happens when things proceed normally. It never learned why.
This is the limitation of correlation-based forecasting. The model picks up statistical patterns from historical data, but those patterns are tied to the specific process that generated the data. Change the process through an intervention, a new treatment, or a policy shift, and the correlations break down precisely when you need the model most.
The Insight
Forecasting and decision-making are different tasks. Forecasting asks: given how things usually go, what comes next? Decision-making asks: if I do X, what happens? Answering the second question requires knowing the causal structure — which variables cause which, and how the system responds to being changed from the outside.
A model that encodes the causal graph can simulate interventions directly: set a variable to a fixed value and propagate the consequences through the structure. A model without the graph cannot. It can extrapolate from the regimes it has observed, but it has no basis for predicting regimes it has never seen.
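A minimal sketch of that operation, assuming a linear causal model with a known graph. The weights, noise values, and the function name simulate are hypothetical and chosen only for illustration; they are not taken from the project.

```python
import numpy as np

def simulate(weights, noise, order, do=None):
    """Propagate values through a linear causal graph.

    weights[i, j] is the direct effect of node i on node j.
    order is a topological order of the nodes.
    do maps a node index to the value it is forced to take.
    """
    do = do or {}
    values = np.zeros(len(order))
    for j in order:
        if j in do:
            # Intervention: ignore the node's parents and fix its value.
            values[j] = do[j]
        else:
            # Natural mechanism: weighted sum of parents plus noise.
            values[j] = weights[:, j] @ values + noise[j]
    return values

# Hypothetical graph: 0 -> 1 -> 2, plus a direct edge 0 -> 2.
weights = np.array([[0.0, 2.0, 1.0],
                    [0.0, 0.0, 3.0],
                    [0.0, 0.0, 0.0]])
noise = np.array([1.0, 0.0, 0.0])
order = [0, 1, 2]

observed = simulate(weights, noise, order)                    # [1, 2, 7]
intervened = simulate(weights, noise, order, do={1: 5.0})     # [1, 5, 16]
```

Forcing node 1 changes its descendant (node 2) but leaves its cause (node 0) untouched. A purely correlational model has no way to express that asymmetry.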
What Was Built
The project integrates causal structure into state-space sequence models. The idea is that transitions between hidden states should follow the causal graph rather than an arbitrary learned mapping. The graph is estimated from data, then used to constrain how the latent state evolves, so that forcing a variable to a new value propagates along the graph's edges instead of spreading through unconstrained learned weights.
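One way to picture the constraint is a linear latent transition whose weight matrix is masked by the graph. The sketch below assumes one latent dimension per variable and a fixed toy adjacency matrix; masked_transition, rollout, and all numbers are hypothetical, and the actual model may impose the constraint differently.

```python
import numpy as np

def masked_transition(raw_weights, adjacency):
    """Keep only transition weights that correspond to an edge in the graph.

    adjacency[i, j] = 1 means variable j at time t may influence
    variable i at time t + 1 (self-loops included).
    """
    return raw_weights * adjacency

def rollout(transition, initial_state, steps, clamp=None):
    """Roll the state forward; clamp maps a variable index to a forced value."""
    clamp = clamp or {}
    state = np.array(initial_state, dtype=float)
    for idx, value in clamp.items():
        state[idx] = value
    trajectory = []
    for _ in range(steps):
        state = transition @ state
        # Re-apply the intervention after every transition.
        for idx, value in clamp.items():
            state[idx] = value
        trajectory.append(state.copy())
    return np.array(trajectory)

# Hypothetical chain 0 -> 1 -> 2 with self-loops.
adjacency = np.array([[1, 0, 0],
                      [1, 1, 0],
                      [0, 1, 1]])
raw_weights = np.full((3, 3), 0.5)  # stand-in for learned weights
constrained = masked_transition(raw_weights, adjacency)
start = [1.0, 0.0, 0.0]

free = rollout(constrained, start, steps=2)
forced = rollout(constrained, start, steps=2, clamp={1: 2.0})

# Same experiment with the unmasked (dense) weights for comparison.
free_dense = rollout(raw_weights, start, steps=2)
forced_dense = rollout(raw_weights, start, steps=2, clamp={1: 2.0})
```

With the mask, clamping variable 1 leaves the trajectory of its upstream cause (variable 0) unchanged and only alters variable 2. With the dense weights, the same clamp leaks into variable 0, which is the kind of spurious propagation the constraint is meant to prevent.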
The model can then simulate counterfactuals that forecasting models cannot: hold everything constant, change one variable, ask what follows. That operation — intervening on the system rather than predicting its natural trajectory — is the one that actually matters for decisions.
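For a single observed record, "hold everything constant, change one variable" is commonly formalized as the textbook abduction, action, prediction recipe. The sketch below applies it to a hypothetical linear causal model; the function name counterfactual, the weights, and the observed values are illustrative assumptions, and this is not necessarily how the project computes counterfactuals.

```python
import numpy as np

def counterfactual(weights, observed, order, do):
    """Counterfactual values for one observation under a linear causal model.

    weights[i, j] is the direct effect of node i on node j.
    do maps a node index to its forced value.
    """
    # Abduction: recover the noise terms implied by the observation.
    noise = observed - weights.T @ observed
    # Action and prediction: replay the mechanisms with the same noise,
    # overriding the intervened nodes.
    values = np.zeros_like(observed)
    for j in order:
        if j in do:
            values[j] = do[j]
        else:
            values[j] = weights[:, j] @ values + noise[j]
    return values

# Hypothetical graph: 0 -> 1 -> 2, plus a direct edge 0 -> 2.
weights = np.array([[0.0, 2.0, 1.0],
                    [0.0, 0.0, 3.0],
                    [0.0, 0.0, 0.0]])
observed = np.array([1.0, 2.5, 8.0])

cf = counterfactual(weights, observed, order=[0, 1, 2], do={1: 5.0})  # [1, 5, 15.5]
```

Keeping the recovered noise fixed is what "hold everything constant" means here: with an empty do, the function reproduces the observed record exactly.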
What It's Being Tested On
The model is being tested in three settings, each chosen because it poses a different causal challenge:
- Financial time series — observational data only; causal structure unknown but stable
- Clinical vitals (MIMIC-III) — real interventions with recorded outcomes
- Climate reanalysis (ERA5) — physical causal structure is known from first principles
The third setting is the strictest test, because the ground truth is known: if the model recovers the causal structure that physics dictates, that is strong evidence it is identifying real structure rather than statistical artifacts that happen to look like causes in the training data.
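Checking a recovered graph against a known one can be done with standard edge-level scores. The sketch below assumes both graphs are given as binary adjacency matrices; edge_scores and the example matrices are hypothetical and do not reflect the project's actual evaluation code or results.

```python
import numpy as np

def edge_scores(estimated, reference):
    """Edge precision, recall, and a simple structural Hamming distance.

    Both inputs are binary adjacency matrices where entry [i, j] = 1
    means there is a directed edge i -> j.
    """
    est = np.asarray(estimated).astype(bool)
    ref = np.asarray(reference).astype(bool)
    true_pos = np.sum(est & ref)
    precision = true_pos / max(est.sum(), 1)
    recall = true_pos / max(ref.sum(), 1)
    # Entry-wise mismatches; a reversed edge counts as two errors here.
    shd = int(np.sum(est != ref))
    return precision, recall, shd

# Hypothetical reference graph: 0 -> 1 -> 2.
reference = np.array([[0, 1, 0],
                      [0, 0, 1],
                      [0, 0, 0]])
# Hypothetical estimate: finds 0 -> 1, misses 1 -> 2, adds 0 -> 2.
estimated = np.array([[0, 1, 1],
                      [0, 0, 0],
                      [0, 0, 0]])

precision, recall, shd = edge_scores(estimated, reference)  # 0.5, 0.5, 2
```

High precision and recall against a physically derived graph would support the claim that the discovery step finds real structure. The same scores cannot be computed in the financial setting, where no reference graph exists.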
Status: ongoing. The causal discovery pipeline is complete; counterfactual evaluation is in progress.