The Setup

Train a classifier on cats. Then train it on dogs. By the time it learns dogs well, it has largely forgotten cats. This is catastrophic forgetting — the central problem in continual learning.

The standard engineering fix is to keep a replay buffer: store a sample of old training images and mix them in when learning new tasks. It works, but it is a workaround. You are fighting the symptom. This project asks the prior question: why does forgetting happen at all?

The Insight

When a network sees a cat, two things are happening simultaneously: it is encoding what makes a cat a cat and it is encoding the incidental visual context of that particular photo — the background, the lighting, the pose. These two causes of the image are independent, but the network conflates them. It builds a mental model of "cat" that includes the specific visual noise it happened to see during training.

When new tasks arrive and the weights shift, the noise-entangled prototypes drift. The class signal gets overwritten because it was never cleanly separated from the session-specific noise in the first place.

The fix follows directly from the causal structure: if class identity and visual nuisance are independent causes, then a prototype built by averaging over many visual conditions will be stable across sessions. The nuisance averages out. The class signal remains. No stored images needed.
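A toy simulation makes the averaging argument concrete. This is a sketch under stated assumptions, not the project's code: features are modeled as a fixed class signal plus an additive, zero-mean nuisance offset drawn independently of the class, and the names `class_signal` and `prototype` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 16
class_signal = rng.normal(size=DIM)  # hypothetical "true" class feature


def prototype(n_conditions: int, images_per_condition: int = 20) -> np.ndarray:
    """Mean feature over images drawn under `n_conditions` nuisance settings."""
    feats = []
    for _ in range(n_conditions):
        # One zero-mean nuisance offset per condition, drawn independently of the class.
        nuisance = rng.normal(size=DIM)
        jitter = 0.1 * rng.normal(size=(images_per_condition, DIM))
        feats.append(class_signal + nuisance + jitter)
    return np.concatenate(feats).mean(axis=0)


# Distance between each prototype and the underlying class signal.
narrow_error = np.linalg.norm(prototype(1) - class_signal)
broad_error = np.linalg.norm(prototype(500) - class_signal)
print(f"one condition:  {narrow_error:.3f}")
print(f"500 conditions: {broad_error:.3f}")
```

A prototype built under a single condition inherits that condition's offset. One averaged over many conditions lands much closer to the class signal, which is the stability the argument relies on.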

A second consequence: because all classes share the same nuisance distribution, pooling covariance estimates across classes is not an approximation — it is the statistically correct thing to do under this causal model.
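If the nuisance is additive and independent of the class, every class has the same within-class covariance, so all classes can contribute to a single estimate. The sketch below shows one common way to pool. The function name `pooled_covariance` and the toy data are illustrative assumptions, not taken from the project.

```python
import numpy as np


def pooled_covariance(features: np.ndarray, labels: np.ndarray) -> np.ndarray:
    """Within-class covariance pooled over every class seen so far."""
    classes = np.unique(labels)
    dim = features.shape[1]
    scatter = np.zeros((dim, dim))
    for c in classes:
        class_feats = features[labels == c]
        centered = class_feats - class_feats.mean(axis=0)
        scatter += centered.T @ centered
    # N - C degrees of freedom: one mean is estimated per class.
    return scatter / (len(features) - len(classes))


# Two classes with very different means but the same spread.
features = np.array([[0.0, 0.0], [2.0, 0.0], [10.0, 10.0], [12.0, 10.0]])
labels = np.array([0, 0, 1, 1])
pooled = pooled_covariance(features, labels)
print(pooled)
```

Centering each class on its own mean removes the class signal before pooling, so the distance between the two class means does not inflate the estimate.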

What Was Built

Five methods, each exploiting the causal structure at a different stage of the pipeline. The core classifier in all cases is Mahalanobis nearest-class-mean — simple, interpretable, and effective when the covariance is estimated correctly.
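For readers unfamiliar with the classifier, here is a minimal sketch of Mahalanobis nearest-class-mean with a shared covariance. The function name and the toy numbers are assumptions chosen for illustration. The pseudo-inverse is a convenience of this sketch, not something the project is known to use.

```python
import numpy as np


def mahalanobis_ncm_predict(x, prototypes, cov):
    """Index of the nearest prototype under a shared covariance."""
    precision = np.linalg.pinv(cov)  # pseudo-inverse tolerates a singular estimate
    diffs = x[:, None, :] - prototypes[None, :, :]  # (n, classes, dim)
    # Squared Mahalanobis distance from every query to every prototype.
    dist2 = np.einsum("ncd,de,nce->nc", diffs, precision, diffs)
    return dist2.argmin(axis=1)


prototypes = np.array([[0.0, 0.0], [3.0, 1.0]])
query = np.array([[0.5, 0.9]])
# High variance along the first axis, low variance along the second.
cov = np.diag([100.0, 0.01])

euclidean_pred = mahalanobis_ncm_predict(query, prototypes, np.eye(2))
mahalanobis_pred = mahalanobis_ncm_predict(query, prototypes, cov)
print(euclidean_pred, mahalanobis_pred)
```

With an identity covariance the query goes to the first prototype. With the anisotropic covariance it goes to the second, because a gap along the low-variance axis counts for far more than a gap along the high-variance one. That is why the quality of the covariance estimate matters so much here.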

NSM-LW pools covariance across all seen classes with Oracle Approximating Shrinkage, preventing the covariance matrix from collapsing on small per-class samples. CMD-NSM projects features into the class-discriminative subspace before building prototypes, filtering out the nuisance dimensions entirely. CT-NECIL works at test time only — average multiple augmented views of a test image and the nuisance cancels out, with no changes to training.
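Two of these ideas fit in a few lines. The sketch below is illustrative only. The shrinkage coefficient is fixed by hand, whereas Oracle Approximating Shrinkage estimates it from the data (a library estimator such as scikit-learn's `OAS` could supply it). The identity `embed` function stands in for a real backbone. The names `shrink_covariance` and `average_views` are hypothetical.

```python
import numpy as np


def shrink_covariance(cov: np.ndarray, shrinkage: float) -> np.ndarray:
    """Blend `cov` with a scaled identity; `shrinkage` lies in [0, 1]."""
    dim = cov.shape[0]
    mu = np.trace(cov) / dim  # average variance, keeps the overall scale
    return (1.0 - shrinkage) * cov + shrinkage * mu * np.eye(dim)


def average_views(embed, views) -> np.ndarray:
    """Mean embedding over several augmented views of one test image."""
    return np.mean([embed(v) for v in views], axis=0)


singular = np.array([[2.0, 0.0], [0.0, 0.0]])  # rank-deficient estimate
shrunk = shrink_covariance(singular, shrinkage=0.5)

clean = np.array([1.0, 2.0])
offset = np.array([0.3, -0.4])  # stand-in for augmentation nuisance
views = [clean + offset, clean - offset]
averaged = average_views(lambda v: v, views)  # identity embed, for illustration only
print(shrunk, averaged)
```

The shrunk matrix is invertible even though the raw estimate is not. The two toy views cancel exactly only because their offsets are symmetric by construction; with real augmentations the cancellation would be approximate and would improve with more views.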

NOR+ORTH regularizes backbone pretraining to be nuisance-orthogonal: features encoding class identity should not correlate with features encoding pose, background, or lighting. The right backbone makes every downstream method stronger.
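One way such a constraint could be scored is a cross-correlation penalty between two groups of features. This is a guess at the general shape of the idea, not the project's actual loss. The split into "class" and "nuisance" features, the function name, and the toy inputs are all assumptions.

```python
import numpy as np


def cross_correlation_penalty(class_feats, nuisance_feats, eps=1e-8):
    """Sum of squared correlations between every class/nuisance feature pair."""
    zc = (class_feats - class_feats.mean(axis=0)) / (class_feats.std(axis=0) + eps)
    zn = (nuisance_feats - nuisance_feats.mean(axis=0)) / (nuisance_feats.std(axis=0) + eps)
    corr = zc.T @ zn / len(zc)  # (class_dim, nuisance_dim)
    return float(np.sum(corr ** 2))


class_feats = np.array([[1.0], [-1.0], [1.0], [-1.0]])
uncorrelated = np.array([[1.0], [1.0], [-1.0], [-1.0]])

penalty_orthogonal = cross_correlation_penalty(class_feats, uncorrelated)
penalty_entangled = cross_correlation_penalty(class_feats, class_feats)
print(penalty_orthogonal, penalty_entangled)
```

The penalty is zero when the two feature groups are uncorrelated and rises toward one per feature pair as they become entangled. In training it would be added to the pretraining loss in a framework with automatic differentiation.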

What Happened

The no-exemplar methods outperformed replay-based baselines — often by a large margin. The best method beat DER++ (a strong replay baseline with 200 stored images) by over 30 percentage points on Split CIFAR-10, with zero stored examples.

One finding ran counter to expectation: explicit IRM regularization, which was the original theoretical motivation, hurt performance. With only four augmentation environments, IRMv1's gradient penalty is too noisy to be useful. The causal structure is better captured through architectural choices — pooling, subspace projection — than through direct invariance regularization with insufficient variation. The project ended with a cleaner, more honest understanding of where the theoretical motivation holds and where the engineering reality diverges.
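For context, IRMv1 penalizes the squared gradient of each environment's risk with respect to a dummy scale w fixed at 1. The sketch below uses squared loss so the gradient can be written in closed form. That choice, the function name, and the toy numbers are simplifying assumptions, not the project's setup.

```python
import numpy as np


def irmv1_penalty(preds_by_env, targets_by_env):
    """Sum over environments of the squared risk gradient at dummy scale w = 1."""
    penalty = 0.0
    for f, y in zip(preds_by_env, targets_by_env):
        # Risk R(w) = mean((w * f - y)^2), so dR/dw at w = 1 is mean(2 * (f - y) * f).
        grad = np.mean(2.0 * (f - y) * f)
        penalty += grad ** 2
    return float(penalty)


targets = [np.array([1.0, 1.0]), np.array([1.0, 1.0])]
calibrated = [np.array([1.0, 1.0]), np.array([1.0, 1.0])]
overscaled = [np.array([2.0, 2.0]), np.array([1.0, 1.0])]

penalty_calibrated = irmv1_penalty(calibrated, targets)
penalty_overscaled = irmv1_penalty(overscaled, targets)
print(penalty_calibrated, penalty_overscaled)
```

Each environment contributes a single squared gradient term, so with four environments the penalty is a sum of only four estimates, each computed from a finite batch. That is consistent with the noise problem described above.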

Why It Matters

Replay buffers are a pragmatic fix that raises real questions: storage grows with every new task, raw image retention has privacy implications, and the rebalancing introduces bias. A method grounded in causal structure sidesteps all of these, not by ignoring the constraints but by working with the actual structure of the problem. Forgetting is not a gradient instability to be patched. It is a consequence of mixing causes that should be kept separate.

Code and paper on GitHub →