Wasserstein Distance — What It Measures and Why darwintIQ Uses It

When a model's return distribution shifts, something has changed. Wasserstein distance is one of the sharpest tools for detecting it.

What Wasserstein Distance Actually Measures

Wasserstein distance, also known as the Earth Mover's Distance, quantifies the difference between two probability distributions by measuring how much "work" is required to transform one into the other. The name comes from the analogy of moving earth: if you think of each distribution as a pile of soil, the Wasserstein distance is the minimum effort needed to reshape one pile into the other, where effort is the amount of soil moved multiplied by the distance it travels.
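
For readers who want the formal statement behind the soil analogy, the first-order form can be written as below. This assumes the first-order distance (W_1) on one-dimensional returns; the article does not say which order darwintIQ uses. Here Γ(P, Q) is the set of joint distributions whose marginals are P and Q, and F_P and F_Q are the cumulative distribution functions of P and Q.

```latex
W_1(P, Q) = \inf_{\gamma \in \Gamma(P, Q)} \mathbb{E}_{(x, y) \sim \gamma}\bigl[\, |x - y| \,\bigr]
\qquad \text{and, in one dimension,} \qquad
W_1(P, Q) = \int_{-\infty}^{\infty} \bigl| F_P(x) - F_Q(x) \bigr| \, dx
```

The second expression is the one that makes the metric cheap to compute for trade returns: it is the area between the two cumulative distribution curves.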

In trading model evaluation, this translates into a direct question: how different is this model's current return distribution from the reference distribution we're comparing it against? A low Wasserstein distance means the two distributions are similar — the model is behaving as expected. A rising Wasserstein distance is a signal that something has shifted.
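
As a rough illustration of that comparison, the sketch below computes the first-order Wasserstein distance between two samples of per-trade returns by integrating the gap between their empirical CDFs. The function name wasserstein_1d and the sample values are hypothetical, and nothing here is taken from darwintIQ's own implementation.

```python
from bisect import bisect_right


def wasserstein_1d(sample_a, sample_b):
    """First-order Wasserstein distance between two 1-D empirical samples.

    Computed as the area between the two empirical CDFs.
    """
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    a = sorted(sample_a)
    b = sorted(sample_b)
    # The empirical CDFs only change at observed values, so it is enough
    # to integrate over the intervals between consecutive observations.
    points = sorted(set(a) | set(b))
    total = 0.0
    for left, right in zip(points, points[1:]):
        cdf_a = bisect_right(a, left) / len(a)  # share of a that is <= left
        cdf_b = bisect_right(b, left) / len(b)  # share of b that is <= left
        total += abs(cdf_a - cdf_b) * (right - left)
    return total


# Hypothetical per-trade returns, in percent.
reference = [0.5, 0.4, -0.2, 0.3, 0.6, -0.1]
unchanged = list(reference)
shifted = [r - 0.3 for r in reference]  # every trade 0.3 points worse

print(wasserstein_1d(reference, unchanged))
print(wasserstein_1d(reference, shifted))
```

Identical samples give a distance of zero, and shifting every return by 0.3 gives a distance of 0.3 (up to floating-point rounding), which is why the number can be read in the same units as the returns themselves.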

Why Distribution Shape Matters More Than Averages

The average return of a trading model is a starting point, not a conclusion. Two models can have identical average returns but radically different risk profiles, depending on how those returns are distributed across individual trades.

A model that produces mostly small, consistent gains with occasional modest losses has a very different distribution shape from a model with large, erratic swings in both directions — even if their mean return is the same. The Wasserstein distance doesn't just compare averages; it compares the entire shape of the distribution. This makes it sensitive to changes that simpler metrics would miss: a model that starts producing more outlier losses, or whose return distribution starts skewing in an unexpected direction, will register a rising Wasserstein distance even if the mean return hasn't yet deteriorated.
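
A small sketch can make the point concrete. The two return series below are invented so that their means are identical, and the helper wasserstein_equal_size is a hypothetical shortcut that is only valid when both samples contain the same number of trades (it pairs the sorted values and averages the gaps).

```python
from statistics import mean


def wasserstein_equal_size(a, b):
    """First-order Wasserstein distance for two equal-sized 1-D samples."""
    if not a or len(a) != len(b):
        raise ValueError("samples must be non-empty and the same size")
    # With equal sample sizes the optimal transport plan simply matches
    # the i-th smallest value of one sample to the i-th smallest of the other.
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)


# Invented per-trade results, in arbitrary return units.
steady = [2, 2, 3, -1, 2, 3, 2, -1]      # small gains, modest losses
erratic = [10, -8, 9, -7, 8, -6, 3, 3]   # large swings in both directions

print(mean(steady), mean(erratic))              # 1.5 1.5
print(wasserstein_equal_size(steady, erratic))  # 5.25
```

A comparison based on the mean alone would report no difference between these two models, while the Wasserstein distance reports a large one.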

This is what makes it valuable alongside metrics like Jensen–Shannon Divergence, which also detects distributional shift but uses a different mathematical approach. The two metrics are complementary: both are symmetric, but JSD is bounded, whereas Wasserstein distance is unbounded and expressed in the same units as the returns, which gives a more intuitive sense of magnitude.
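
The difference in how the two metrics report magnitude can be shown with a deliberately extreme toy case: all probability mass sitting in a single histogram bin, then moved further and further away. The helpers js_divergence, wasserstein_on_grid and point_mass are hypothetical names written for this sketch, which assumes both distributions are histograms over the same evenly spaced bins.

```python
import math


def js_divergence(p, q):
    """Jensen-Shannon divergence (natural log) between two histograms."""
    def kl(u, v):
        return sum(ui * math.log(ui / vi) for ui, vi in zip(u, v) if ui > 0)

    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def wasserstein_on_grid(p, q, bin_width=1.0):
    """First-order Wasserstein distance between two histograms on the same bins."""
    total = 0.0
    cdf_gap = 0.0
    for pi, qi in zip(p, q):
        cdf_gap += pi - qi            # running difference of the two CDFs
        total += abs(cdf_gap) * bin_width
    return total


def point_mass(index, n_bins=10):
    """Histogram with all probability in one bin."""
    return [1.0 if i == index else 0.0 for i in range(n_bins)]


base = point_mass(0)
for shift in (1, 4, 8):
    moved = point_mass(shift)
    print(shift, round(js_divergence(base, moved), 4), wasserstein_on_grid(base, moved))
# 1 0.6931 1.0
# 4 0.6931 4.0
# 8 0.6931 8.0
```

Once the two histograms no longer overlap, JSD sits at its maximum of ln 2 no matter how far apart they are, while the Wasserstein distance keeps growing with the size of the move.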

Wasserstein Distance in the Context of darwintIQ

In darwintIQ, Wasserstein distance appears alongside other distributional metrics including the KS Statistic, Mutual Information, and Population Stability Index. Together, these metrics form a layer of model evaluation that goes beyond headline performance numbers.

The core idea is that a model's robustness can only be assessed if you understand not just how much it has returned, but how consistently the underlying trade distribution is behaving. A model that passes the standard performance metrics — positive Expected Value, solid Profit Factor, reasonable Drawdown — but shows a high Wasserstein distance may be doing so through a distribution of returns that has shifted meaningfully from its earlier behaviour. That shift is worth investigating before treating the model as a reliable signal.

darwintIQ evaluates models over a rolling 4-hour window, meaning the distributional metrics are continuously recalculated as new trade data arrives. A model that consistently maintains a low Wasserstein distance is one whose behaviour is stable — its current trades look like its historical trades. This stability, captured across multiple metrics, is a core input into the Robustness Score.
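
The article does not describe how this recalculation is implemented, so the following is only a sketch of one way a rolling comparison could work. The class RollingDrift, the fixed reference sample, and the assumption that trades arrive in time order are all hypothetical; only the 4-hour window comes from the text above.

```python
from bisect import bisect_right
from collections import deque
from datetime import datetime, timedelta


def wasserstein_1d(a, b):
    """First-order Wasserstein distance: area between the empirical CDFs."""
    a, b = sorted(a), sorted(b)
    points = sorted(set(a) | set(b))
    return sum(
        abs(bisect_right(a, left) / len(a) - bisect_right(b, left) / len(b))
        * (right - left)
        for left, right in zip(points, points[1:])
    )


class RollingDrift:
    """Compares trades inside a rolling time window with a reference sample."""

    def __init__(self, reference_returns, window=timedelta(hours=4)):
        self.reference = list(reference_returns)
        self.window = window
        self.trades = deque()  # (timestamp, trade_return), oldest first

    def add_trade(self, timestamp, trade_return):
        """Record a trade and return the distance for the current window."""
        self.trades.append((timestamp, trade_return))
        cutoff = timestamp - self.window
        # Drop trades that have fallen out of the window.
        while self.trades[0][0] < cutoff:
            self.trades.popleft()
        current = [r for _, r in self.trades]
        return wasserstein_1d(self.reference, current)


# Hypothetical usage with invented timestamps and returns.
monitor = RollingDrift(reference_returns=[1.0, 2.0, 1.0, 2.0])
start = datetime(2024, 1, 1, 9, 0)
print(monitor.add_trade(start, 1.0))
print(monitor.add_trade(start + timedelta(hours=1), 2.0))
print(monitor.add_trade(start + timedelta(hours=6), -3.0))
```

In this toy run the distance falls to zero once the window contains the same mix of results as the reference, then jumps when the older trades expire and a single large loss is all that remains in the window.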

How to Interpret a Rising Wasserstein Distance

A rising Wasserstein distance on its own is not a reason to discard a model — it's a reason to look more carefully. The first question is whether other distributional metrics are also rising. If the KS Statistic and Jensen–Shannon Divergence are elevated alongside Wasserstein distance, the evidence of distribution shift is stronger.

The second question is direction: is the distribution shifting toward better or worse outcomes? A model whose distribution is shifting toward a higher proportion of positive returns is behaving differently, but not necessarily worse. Context from the Drawdown, Sortino Ratio, and Stability Score helps to interpret whether the shift is meaningful or benign.

In practice, a model with a high Wasserstein distance and declining headline metrics is showing clear signs of degradation. A model with a high Wasserstein distance but stable headline metrics warrants monitoring but not immediate concern.
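
The reading rules in this section can be summarised as a small decision function. This is an illustrative sketch only: the names DriftReading and triage, the status labels, and every threshold value are hypothetical, and the article does not state what cut-offs darwintIQ applies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DriftReading:
    wasserstein: float
    ks_statistic: float
    js_divergence: float
    headline_declining: bool  # e.g. Expected Value or Profit Factor falling


def triage(reading, *, w_limit, ks_limit, jsd_limit):
    """Return (status, corroborated) for one model reading.

    status is "stable", "monitor" or "degrading"; corroborated is True when
    the KS Statistic and JSD are elevated alongside the Wasserstein distance.
    """
    if reading.wasserstein <= w_limit:
        return ("stable", False)
    corroborated = (
        reading.ks_statistic > ks_limit and reading.js_divergence > jsd_limit
    )
    status = "degrading" if reading.headline_declining else "monitor"
    return (status, corroborated)


# Hypothetical thresholds and readings, chosen only for the example.
limits = dict(w_limit=0.5, ks_limit=0.2, jsd_limit=0.1)
print(triage(DriftReading(0.9, 0.35, 0.18, True), **limits))
print(triage(DriftReading(0.9, 0.10, 0.05, False), **limits))
print(triage(DriftReading(0.2, 0.05, 0.02, False), **limits))
```

The function deliberately leaves out the question of direction (whether the shift is toward better or worse outcomes), which still needs to be judged from the other metrics.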

Final Thoughts

Wasserstein distance measures the gap between what a model used to do and what it's doing now. It's a sensitive, mathematically rigorous way to detect drift before it becomes obvious in simpler metrics. In darwintIQ's evaluation framework, it sits alongside other distributional statistics to give a richer, more complete picture of model behaviour than any single number can provide.

The models that stay at the top of the darwintIQ rankings over time tend to be those where not just the headline metrics hold up, but the distribution of trade outcomes remains coherent and stable — which is exactly what a low Wasserstein distance indicates.