The Calmar Ratio — Return Paid For in Drawdown
Sharpe punishes volatility. Sortino punishes downside. Calmar punishes the worst thing that ever happened.
The Calmar ratio measures a trading strategy's annualised return divided by its maximum drawdown. A Calmar ratio of 2.0 means the strategy earned twice as much per year, on average, as the deepest peak-to-trough loss it produced. It is one of the most direct ways of asking the question every trader eventually faces: was the return worth the pain required to extract it?
Inside darwintIQ, the Calmar ratio sits alongside Sharpe, Sortino, and the Stability Score as part of the wider performance profile, because each of those metrics punishes a different kind of risk and the Calmar punishes the one that tends to drive people out of strategies in real life: a deep, unexpected drawdown.
How the Calmar ratio is calculated
The formula is straightforward. Take the strategy's compound annual growth rate over a chosen period. Take the maximum drawdown observed during that same period: the largest percentage decline from an equity peak to a later trough, expressed as a positive number. Divide the first by the second.
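Written out, the calculation described above looks roughly like this. The symbols are assumptions introduced here for illustration: V_t is the equity value at time t, V_0 and V_T are the first and last values in the period, and Y is the length of the period in years.

```latex
\[
\text{Calmar} = \frac{\text{CAGR}}{\text{MDD}},
\qquad
\text{CAGR} = \left(\frac{V_T}{V_0}\right)^{1/Y} - 1,
\qquad
\text{MDD} = \max_{0 \le t \le T} \frac{\max_{0 \le s \le t} V_s - V_t}{\max_{0 \le s \le t} V_s}
\]
```

With MDD defined as a positive fraction of the prior peak, the ratio is positive whenever the strategy made money over the period.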
A strategy that returned 30% annualised with a maximum drawdown of 15% has a Calmar ratio of 2.0. A strategy that returned 30% annualised with a maximum drawdown of 30% has a Calmar ratio of 1.0. The two strategies produced identical returns, but their Calmar ratios disagree by a factor of two — because the second one made the trader live through twice the damage to get there.
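The two cases above can be reproduced with a minimal Python sketch. The function names and the two four-point equity curves are hypothetical, chosen only so that one year of trading ends at +30% with a 15% dip in one case and a 30% dip in the other. This is an illustration of the arithmetic, not darwintIQ's implementation.

```python
from typing import Sequence


def cagr(equity: Sequence[float], years: float) -> float:
    """Compound annual growth rate from the first to the last equity value."""
    return (equity[-1] / equity[0]) ** (1.0 / years) - 1.0


def max_drawdown(equity: Sequence[float]) -> float:
    """Largest peak-to-trough decline, as a positive fraction of the peak."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)                    # highest equity seen so far
        worst = max(worst, (peak - value) / peak)  # decline from that peak
    return worst


def calmar_ratio(equity: Sequence[float], years: float) -> float:
    """Annualised return divided by maximum drawdown."""
    mdd = max_drawdown(equity)
    if mdd == 0.0:
        return float("inf")  # no drawdown observed; inf is a convention here
    return cagr(equity, years) / mdd


# Hypothetical one-year equity curves: same end value, different worst dip.
shallow = [100.0, 120.0, 102.0, 130.0]  # dips 15% from the 120 peak
deep = [100.0, 120.0, 84.0, 130.0]      # dips 30% from the 120 peak

print(round(calmar_ratio(shallow, 1.0), 2))  # 2.0
print(round(calmar_ratio(deep, 1.0), 2))     # 1.0
```

Both curves finish at the same place, so the numerator is identical; only the denominator changes.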
That difference is the entire point of the metric. Returns do not exist in isolation; they are paid for in the drawdown the strategy required. The Calmar ratio puts those two things on the same page.
Why drawdown is a different kind of risk
Most performance metrics, including Sharpe, Sortino, and profit factor, describe properties of the distribution of returns or trade outcomes. They average across many observations and produce a number that summarises the strategy's behaviour in aggregate.
Drawdown is different. It describes the worst path the strategy walked, not the average one. A strategy that has a respectable Sharpe ratio and a 40% maximum drawdown is statistically reasonable and emotionally untradeable. Aggregate metrics struggle to express that gap; the Calmar ratio simply states it.
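One way to see the gap between aggregate metrics and the path is to take the same set of returns in two different orders. The sketch below uses made-up per-period returns and a hypothetical helper name. The mean and standard deviation, and therefore any Sharpe-style ratio built from them, are identical in both orderings, while the maximum drawdown is not.

```python
from statistics import mean, stdev
from typing import Sequence


def max_drawdown_from_returns(returns: Sequence[float]) -> float:
    """Compound the returns into an equity path and track the deepest decline."""
    equity = peak = 1.0
    worst = 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, (peak - equity) / peak)
    return worst


# Hypothetical data: six +4% periods and six -3% periods, ordered two ways.
interleaved = [0.04, -0.03] * 6           # each loss follows a gain
clustered = [0.04] * 6 + [-0.03] * 6      # all losses arrive together

for name, rets in (("interleaved", interleaved), ("clustered", clustered)):
    print(
        name,
        "mean:", round(mean(rets), 4),
        "stdev:", round(stdev(rets), 4),
        "max drawdown:", round(max_drawdown_from_returns(rets), 4),
    )
```

The interleaved path never falls more than about 3% from a peak. The clustered path loses roughly 17% in one stretch, even though its return distribution is the same.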
This matters because real money behaves very differently from simulated money. A drawdown that looks acceptable in a backtest, viewed as a single number, feels considerably larger when it is unfolding in front of you over several weeks. The Calmar ratio is, in a sense, a sanity check on whether the headline return came at a cost the trader was actually willing to absorb.
How the Calmar ratio compares with Sharpe and Sortino
The Sharpe ratio divides return by total volatility, treating upside and downside swings as equally undesirable. The Sortino ratio refines this by penalising only downside volatility — recognising that upside variance is not really a problem. Both are statistical descriptions of the return distribution.
The Calmar ratio steps outside the distribution entirely. It does not care how often the strategy was volatile. It cares about how far it fell at its worst. A strategy can have a strong Sharpe and Sortino and still produce a punishing maximum drawdown — usually because the bad moments, when they happened, were severe enough to dominate the path even though the average behaviour was tidy.
For darwintIQ users comparing models in the Trader Detail view, the three ratios together produce a much sharper picture than any one of them alone. Sharpe tells you about ordinary volatility. Sortino tells you about downside volatility. Calmar tells you about the worst moment. A model that scores well on all three is genuinely scarce — which is precisely why the combination is more informative than any single metric.
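To make the comparison concrete, here is a small sketch that computes all three ratios from one series of per-period returns. It is not how darwintIQ calculates them. The function name, the sample returns, the zero risk-free rate, the zero target return for Sortino, and the 252-periods-per-year annualisation are all assumptions made for the example.

```python
import math
from statistics import mean, stdev
from typing import Dict, Sequence


def risk_profile(returns: Sequence[float], periods_per_year: int = 252) -> Dict[str, float]:
    """Sharpe, Sortino and Calmar from simple per-period returns.

    Assumes a zero risk-free rate, a zero Sortino target, and at least
    two returns with non-zero variance.
    """
    avg = mean(returns)
    annualise = math.sqrt(periods_per_year)

    # Sharpe: mean return over total volatility (upside and downside alike).
    sharpe = avg / stdev(returns) * annualise

    # Sortino: mean return over downside deviation (only losses contribute).
    downside = math.sqrt(sum(min(r, 0.0) ** 2 for r in returns) / len(returns))
    sortino = avg / downside * annualise if downside > 0 else float("inf")

    # Calmar: annualised growth over the deepest peak-to-trough decline.
    equity = peak = 1.0
    mdd = 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        mdd = max(mdd, (peak - equity) / peak)
    years = len(returns) / periods_per_year
    cagr = equity ** (1.0 / years) - 1.0
    calmar = cagr / mdd if mdd > 0 else float("inf")

    return {"sharpe": sharpe, "sortino": sortino, "calmar": calmar, "max_drawdown": mdd}


# Hypothetical daily returns, far too few for a meaningful reading.
sample = [0.01, -0.005, 0.012, -0.02, 0.015, 0.008, -0.004, 0.01]
profile = risk_profile(sample)
for name, value in profile.items():
    print(f"{name}: {value:.3f}")
```

Eight observations are only enough to show the mechanics; annualising such a short sample inflates every figure, which is one more reason to read the three numbers over a long window.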
What a useful Calmar ratio threshold looks like
For futures, forex, and other leveraged strategies, a Calmar ratio above 1.0 is the rough minimum for a strategy to be worth running with real capital. Below 1.0, the strategy is earning less per year than the depth of the loss it can produce — a position that is hard to defend even when other metrics look acceptable.
A Calmar ratio between 1.0 and 2.0 is the workmanlike range — meaningful return relative to the risk taken, but nothing exceptional. Ratios above 2.0 are uncommon over long periods and almost always reflect strategies with strict drawdown control rather than aggressive return seeking. Ratios above 3.0, sustained across years, are rare enough to deserve scepticism until corroborated by walk-forward validation and consistent Stability Scores.
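The bands described above can be written down as a simple lookup. This is a sketch of the rule of thumb in this article, not a darwintIQ feature. The function name and labels are made up, and the handling of the exact boundary values 1.0, 2.0, and 3.0 is an assumption, since the text leaves it open.

```python
def calmar_band(calmar: float) -> str:
    """Map a Calmar ratio to the rough bands used in this article."""
    if calmar < 1.0:
        return "below minimum"        # earns less per year than its worst loss
    if calmar <= 2.0:
        return "workmanlike"          # meaningful return for the risk taken
    if calmar <= 3.0:
        return "uncommon"             # typically strict drawdown control
    return "treat with scepticism"    # rare; corroborate before trusting


for value in (0.6, 1.4, 2.5, 3.8):
    print(value, "->", calmar_band(value))
```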
One caveat matters: Calmar is sensitive to the measurement window. A strategy that has been running for six months may have a flattering Calmar ratio simply because its worst drawdown has not yet happened. The longer the measurement window, the more meaningful the figure — and the more likely the maximum drawdown reflects something close to a realistic worst case.
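The window effect is easy to demonstrate. In the sketch below, the month-end equity values are invented for illustration and the function name is hypothetical. The same strategy is measured over its first six months and then over the full two years, and the only thing that changes is whether the worst drawdown has already happened.

```python
from typing import Sequence


def calmar(equity: Sequence[float], periods_per_year: int = 12) -> float:
    """Calmar ratio from evenly spaced equity values (monthly by default)."""
    years = (len(equity) - 1) / periods_per_year
    cagr = (equity[-1] / equity[0]) ** (1.0 / years) - 1.0
    peak, mdd = equity[0], 0.0
    for value in equity:
        peak = max(peak, value)
        mdd = max(mdd, (peak - value) / peak)
    return cagr / mdd if mdd > 0 else float("inf")


# Hypothetical month-end equity: a quiet first half-year, then a deep dip.
equity = [
    100, 103, 105, 104, 108, 111, 114,             # start plus months 1-6
    118, 121, 109, 97, 101, 108, 115, 121, 126,    # months 7-15, drawdown here
    131, 135, 140, 144, 149, 153, 158, 163, 168,   # months 16-24
]

first_six_months = calmar(equity[:7])
full_two_years = calmar(equity)
print(round(first_six_months, 1), round(full_two_years, 1))
```

The annualised growth rate is close to 30% in both windows. The six-month figure comes out above 30 because the only dip so far is under 1%, while the two-year figure settles near 1.5 once the roughly 20% drawdown is included.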
Final thoughts
The Calmar ratio asks the right question — was the return worth the worst loss — and answers it in one number. That makes it especially useful for comparing trading strategies that look similar on average but behave very differently at their worst moments. Read it alongside Sharpe, Sortino, and Stability Score, and watch the measurement window: a Calmar ratio is only as honest as the period it was calculated over. Used carefully, it is one of the cleanest filters for separating trading models that are merely profitable from trading models that are profitable in a way you could actually live with.