
The Danger of Curve Fitting — When Optimisation Becomes a Trap

A strategy that has been perfectly shaped to the past is not a strategy. It's a description of history.

Curve fitting is one of the most common reasons systematic trading strategies fail in live markets despite looking compelling in backtesting.

It occurs when the parameters of a strategy are adjusted — deliberately or through repeated testing — to fit the historical data so closely that the model is no longer capturing a genuine pattern. It is capturing noise. And noise does not repeat.

What curve fitting actually looks like

Imagine testing a simple moving average crossover system. You run it with a 10/50 parameter combination and it produces modest results. You then test 10/30, then 10/40, then 20/50, and so on across dozens of combinations. Eventually you find a combination — say 13/47 — that produces excellent results. The equity curve is smooth, drawdowns are controlled, the metrics are strong.

The problem is that you have not discovered a robust strategy. You have discovered which parameters happened to work best on that particular slice of history. The 13/47 combination was selected precisely because it worked on that data. The fact that it worked is no longer evidence of anything predictive — it is merely evidence that you searched thoroughly enough.

This process, sometimes called data mining bias or over-optimisation, is the core of curve fitting. The more parameters you test, and the more combinations you try, the higher the probability that something will look good by chance alone.
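The effect is easy to reproduce with pure noise. The sketch below scores a few hundred coin-flip "strategies" whose trades are random by construction; the strategy count, trade count, and Sharpe-like metric are illustrative choices, not a standard test. The best performer still looks like an edge, purely because the search was thorough:

```python
import random

random.seed(42)  # reproducible noise

def score(returns):
    """Sharpe-like ratio: mean trade return over its standard deviation."""
    mean = sum(returns) / len(returns)
    var = sum((r - mean) ** 2 for r in returns) / len(returns)
    return mean / var ** 0.5 if var else 0.0

# 200 "strategies" whose trades are pure Gaussian noise: no edge exists.
n_strategies, n_trades = 200, 100
scores = [score([random.gauss(0, 1) for _ in range(n_trades)])
          for _ in range(n_strategies)]

best = max(scores)
# The best score sits well above zero -- not because any strategy works,
# but because 200 draws from noise are enough for one to look good.
print(f"average score: {sum(scores) / len(scores):+.3f}")
print(f"best score:    {best:+.3f}")
```

The average score hovers near zero, as it should for noise; it is only the act of selecting the maximum that manufactures an apparent edge.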

Why curve-fitted strategies fail live

A curve-fitted strategy has been shaped to the specific characteristics of its training data. It has absorbed the particular sequences of volatility, trend, and noise that happened to occur in that period. When market conditions shift — as they inevitably do — the model's carefully tuned parameters no longer correspond to anything real.

The symptoms are recognisable: a strategy that backtests beautifully and then loses money almost immediately in live trading. Or one that performs well for a few weeks before degrading. The live market does not behave the way the historical data did, because no two periods are identical.

This is closely related to why backtests lie. Backtesting provides a necessary but insufficient filter for strategy quality. A bad backtest is definitely bad. But a good backtest, particularly one achieved through extensive parameter search, may be telling you very little about the future.

How to recognise over-optimisation

There are several practical signals that a strategy may be over-optimised.

The most obvious is a very narrow peak in performance. If a set of parameters performs excellently but results degrade sharply as soon as any parameter is changed by a small amount, the strategy is likely curve-fitted. A genuine edge tends to be robust: changing the parameters slightly may affect performance modestly, but should not cause it to collapse.
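One practical way to probe this is a neighbourhood test: after a parameter sweep, compare the best cell of the results grid against the cells adjacent to it. The sketch below is a minimal illustration; the example grids, the 50% tolerance, and the function names are assumptions for demonstration, not a standard formula:

```python
# A robust edge forms a plateau in the parameter grid;
# a curve-fitted one forms a narrow spike.

def neighbour_scores(grid, i, j):
    """Scores of the cells adjacent to (i, j) in a 2-D parameter grid."""
    out = []
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            if (di, dj) == (0, 0):
                continue
            ni, nj = i + di, j + dj
            if 0 <= ni < len(grid) and 0 <= nj < len(grid[0]):
                out.append(grid[ni][nj])
    return out

def is_fragile(grid, i, j, tolerance=0.5):
    """Flag a parameter choice whose neighbours lose most of its performance."""
    peak = grid[i][j]
    neigh = neighbour_scores(grid, i, j)
    return sum(neigh) / len(neigh) < tolerance * peak

plateau = [[1.0, 1.1, 1.0],
           [1.1, 1.2, 1.1],
           [1.0, 1.1, 1.0]]
spike = [[0.1, 0.1, 0.1],
         [0.1, 2.0, 0.1],
         [0.1, 0.1, 0.1]]

print(is_fragile(plateau, 1, 1))  # False: neighbours hold up
print(is_fragile(spike, 1, 1))    # True: performance collapses nearby
```

The plateau passes because neighbouring parameters retain most of the peak's performance; the spike fails because the result exists only at one exact setting.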

Another signal is complexity without justification. Every additional parameter in a model is an additional degree of freedom — an additional way the model can be fitted to past data. A strategy with many parameters and a small trade sample is almost certainly over-fitted.
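A crude but useful screen is to demand a minimum trade sample per free parameter. The threshold below is an illustrative rule of thumb of my own choosing, not a statistical guarantee; the point is only that each added parameter raises the evidence required to trust the result:

```python
def overfit_warning(n_trades, n_params, min_trades_per_param=30):
    """Rough screen: require a minimum trade sample per free parameter.

    The threshold of 30 trades per parameter is an illustrative rule of
    thumb, not a statistical guarantee.
    """
    return n_trades < n_params * min_trades_per_param

print(overfit_warning(n_trades=500, n_params=3))  # False: ample data per knob
print(overfit_warning(n_trades=80, n_params=6))   # True: far too little data
```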

darwintIQ's robustness score captures something related: it measures the consistency and stability of a model's behaviour across the evaluation window, flagging results that look too clean, or too erratic, to be trustworthy.

How darwintIQ approaches the problem structurally

The architecture of darwintIQ is specifically designed to reduce the impact of curve fitting.

The Genetic Algorithm does not optimise parameters on historical data and then freeze them. Instead, it continuously evolves a population of models on a rolling 4-hour evaluation window. Models that perform well under current live conditions rise in ranking; those that do not are replaced.

This means no model is selected because it looked good on historical data it was optimised to fit. Every model in the ranking earned its position by performing under the conditions actually occurring now.

The rolling evaluation window also means the system naturally discards models that benefited from conditions that no longer apply. A model that was well-suited to the volatility of last month but poorly suited to today's market behaviour will not maintain a high position. The population continuously adapts.
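The mechanism can be illustrated with a toy sketch of rolling, population-based selection. This is a conceptual illustration of the idea, not darwintIQ's actual implementation: each "model" here is just a single threshold, the scoring metric, population size, and mutation scheme are all invented for the example, and the market regime is simulated as a slowly drifting mean:

```python
import random

random.seed(7)

# Toy illustration of rolling, population-based selection -- NOT
# darwintIQ's actual implementation. Each "model" is one threshold;
# score_on_window stands in for a live-performance metric.

POP_SIZE = 20

def score_on_window(model, window):
    # Higher is better: reward thresholds close to the window's mean.
    return -abs(model["threshold"] - sum(window) / len(window))

def evolve_step(population, window):
    """Re-rank on the latest window; replace the bottom quartile with mutants."""
    ranked = sorted(population, key=lambda m: score_on_window(m, window),
                    reverse=True)
    survivors = ranked[: POP_SIZE * 3 // 4]
    children = [{"threshold": random.choice(survivors)["threshold"]
                 + random.gauss(0, 0.1)}
                for _ in range(POP_SIZE - len(survivors))]
    return survivors + children

population = [{"threshold": random.uniform(-1, 1)} for _ in range(POP_SIZE)]
for step in range(50):
    # The evaluation window drifts: conditions that rewarded old models fade.
    regime_mean = 0.5 * step / 49
    window = [random.gauss(regime_mean, 0.2) for _ in range(32)]
    population = evolve_step(population, window)

best = max(population, key=lambda m: score_on_window(m, window))
# The surviving best threshold has tracked the drifting regime rather
# than staying frozen at whatever worked in the earliest windows.
print(round(best["threshold"], 2))
```

Because every re-ranking uses only the most recent window, a model tuned to an earlier regime loses rank as soon as that regime fades, which is the structural property described above.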

Final thoughts

Curve fitting is not a failure of effort — it is a consequence of how optimisation works. The more thoroughly you search for parameters that work on historical data, the more likely you are to find ones that work specifically on that data, and nowhere else. Recognising this, and building systems that evaluate strategies on conditions they were not optimised for, is the foundation of building models that actually hold up in practice.