Turtle Trading Rules

Chapter 22 The Lies of Historical Testing (3)
Historical testing is said to have predictive value because its results give some indication of how a trader will perform in the future. The more closely the future resembles the past, the more closely actual trading results will match the historical simulation. As a method of system analysis, however, historical testing has a fundamental problem: the future is never identical to the past. Yet a system can still profit by exploiting enduring features of human behavior that are reflected in the market; from this perspective, the past is a simulation of the future, though not an exact one. Test results obtained with fully optimized parameters represent a very particular outcome: what the system would have earned in past trading had it been run with the best possible parameter values. Such a simulation therefore represents the most optimistic possible view of history.

If the future were exactly like the past, actual trading would reproduce that result, but the future is never the same as the past. Now look back at the graphs earlier in this chapter: each is shaped like a mountain with a single peak. A graph such as Figure 11-4 can be used to represent the results for a given parameter across its range of values.

If point A represents a typical non-optimized parameter value and point B represents an optimized value, then I would say this: the B value is the better choice for actual trading, but if you use it, your future trading results will likely be worse than the historical test results obtained at B.

In contrast, the A value will not perform as well as the B value in actual trading, but its predictive value is higher: if you trade with A, your future results will be more consistent with the historical test results at A. In other words, future results are equally likely to be better or worse than the historical test results.

Why? To understand this, suppose the future differs from the past just enough to shift the curve in Figure 11-4 a little to the left or to the right, though we cannot know which. This shift represents the possible range of movement in the relative positions of the A and B values in the future, which we can call the margin of error.

For point A, if its relative position shifts to the left, the corresponding system performance falls below the level at A; if it shifts to the right, performance improves. The test results at parameter value A therefore have good predictive value regardless of how the future changes, because they are equally likely to overestimate or underestimate future performance.

Point B is different. Whether the curve shifts to the left or to the right, system performance falls. This means the results predicted at the B value are likely to overestimate actual future results. When this effect is compounded across many different parameters, the impact of future changes compounds as well: the more parameters you optimize, the harder it becomes for actual future results to match the results predicted at those optimal values.
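
To make the geometry concrete, here is a small sketch of the idea just described. The hill-shaped performance function, the parameter values chosen for A and B, and the size of the shift are all invented for illustration; they are not the author's test data.

```python
# A toy model of the optimization paradox: performance is a smooth hill over one
# parameter, and the "margin of error" is a small, unknown shift of that hill.
import numpy as np

def performance(param, peak=20.0, width=10.0):
    """Hypothetical performance hill: highest at `peak`, falling off on both sides."""
    return np.exp(-((param - peak) / width) ** 2)

A, B = 12.0, 20.0   # A: non-optimized shoulder value, B: optimized peak value
shift = 3.0         # how far the hill might move in the future (direction unknown)

for name, p in (("A", A), ("B", B)):
    base = performance(p)                        # backtest result
    left = performance(p, peak=20.0 - shift)     # future hill shifted one way
    right = performance(p, peak=20.0 + shift)    # future hill shifted the other way
    print(f"{name}: backtest={base:.3f}  future(left)={left:.3f}  future(right)={right:.3f}")

# Output pattern: for A one shift helps and the other hurts, so the backtest is
# roughly unbiased; for B both shifts hurt, so the backtest overstates the future.
```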

None of this means we should use parameter A in real trading. Even if the future changes considerably, system performance in the neighborhood of point B is still higher than in the neighborhood of point A. So although optimization reduces predictive value, you should still use the optimized parameters: whatever the future brings, they are more likely to deliver the results you want.

The optimization paradox has become a breeding ground for scams and tricks. Plenty of unscrupulous system peddlers brag about the sky-high profits and unbelievable performance they obtain by optimizing (especially over short periods) in specific markets, knowing full well that such historical test results cannot be realized in actual trading. But the fact that optimization invites hyperbole does not mean we should avoid it. In fact, optimization is crucial to building an effective trading system.

Overfitting or Curve Fitting

Scammers use other methods as well to create unrealistic historical test results. The boldest of them will deliberately use overfitting, or curve fitting, to dress up their systems. People often confuse overfitting with the optimization paradox, but they are not the same thing.

Overfitting usually occurs when a system becomes too complex. You can sometimes improve a system's historical performance by adding rules, but only because those rules happened to affect a handful of significant trades. Adding rules can lead to overfitting, especially when they apply to trades that occur during critical periods. For example, a rule that has you exit a particularly large winning position near its peak will certainly improve measured performance, but if the rule does not apply broadly to other situations, it amounts to overfitting.

I have seen many system vendors use this trick to improve a system's apparent performance after a period of relative sluggishness. They sometimes call the modified system an "enhanced version" or "second generation" of the original. If you are considering buying such an "enhanced" system, examine the newly added rules carefully to make sure the improvement is not the result of overfitting.

I find that extreme examples often make a phenomenon easier to understand, so let me give an example of extreme overfitting. I will start with a very simple dual moving average system, then add some rules and begin overfitting the data.

We know that the system suffered a very severe drawdown during the last six months of the test period, so I will add a new rule intended to improve performance by addressing that drawdown: when the drawdown reaches a certain level, I reduce position size by a certain percentage; once the drawdown period is over, I restore positions to their normal level.

Let's add this new rule to the system. It has two parameters to optimize: the percentage by which positions are cut, and the drawdown threshold that triggers the cut. Looking at the simulated equity curve, I decided to reduce positions by 90% when the drawdown reached 38%. Adding this rule greatly improved the system's performance: the return rose from 41.4% without the rule to 45.7%, the maximum drawdown fell from 56% to 39.2%, and the MAR ratio climbed from 0.74 to 1.17. You may be thinking, "What a great rule; the system has improved a lot." In fact, you would be completely wrong!
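
For readers who want to see what such a rule looks like in code, here is a minimal sketch. The 38% threshold and 90% reduction follow the text above, but the function name, the equity-curve input, and the simple "restore full size once the drawdown falls back under the threshold" behavior are illustrative assumptions, not the author's actual implementation.

```python
# A minimal sketch of a drawdown-triggered position-sizing rule.
def position_scale(equity_curve, drawdown_limit=0.38, reduction=0.90):
    """Return the position-size multiplier for the latest point of `equity_curve`.

    While the current drawdown from the running equity peak exceeds
    `drawdown_limit`, positions are cut by `reduction` (i.e. traded at 10% of
    normal size); once the drawdown recovers, full size is restored.
    """
    peak = max(equity_curve)
    drawdown = 1.0 - equity_curve[-1] / peak
    return (1.0 - reduction) if drawdown > drawdown_limit else 1.0

# Example: equity fell from a peak of 100 to 60, a 40% drawdown, so trade at 10% size.
print(f"{position_scale([100.0, 80.0, 60.0]):.2f}")  # -> 0.10
```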

The problem is that this rule takes effect only once during the entire test period. That drawdown happened near the end of the test, and I added the rule precisely because I already knew the shape of the equity curve; the system has been deliberately fitted to the data. "So what's the big deal?" you might ask. Look at Figure 11-6, which shows the MAR ratio for different drawdown thresholds.

It is clear that when the drawdown threshold drops to 37% or below, the system's performance plummets. Lowering the threshold by just 1% turns a 45.7% annual gain into a 0.4% annual loss. Why? With the lower threshold, the rule also takes effect in August 1996, drastically cutting position size at that point, and the subsequent profits are then too small to recover from the drawdown. So the rule is not so good after all. It worked in the first test only because that drawdown came at the end of the test period, so the effect of the position reduction on later performance never showed up.

When small changes in a parameter value produce drastic changes in trading results, traders call the phenomenon a cliff. The appearance of a cliff is a good sign that you may have overfitted and that your actual trading results may differ greatly from your test results. The cliff phenomenon is also one reason parameter optimization is worthwhile: by running an optimization, you can find the cliff and fix it before you start trading.
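
One practical way to hunt for cliffs is to sweep the suspect parameter across a grid and flag adjacent values whose results jump sharply. The sketch below assumes a hypothetical run_backtest callable that returns a MAR ratio; it is not part of the system described in the text.

```python
# Sweep a parameter grid and flag "cliffs": adjacent values whose MAR ratios
# differ by more than `jump`. `run_backtest` is a placeholder for your own harness.
def find_cliffs(run_backtest, limits, jump=0.5):
    """Return (limit_a, limit_b) pairs where the MAR ratio jumps by more than `jump`."""
    results = [(lim, run_backtest(drawdown_limit=lim)) for lim in limits]
    cliffs = []
    for (lim_a, mar_a), (lim_b, mar_b) in zip(results, results[1:]):
        if abs(mar_b - mar_a) > jump:
            cliffs.append((lim_a, lim_b))
    return cliffs

# Usage sketch: thresholds from 30% to 45% in 1% steps.
# cliffs = find_cliffs(my_backtest, [x / 100 for x in range(30, 46)])
# A cliff between 37% and 38% would reproduce the abrupt drop described above.
```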

The Statistical Value of Sample Size

As noted earlier, people tend to put too much weight on a few instances of a particular phenomenon while ignoring an important fact: statistically speaking, you cannot conclude much from only a few examples. This problem is the main cause of overfitting. Adding rules that are triggered only rarely can lead to inadvertent overfitting, which in turn produces discrepancies between backtested results and actual trading results.

This kind of overfitting often happens inadvertently, because most people do not think about the problem from this angle. Seasonality is a good example. If you want to analyze a particular seasonal phenomenon using 10 years of data, you have at most 10 instances of it, because the test period spans only 10 years. A sample that small has little statistical value, so any test based on it says little about future performance.
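
A rough calculation shows why ten observations prove so little. The September returns below are made-up numbers used only to illustrate how wide the uncertainty around a ten-sample average really is.

```python
# With only ten seasonal observations, the confidence interval around the average
# "bad September" return easily spans zero, so the effect may well be noise.
import math
import statistics

september_returns = [-0.04, 0.02, -0.07, 0.01, -0.03, 0.05, -0.06, -0.01, 0.03, -0.05]  # hypothetical
mean = statistics.mean(september_returns)
stderr = statistics.stdev(september_returns) / math.sqrt(len(september_returns))
print(f"mean={mean:+.3f}, approx 95% CI = ({mean - 2 * stderr:+.3f}, {mean + 2 * stderr:+.3f})")
```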

Suppose we ignore this problem and ask the computer to help us find the perfect way to fit the data. We might notice several years of poor performance in September and decide to add a rule that cuts positions by a certain percentage every September. Since we have a computer, we might as well run simulations to find every period of seasonal weakness and cut positions during each of them.

I used this approach with the system described in this chapter. I ran 4,000 tests to see the effect of a seasonal adjustment: cut positions at the beginning of particular months, by a given percentage, for a given number of days, and then restore them to their original size. Over the 10-year test period I found two periods worth adjusting. If position size is reduced by 9% during the first two days of September and the first seven days of July each year, the system's performance improves. By how much?
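
To give a sense of how such a search works, here is a hedged sketch of a brute-force seasonal scan. The exact parameter grid the author used is not stated, so the grid below is an assumption chosen only to show how quickly the number of variants reaches the thousands; run_backtest is again a hypothetical placeholder.

```python
# Enumerate seasonal position-reduction variants and (optionally) pick the "best" one.
from itertools import product

months = range(1, 13)                          # start reducing at the beginning of each month
reductions = (0.03, 0.06, 0.09, 0.12, 0.15)    # percentage cut in position size
durations = range(1, 11)                       # days the reduction stays in force
restore_lags = range(1, 8)                     # days before returning to full size

grid = list(product(months, reductions, durations, restore_lags))
print(len(grid))  # 12 * 5 * 10 * 7 = 4200 variants, roughly the scale of the 4,000 tests

# best = max(grid, key=lambda p: run_backtest(month=p[0], cut=p[1], days=p[2], lag=p[3]))
# With only ten passes of each calendar month in a 10-year test, the "best" variant
# is almost guaranteed to reflect a couple of lucky drawdown periods, not a real edge.
```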

With this rule added, the return rose further, from 45.7% to 58.2%; the maximum drawdown increased slightly, from 39.2% to 39.4%; and the MAR ratio rose from 1.17 to 1.48. At first glance we again think, "What a great rule; the system has improved a lot."

Sadly, the rule works only because severe drawdowns happened to fall within those two seasonal periods in the past, not because there is anything magical about the periods themselves. The same drawdowns are unlikely to recur in the same periods in the future. This is the worst kind of overfitting, yet you would be surprised how many smart people fall into the trap.

If you did not know better, you might think this system is excellent and ready to trade. You might even brag about your brilliant system to friends and family and try to raise money from them. The problem is that the system's true return is only 41.4%, not 58.2%; its drawdown is 56.0%, not 39.4%; and its MAR ratio is 0.74, not 1.48. Its actual performance is bound to disappoint you, because you have been seduced by the pretty picture that curve fitting painted.

Next, I will talk about how to avoid the problems described in this chapter. I will show you how to gauge a system's true potential performance while minimizing trader effects, how to recognize random effects, how to optimize properly, and how to avoid overfitting to historical data.

