Turtle Trading Rules
Chapter 25 The Statistical Basis of History Testing
It is a good habit to get a feel for the effect of the parameters before deciding to adopt a system. I call this a parameter tuning test. Pick a few system parameters, change their values by a large amount, say 20% to 25%, and see what happens. Taking the optimization curves in Figure 11-2 and Figure 11-3 as an example, you can move the parameter values well away from the optimal point. For this Bollinger Band system, I wanted to see what would happen if I changed the optimal exit criteria of 350 days and -0.8 to 250 days and zero. The adjustment changed the RAR from 59% to 58% and the R-cubed from 3.67 to 2.18. The RAR barely moved, but the drop in R-cubed is quite significant. When you move from testing on historical data to actual trading in the market, you are likely to see a change this dramatic.
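To put the two changes quoted above on the same scale, the arithmetic can be sketched as follows. The helper names `relative_change` and `perturbed` are illustrative assumptions and not part of any trading package; the input figures are the ones given in the text.

```python
def relative_change(before: float, after: float) -> float:
    """Fractional change from `before` to `after`."""
    return (after - before) / before


def perturbed(value: float, fraction: float) -> tuple:
    """Return the value moved down and up by the given fraction (e.g. 0.25)."""
    return value * (1.0 - fraction), value * (1.0 + fraction)


# Figures quoted in the text for the Bollinger Band system.
rar_change = relative_change(59.0, 58.0)
r_cubed_change = relative_change(3.67, 2.18)

print(f"RAR change:     {rar_change:.1%}")      # -1.7%
print(f"R-cubed change: {r_cubed_change:.1%}")  # -40.6%

# A 25% move either side of a hypothetical 100-day parameter.
low, high = perturbed(100.0, 0.25)
```

Seen this way, RAR is almost unaffected by the parameter change, while R-cubed loses roughly two fifths of its value.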
Rolling optimization window
There is another method that lets you experience the transition from simulated testing to real trading directly: the rolling optimization window. Pick a day 8 to 10 years ago and optimize using all the data up to that day. Use your usual optimization methods and make the trade-offs you would normally make, as if you only had data up to that day. Once you have arrived at the "optimal" parameter values, test them on the two years of data that follow that date. How did the system perform over those two years?
Next, move the end point forward by two years (that is, to a day 6 to 8 years ago) and test again. What has changed compared with the previous test and the previous rolling window? How do the results differ from your original parameter values, the optimal values calculated using all available data? Keep moving forward, repeating the process until you reach the present.
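The stepping procedure can be sketched as a small generator of window boundaries. This is only an illustration under stated assumptions: it works in whole calendar years, uses a fixed 10-year optimization window with a 2-year follow-up and a 2-year step, and the name `rolling_windows` is hypothetical. The optimization itself is left out.

```python
from typing import Iterator, Tuple


def rolling_windows(first_year: int, last_year: int,
                    optimize_years: int = 10,
                    test_years: int = 2,
                    step_years: int = 2) -> Iterator[Tuple[range, range]]:
    """Yield (optimization years, follow-up test years), stepping forward in time."""
    start = first_year
    while start + optimize_years - 1 <= last_year:
        opt_end = start + optimize_years - 1
        # The follow-up period is cut short (possibly to nothing) at the end of the data.
        test_end = min(opt_end + test_years, last_year)
        yield range(start, opt_end + 1), range(opt_end + 1, test_end + 1)
        start += step_years


for opt, test in rolling_windows(1989, 2006):
    follow_up = f"{test[0]}-{test[-1]}" if len(test) else "no data yet"
    print(f"optimize on {opt[0]}-{opt[-1]}, then test on {follow_up}")
```

For data running from 1989 to 2006 this produces five windows, and the last one has no follow-up data to test against.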
I used this method to optimize the Bollinger Band system. In each test I varied the values of the three parameters over a wide range and then selected the optimal values based on the best position (generally close to the point where R-cubed reaches its maximum). I ran 5 separate 10-year tests.
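Period      Days   Entry   Exit   Test RAR   Actual RAR   Change   Test R-cubed   Actual R-cubed   Change
1989-1998   280    1.8     -0.8   55.0%      58.5%        6.3%     7.34           5.60             -23.7%
1991-2000   280    1.8     -0.5   58.5%      58.8%        0.6%     5.60           5.32             -5.0%
1993-2002   260    1.7     -0.7   58.5%      59.3%        1.4%     7.68           3.94             -5.0%
1995-2004   290    1.7     -0.6   63.9%      57.7%        -8.3%    5.53           3.90             -29.5%
1997-2006   290    1.7     -0.6   55.1%      N/A          N/A      3.90           N/A              N/A
As you can see, in every rolling period the actual performance differed considerably from the tested values. In addition, the optimal values differed from one rolling period to the next. This demonstrates the imprecision of test results and reflects the uncertainty involved in moving from simulated testing to actual trading.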
Monte Carlo test
Monte Carlo testing is a way of judging the robustness of a system. It can answer questions like: What would have happened if history had been slightly different? What might happen in the future? With Monte Carlo testing, you can take a series of events representing the actual historical data and use it to generate other, slightly different pictures.
Monte Carlo testing is a general term for methods that use random numbers to study a particular phenomenon. It is most useful for phenomena that are impossible or difficult to describe precisely with mathematics. The name comes from Monte Carlo, the district of Monaco famous for its casinos, which offer many games whose outcomes are determined by random events, such as roulette, craps, and blackjack. The scientists who developed the atomic bomb used this method in the Manhattan Project, and it got its name in that period.
These scientists had to determine the fission characteristics of uranium in order to know how much uranium was needed to make an atomic bomb. Because enriched uranium was extremely expensive, they could not afford to get it wrong. If the bomb failed to detonate because there was too little uranium, they would have wasted months of time, not to mention money. Likewise, if they overestimated the amount of uranium needed, they would also have wasted months of testing time. Unfortunately, the interactions of uranium atoms inside the bomb were too complex to model accurately with the methods of the time. Computers could do the job, but nothing like today's computers existed then.
A fission event releases a number of neutrons, and a certain proportion of these neutrons go on to trigger further fissions. To determine the necessary amount of fissionable uranium, the scientists had to know what this proportion was. The famous physicist Richard Feynman came up with a solution: have a team of mathematicians study what happens to a single neutron in an interaction, to determine whether it is absorbed by another nucleus or goes on to split another atom. Feynman realized that they could use random numbers to represent the various neutrons released when atoms split. After thousands of trials, they could see the distribution of uranium's fission characteristics and determine the necessary amount of uranium. Feynman knew that he could not predict the future, because the whole process was too complicated, but he could at least grasp the main aspects of the problem from a familiar angle and obtain an answer to the whole problem by simulating the behavior of neutrons with random numbers. In this way, he could understand the nature of uranium's fission characteristics without having to predict exactly the motion of every atom at every moment.
Alternative scenarios
The market is even more complicated than a nuclear fission reaction. It is made up of thousands of people, each of whom makes decisions based on their own experience and judgment, and these decisions are harder to predict than the motion of neutrons. Fortunately, just as Feynman used random numbers to analyze uranium, we can use random numbers to better understand the underlying characteristics of a trading system, even though we cannot foresee the future. What would history have looked like if the past had been slightly different? We can examine such alternative scenarios with a Monte Carlo test.
To generate alternative scenarios with Monte Carlo tests, we have two common methods available:
Trade adjustment: Randomly change the order and start dates of the trades in the actual simulation results, then recompute the equity curve using the reordered trades and their profits and losses.
Equity curve adjustment: Randomly select portions of the original equity curve and assemble them into a new equity curve.
Of the two methods, equity curve adjustment produces the more realistic alternative equity curves, because a Monte Carlo test that randomly reorders trades can easily underestimate the likelihood of drawdowns.
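The two methods can be sketched in a few lines. This is a minimal illustration rather than the procedure used by any particular testing package: `shuffle_trades` only reorders trade results and does not move start dates, `resample_daily_changes` draws single-day percentage changes with replacement, and both function names are assumptions.

```python
import random
from typing import List, Sequence


def shuffle_trades(trade_pnls: Sequence[float], start_equity: float,
                   rng: random.Random) -> List[float]:
    """Trade adjustment: replay the same trade results in a random order."""
    order = list(trade_pnls)
    rng.shuffle(order)
    curve = [start_equity]
    for pnl in order:
        curve.append(curve[-1] + pnl)
    return curve


def resample_daily_changes(equity: Sequence[float],
                           rng: random.Random) -> List[float]:
    """Equity curve adjustment: rebuild a curve from randomly drawn daily changes."""
    changes = [b / a - 1.0 for a, b in zip(equity, equity[1:])]
    curve = [equity[0]]
    for _ in changes:
        curve.append(curve[-1] * (1.0 + rng.choice(changes)))
    return curve
```

Note that reordering trades always ends at the same final equity as the original run; only the path in between changes, which is why drawdown statistics are the figures most affected.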
The biggest drawdowns always occur at the end of a major trend or after a period of rising equity, because at these times the correlation between markets is higher than usual. This is true of both futures and stock markets. When a big trend ends, collapses, and reverses, everything seems to turn against you at once, and even markets that normally seem uncorrelated begin to move together during those choppy days.
Because trade adjustment removes the link between trades and their dates, it also removes the adverse effect on the equity curve of several trades reversing at the same time. This means that drawdowns in the Monte Carlo test are smaller and less frequent than in reality. Take the gold and silver movements in the spring of 2006 as an example. If you are examining a trend-following system that trades both markets, trade adjustment means that your drawdowns in the two markets will occur at different times, which is equivalent to softening the drawdown in each market. In fact, the effect also extended to several other, rather unexpected markets, such as sugar. Like gold and silver, the sugar market suffered a severe decline in the period from mid-May to mid-June 2006. Trade adjustment is therefore inadvisable, because it underestimates the drawdowns that medium- and long-term systems experience in actual trading.
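A toy calculation shows why separating losses in time understates drawdown. The daily profit and loss figures below are invented for illustration and are not market data; `max_drawdown` and `combine` are hypothetical helper names.

```python
from typing import List, Sequence


def max_drawdown(equity: Sequence[float]) -> float:
    """Largest peak-to-trough decline, as a fraction of the peak."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst


def combine(start_equity: float, *daily_pnls: Sequence[float]) -> List[float]:
    """Build one equity curve from the daily P&L of several markets."""
    curve = [start_equity]
    for day in zip(*daily_pnls):
        curve.append(curve[-1] + sum(day))
    return curve


# Made-up daily P&L: both markets lose on the same day...
gold = [10, 10, -30, 10, 10, 10]
silver_same_day = [10, 10, -30, 10, 10, 10]
# ...versus the same silver results with the loss moved to another day.
silver_shifted = [10, 10, 10, 10, -30, 10]

simultaneous = max_drawdown(combine(100, gold, silver_same_day))
staggered = max_drawdown(combine(100, gold, silver_shifted))
print(f"losses on the same day:   {simultaneous:.1%}")  # 42.9%
print(f"losses on different days: {staggered:.1%}")     # 14.3%
```

The individual results are identical in both cases; only the timing differs, and the measured drawdown falls to a third.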
The stock market crash of 1987 is another example of this phenomenon. On the day that the Eurodollar gapped sharply higher, many markets that are not normally correlated also gapped sharply at the same time, costing me a lot. A trade-adjusted Monte Carlo test tends to downplay this very real event, because it spreads the trade dates out so that the adverse moves no longer fall on the same day.
Many software packages with Monte Carlo features can generate new curves by adjusting the equity curve, but they overlook another important problem. In my testing and practical experience, the duration and depth of the drawdowns at the end of major trends are far greater than random simulation would suggest. During these large drawdowns, the equity curves of trend-following systems exhibit serial correlation: the change in equity today is correlated with the change in equity the day before. Put more simply, bad days tend to cluster, one after another, which is not what you would expect from random events.
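Serial correlation can be measured with a lag-1 autocorrelation of the daily equity changes. The sketch below uses the standard sample estimate; the function name and the two toy series are assumptions chosen to make the contrast visible.

```python
from typing import Sequence


def lag1_autocorrelation(changes: Sequence[float]) -> float:
    """Sample correlation between each day's change and the previous day's."""
    n = len(changes)
    mean = sum(changes) / n
    variance = sum((x - mean) ** 2 for x in changes)
    if variance == 0:
        raise ValueError("all changes are identical; autocorrelation is undefined")
    covariance = sum((changes[i] - mean) * (changes[i + 1] - mean)
                     for i in range(n - 1))
    return covariance / variance


clustered = [1, 1, 1, 1, -1, -1, -1, -1]    # good days together, bad days together
alternating = [1, -1, 1, -1, 1, -1, 1, -1]  # same values, no clustering

print(lag1_autocorrelation(clustered))    # 0.625
print(lag1_autocorrelation(alternating))  # -0.875
```

Both series contain the same eight values, so a simulation that draws single days at random cannot tell them apart, even though only the first one has bad days arriving in a run.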
Still using the example of the gold, silver, and sugar markets in the spring of 2006: if you adjusted only the daily changes in equity, the string of sharp moves from mid-May to mid-June would disappear, because drawing values at random from a probability distribution, or even from the actual equity curve, is unlikely to produce such large changes concentrated in so short a period.
To deal with this problem, our company's simulation software can also sample blocks of several consecutive trading days at random when adjusting the equity curve, rather than only single days. In this way, the simulated equity curve retains the clusters of adverse changes and reflects actual trading conditions faithfully. In my tests, I used blocks of 20 days. I found that this method preserves the autocorrelation of the equity curve, making the simulation results more realistic and more predictive.
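Sampling whole runs of consecutive days can be sketched as follows. This is an assumption about how such sampling might be written, not a description of the software mentioned in the text; `block_resample` is a hypothetical name, and 20 days is used as the default block length because that is the figure quoted above.

```python
import random
from typing import List, Optional, Sequence


def block_resample(equity: Sequence[float], block_days: int = 20,
                   rng: Optional[random.Random] = None) -> List[float]:
    """Build a new equity curve from randomly chosen runs of consecutive days."""
    rng = rng or random.Random()
    changes = [b / a - 1.0 for a, b in zip(equity, equity[1:])]
    if not 1 <= block_days <= len(changes):
        raise ValueError("block_days must be between 1 and the number of daily changes")
    sampled: List[float] = []
    while len(sampled) < len(changes):
        # Copy one contiguous block so that runs of bad days stay together.
        start = rng.randrange(len(changes) - block_days + 1)
        sampled.extend(changes[start:start + block_days])
    curve = [equity[0]]
    for change in sampled[:len(changes)]:
        curve.append(curve[-1] * (1.0 + change))
    return curve
```

With `block_days=1` this reduces to drawing single days, which is exactly the case that loses the clustering of bad days.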
Alternative equity curves
What can we do with the alternative equity curves once we have simulated them with a Monte Carlo test? We can use them to build a distribution of results for a particular metric. If the future bears any resemblance to the alternative scenarios generated in the simulation, the range of likely future performance is reflected in this distribution. Figure 12-3 is such a distribution. We simulated 2,000 different equity curves, calculated the RAR of each, and plotted the distribution of the results.
The vertical line that crosses the curve in the upper part of the figure marks the RAR at the 90% confidence level: 90% of the RARs across all simulated curves are higher than this value. In this example, 90% of the 2,000 simulations yielded a RAR greater than 42%.
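Reading a confidence level off a set of simulated results amounts to sorting them and picking a cut-off. The sketch below assumes the metric (RAR here) has already been computed for each simulated curve; `value_at_confidence` is a hypothetical name, and "higher than" is treated as "at or above" when results tie with the cut-off.

```python
from typing import Sequence


def value_at_confidence(results: Sequence[float], confidence_pct: int = 90) -> float:
    """Return the value that at least confidence_pct percent of results reach or exceed."""
    if not 1 <= confidence_pct <= 100:
        raise ValueError("confidence_pct must be between 1 and 100")
    ordered = sorted(results)
    # Integer ceiling of len * pct / 100, avoiding floating-point rounding.
    needed_at_or_above = -(-len(ordered) * confidence_pct // 100)
    return ordered[len(ordered) - needed_at_or_above]
```

For 2,000 simulated curves, this picks the 201st lowest result, so 1,800 of the 2,000 values are at or above it.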
A graph like this is very useful because it shows you that the future is not certain and that there are many possibilities. Don't read too much into the details of such reports, though. Remember that these figures come from simulated equity curves, and the equity curves are based on historical data, so they cannot escape any of the potential pitfalls mentioned earlier. If the original test is poor, the Monte Carlo test will not rescue it, because it is derived from the original test and cannot be separated from the original data. If the optimization paradox causes RAR to be overestimated by 20%, then the alternative equity curves simulated by the Monte Carlo test will also overestimate RAR by 20%, because they use the same optimal parameter values.
Taken together, historical testing is at best a rough estimate of what the future holds. Robust measures are more predictive of future performance than more sensitive ones, but they are still far from precise. If someone claims that you are guaranteed a certain level of return, that person is either lying or an amateur; if the person is trying to sell you something, I strongly suspect he falls into the former category.
The next chapter will introduce some protective trading methods. Using these methods will make your trading more robust, which means you will be less prone to wild swings.