Rekenthaler Report

The Perils of Back-Testing

Are the findings accidental?

Hard Questions
The skeptics now have their skeptics.

The past 30 years have been the era of the doubters. Dozens of papers have documented cracks in the Efficient Markets Hypothesis. These range from single securities, such as 3Com's partial ownership of Palm receiving a higher market valuation in 2000 than 3Com did overall, to market segments, such as the higher performance enjoyed by value and momentum stocks, to entire marketplaces. Recent Nobel Laureate Robert Shiller, for example, argues that the U.S. stock market sometimes is irrationally priced.

Now the doubters are under attack. Market "anomalies" are easy enough to extract from past data if that is what one wishes to find, goes the current argument, but that doesn’t mean that they are investment opportunities. If they are accidents, then the accident is unlikely to repeat. If they deliver higher returns but at the cost of higher risk, then they are neither anomalies nor better investments. Finally, if they are indeed free lunches, presumably investors will gobble down those lunches, so that the opportunity disappears.

Similarly, market-prediction models are being prodded. It’s all well and good for Shiller to brandish a Cyclically Adjusted Price Earnings ratio, or CAPE ratio, diagram that appears to have predicted stock market behavior over the past several decades, writes the influential London Business School trio of Dimson, Marsh, and Staunton (DMS), but Shiller constructed that chart looking backward, using data that occurred after the fact. Applying the CAPE ratio using only information that was available at the time--which of course is what actual investors must do--would have yielded few useful predictions.

Recently, Jesse Livermore (about whom I know nothing, and that's probably not his real name either) of the blog Philosophical Economics (ditto, but he puts a lot of effort into that site, do check it out) put a different market-prediction model to the test. The model is investment-manager John Hussman’s chart of estimated future equity returns. The chart plots expected U.S. stock market results for the next 10 years. The intuition is straightforward: Take an estimated 6.3% of annual growth in company fundamentals, add the stock market’s current dividend yield, and then adjust for price by shaving the forecast if the market appears expensive and increasing it if the market seems cheap.
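The arithmetic behind such a forecast can be sketched in a few lines of Python. This is a rough illustration only: the 6.3% growth figure and the three ingredients come from the description above, but the function name, the parameter names, and the specific form of the valuation adjustment (the valuation ratio drifting back to an assumed "normal" level over the horizon) are assumptions made for illustration, not Hussman's published formula.

```python
def forecast_annual_return(dividend_yield, valuation_ratio, normal_ratio,
                           growth=0.063, horizon_years=10):
    """Hypothetical sketch of a growth + yield + valuation forecast.

    dividend_yield  -- current dividend yield as a decimal (0.02 means 2%)
    valuation_ratio -- today's reading of whatever valuation measure is used
    normal_ratio    -- ASSUMED long-run "normal" level of that same measure
    growth          -- assumed annual growth in fundamentals (6.3% in the article)
    """
    # ASSUMPTION: valuations revert to normal_ratio by the end of the horizon.
    # Below 1.0 when the market looks expensive, above 1.0 when it looks cheap.
    valuation_adjustment = (normal_ratio / valuation_ratio) ** (1.0 / horizon_years)

    # Growth scaled by the valuation adjustment, plus the dividend yield.
    return (1.0 + growth) * valuation_adjustment - 1.0 + dividend_yield
```

With a 2% dividend yield and a market priced exactly at its assumed normal level, `forecast_annual_return(0.02, 15, 15)` works out to about 8.3% a year: the 6.3% growth plus the 2% yield, with no valuation adjustment. Doubling the valuation ratio shaves the forecast; halving it raises the forecast.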

In spirit and mechanics, then, Hussman’s chart is similar to work done by other investment managers that attempt to forecast asset-class returns, such as GMO. (Indeed, GMO periodically mentions Hussman’s research in its articles.) Attractively, the Hussman model does not rely on any particular approach to market valuation; feed it whatever valuation data you like and it will spit out the forecast.

And apparently very accurately, too. Below are 65 years’ worth of results, plotting the forecasts from seven market-valuation methods, the average of those methods, and the stock market’s actual future returns. As you can see, the methods track each other neatly, so that it doesn’t much matter which is selected (might as well take the average). And they land very reliably atop the actual performance, aside from a brief stretch in the late 1980s when stocks outgained expectations.


 

That looks pretty convincing to me and, I suspect, you as well. Livermore is not so easily sold. To start, he points out, this chart was created with nominal returns, unadjusted for inflation, as opposed to the norm of real returns. Livermore tests how the chart looks with real returns, and, sure enough, he finds that the predictions weaken. They’re still fairly accurate but not as strong as before. "That doesn't make sense," he writes. "The ability of a valuation metric to predict future returns should not be improved by the addition of noise [i.e. changing inflation rates]."

He then wonders about the three aspects of the model--dividend returns, the constant fundamentals return of 6.3%, and the adjustment for valuation. How do those differ over time from what actually occurred? That is, when the model's prediction diverges from reality, which component was the guilty party?

Or at least, that is where I thought that line of inquiry was headed. The actual claim is more subtle. Errors caused by incorrectly estimating dividend returns are modest. However, errors from the other two factors are large--but large in a good way. That is, when one zigs, the other zags. If actual fundamentals growth (Growth, the green line) is greater than 6.3% annually, which would give stocks a higher actual return than expected, then for the most part the contribution from valuations (Valuation, the purple line) is disappointing. And vice versa.


  - source: Philosophical Economics
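The bookkeeping behind that observation can be pictured with a small Python sketch. It splits the gap between predicted and actual annualized return into the three components and sums them. The function name, the dictionary keys, and every number in the example are made up for illustration, and treating the components as simply additive is an approximation, not necessarily how Livermore computed his chart.

```python
COMPONENTS = ("dividend", "growth", "valuation")


def decompose_error(predicted, actual):
    """Return actual-minus-predicted for each component, plus the total.

    Both arguments are dicts mapping each component name to its
    annualized contribution to return, as a decimal.
    """
    errors = {name: actual[name] - predicted[name] for name in COMPONENTS}
    errors["total"] = sum(errors[name] for name in COMPONENTS)
    return errors


# HYPOTHETICAL numbers: growth beats the 6.3% assumption by a wide margin,
# while the valuation contribution disappoints by almost as much.
predicted = {"dividend": 0.030, "growth": 0.063, "valuation": -0.010}
actual = {"dividend": 0.032, "growth": 0.090, "valuation": -0.035}

errors = decompose_error(predicted, actual)
for name, value in errors.items():
    print(f"{name:>9}: {value:+.3f}")
```

In this made-up case the growth and valuation errors are each roughly 2.5 percentage points in size but opposite in sign, so the total forecast error is well under half a point. The forecast looks accurate even though two of its three inputs were badly off.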

Livermore argues that this behavior means that the model succeeds largely by accident. After all, with a 10-year time horizon and less than a century’s worth of data, it measures relatively few independent time periods. The number of major movements by the Growth and Valuation lines is even smaller. The purple Valuation line, for example, has essentially four regimes: negative from 1935-50; then positive for a decade; then negative until 1980; and then positive since that time.
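The point about independent periods is simple arithmetic, sketched here in Python. Rolling windows that start every year overlap heavily, so the number of truly separate observations is far smaller than the number of points on the chart. The function name and the 80-year sample length are assumptions chosen for illustration, not figures from Livermore's data.

```python
def count_windows(years_of_data, horizon_years):
    """Count rolling (overlapping) and non-overlapping return windows."""
    if horizon_years > years_of_data:
        return {"overlapping": 0, "independent": 0}
    return {
        # One window starting in each year that leaves room for a full horizon.
        "overlapping": years_of_data - horizon_years + 1,
        # Windows that share no years with one another.
        "independent": years_of_data // horizon_years,
    }


# HYPOTHETICAL sample length of 80 years, tested at two horizons.
for horizon in (10, 30):
    print(horizon, count_windows(80, horizon))
```

With a hypothetical 80 years of annual data, a 10-year horizon yields 71 overlapping windows but only 8 that do not overlap, and a 30-year horizon yields just 2.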

Hussman, as one would expect, has little use for that claim, tweeting in response, "Why would anyone think that a positive 'error' in 10-year earnings growth would not predictably relate to a lower than expected ending P/E?" In other words, if companies enjoy stronger-than-normal fundamentals growth, the market correctly is skeptical and discounts stocks appropriately, expecting something of a reversion to the mean. If companies, on the other hand, have disappointing growth over a 10-year period, the market, also correctly, figures that the slump will not continue and values stocks more highly in the expectation of a rebound.

Hussman's contention is certainly correct. There's no question that investors attempt to price securities over a full economic cycle and will therefore mark down earnings that appear to be unsustainable, while marking up earnings (or losses) that occur during recessions. But things are not as simple as his response might make them seem, as evidenced by Livermore’s retort.

"If Hussman gets to choose, in hindsight, which time horizon to test his model on, he will choose the time horizon that 'just so' happens to produce the most attractive fit, which will be the time horizon that 'just so' happens to best align the errors to offset each other. But what does that establish, other than the obvious fact that there are coincidences to be found in almost any data set, for those with the luxury of hindsight to meticulously search for and find them? 

"To argue that the observed offset on the 10 year horizon is endemic to the way economies and markets function, such that they can be relied upon to continue to occur out into the future, he would need to give a reason why. Why should we expect the errors to offset each other on a 10-year horizon--but then not expect the same for a 5-year or 20-year or 30-year horizon? What makes 10 years special--different from 5 years or 20 years or 30 years? He would need to give a compelling answer. I doubt he can give one."

Livermore then runs the model using a 30-year time horizon, rather than the initial 10-year time horizon, and publishes that result plotted against the market’s actual 30-year gains. As you can see, when the time horizon is thus extended, the model no longer carries any predictive value.


  - source: Philosophical Economics
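A horizon check of this kind can be sketched in Python as follows. The idea is to score the same forecasting rule at several horizons rather than only the one that happens to look best. Everything here is an assumption for illustration: `forecast_fn` stands in for whatever model is being tested, the correlation score is one of several possible measures of fit, and none of it reproduces Livermore's actual data or method.

```python
def annualized_return(index, start, horizon):
    """Annualized return of a total-return index from start to start + horizon."""
    return (index[start + horizon] / index[start]) ** (1.0 / horizon) - 1.0


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def fit_by_horizon(index, forecast_fn, horizons):
    """Correlate forecasts with realized returns, separately for each horizon.

    index       -- list of annual total-return index levels
    forecast_fn -- HYPOTHETICAL callable (start, horizon) -> forecast return
    """
    fits = {}
    for horizon in horizons:
        starts = range(len(index) - horizon)
        predicted = [forecast_fn(s, horizon) for s in starts]
        realized = [annualized_return(index, s, horizon) for s in starts]
        fits[horizon] = pearson(predicted, realized)
    return fits
```

A model whose fit is strong at one horizon and weak at its neighbors invites the question Livermore asks: what makes that one horizon special? Because rolling windows overlap, the correlations computed this way also overstate how much independent evidence lies behind them.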

I offer this discussion not to bury Hussman’s model--after all, it’s possible that he can indeed offer convincing explanations for his use of the 10-year horizon and nominal returns--but rather as an illustration of the difficulty of interpreting back-tested schemes. Many choices go into their creation. How these choices steer the results is rarely known. Infrequently, if ever, will the creator of such models publish the sort of work that Livermore just conducted.

There have never been more back-tested approaches hitting the marketplace, based on the now-accepted viewpoint that the markets are incompletely efficient, containing cracks. In particular, hordes of exchange-traded funds are being launched with promises of having discovered "anomalies" that they can exploit with their home-grown indexes, which boast strong performance records for periods before the indexes actually existed. They should be viewed with the same skepticism that Livermore brought to Hussman's model. Yes, the bow now rests neatly on the package, but we outsiders did not see how many attempts were made before the knot was tied. Caveat emptor.

John Rekenthaler has been researching the fund industry since 1988. He is now a columnist for Morningstar.com and a member of Morningstar's investment research department. John is quick to point out that while Morningstar typically agrees with the views of the Rekenthaler Report, his views are his own.
