The Jury Is Still Out on Factor Timing

There's some evidence that factor timing might yield a small benefit, but it's far from conclusive.

Alex Bryan Jan 2, 2019

A version of this article originally appeared in the October issue of ETFInvestor.

It is hard to successfully time any investment. Adjusting a portfolio based on expectations about the future can easily backfire because the future is hard to predict. Yet there is an emerging body of research that suggests it is possible to successfully time exposure to factors like value, momentum, small size, quality, and low volatility. While each of these factors has a good long-term record, they all go through cycles of underperformance. If timing really works, it could help mitigate this cyclicality, which is one of the biggest drawbacks to factor investing.

A healthy dose of skepticism is in order. Much of the research done thus far has come not from academia but from practitioners who work for asset managers with a vested interest in bringing new products to market. As with most financial research, data mining is also a risk because there are many variables researchers could have tested to find a predictive relationship that worked in sample but may not work out of sample. Even if there is a return benefit from factor-timing, implementing it reduces diversification relative to a static multifactor portfolio, and that cost may outweigh the benefit. And it's important to bear in mind that even if a timing signal works on average, it won't always get the calls right. There is no pain-free way to beat the market. That said, factor-timing warrants serious review.

Factor-Timing Signals

A recent paper from BlackRock suggests that there are four types of factor-timing signals that work: valuation, momentum, economic regime indicators, and dispersion.[1] The authors found that each of the four types of signals works well on its own and that they work even better together. BlackRock does not currently have any factor-timing exchange-traded funds on the market, though it did launch a factor-timing model in September 2016 based on these insights.

The few shops that do offer factor-timing ETFs rely on indicators that broadly fit into one of these four categories. For instance, Oppenheimer Russell 1000 Dynamic Multifactor ETF (OMFL) relies on a blend of traditional economic and market sentiment indicators to gauge the economic regime and time its factor exposures accordingly. Global X Adaptive U.S. Factor ETF (AUSF) uses a contrarian performance signal, which is a type of value signal because assets that underperform tend to become cheaper and may be poised to do better in the future. PIMCO RAFI Dynamic Multi-Factor U.S. Equity ETF (MFUS) relies on momentum and contrarian (value) performance signals to time its exposures.

Let's take a closer look at each type of timing signal.

Economic Regime

The idea that different factors tend to do better at different points in the business cycle is intuitive. BlackRock and Oppenheimer have both found that economic regime indicators were the strongest standalone predictors of factor performance in their back-tests. However, they use different metrics to define these periods and come to slightly different conclusions about when to overweight certain factors.

There are four stages in the business cycle: recovery, expansion, slowdown, and contraction. These are defined by whether the change in economic activity is positive or negative (Oppenheimer uses "above trend" or "below trend" instead) and whether it is accelerating or decelerating. Exhibit 1 summarizes the firms' findings about when each factor tends to outperform.
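To make the two-way classification concrete, here is a minimal Python sketch. The mapping of sign and direction to each stage (for example, negative or below-trend activity that is accelerating counts as a recovery) is an assumed, commonly used convention rather than either firm's published rule, and the function name and inputs are hypothetical.

```python
def classify_regime(growth: float, acceleration: float) -> str:
    """Map the level and direction of economic activity to a business-cycle stage.

    growth: change in activity (or its gap versus trend); only the sign is used.
    acceleration: change in growth; a positive value means activity is accelerating.
    The stage mapping below is an assumption, not a rule taken from either firm.
    """
    if growth >= 0:
        return "expansion" if acceleration >= 0 else "slowdown"
    return "recovery" if acceleration >= 0 else "contraction"


# Hypothetical readings, for illustration only.
for growth, acceleration in [(-0.5, 0.3), (1.2, 0.4), (0.8, -0.2), (-0.6, -0.1)]:
    print(growth, acceleration, classify_regime(growth, acceleration))
```

In practice the hard part is not this lookup but measuring growth and acceleration reliably in real time.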

Both firms found that the small-size and value factors tended to do the best during recoveries. Smaller stocks tend to be more cyclical than their larger counterparts, as their higher market betas attest. This is likely because fewer of them enjoy durable competitive advantages to insulate their profits from fluctuations in the business cycle.

The relationship between value and the business cycle is less intuitive--and in my view, more suspect. Broad value indexes, like the Russell 1000 Value Index, have a similar market beta to the broad market, which suggests they are not more cyclical. However, deeper-value portfolios tend to have higher betas. A possible explanation for value stocks' observed cyclicality, which Andrew Ang of BlackRock posited, is that they have higher fixed costs and less flexibility than growth stocks, so their cash flows may be more sensitive to the business cycle. These stocks may also be more beaten-down than most during tough times and poised to outperform as conditions start to improve.

During expansions, as clearly defined trends emerge, momentum has been the best-performing factor (though Oppenheimer also found that small size and value continue to do well during those periods). Unsurprisingly, low volatility and quality have tended to do the best during slowdowns.

The biggest difference in the findings between the two firms is about which factors have tended to do the best during contractions. Oppenheimer found that quality and low volatility continued to outperform as expected, as well as momentum, which benefits from clear trends in the market. In contrast, BlackRock found that all factors modestly outperform during contractions, but momentum less than the others, which was a bit surprising. However, it's possible that market trends are less clear in contraction periods based on BlackRock's definition because it looks only at traditional economic data, while Oppenheimer pairs economic data with market sentiment data to get a better read of the business cycle.

It is also a little surprising that BlackRock found that value and size tended to outperform in both contraction and recovery periods, as these two regimes represent opposite sides of business cycle trends.

While the relationship between the business cycle and factor performance is interesting, there are good reasons to be skeptical. In hindsight, it's easy to identify each stage of past business cycles, but it's hard to know where we stand in real time. And although the signals that BlackRock and Oppenheimer tested avoid look-ahead bias, they could have been cherry-picked to look good in sample. There are thousands of data points that could be reasonable indicators of the economic cycle. By chance alone, some of those data points will likely appear to be predictive of factor performance.

In their paper, "The Promises and Pitfalls of Factor Timing," a few researchers from State Street Global Advisors conducted an exercise to illustrate the dangers of data mining.[2] They looked at which signals were most predictive of factor performance from 1970 through 1990. They found that most of the signals with predictive power in sample were not predictive over the next 20 years out of sample.
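The same danger can be reproduced with purely random data. The sketch below is not the State Street exercise; it is a hypothetical simulation in which both the "factor returns" and the candidate "signals" are noise, so any in-sample predictive power is spurious by construction. All names and parameters are illustrative assumptions.

```python
import random
import statistics


def correlation(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def mine_signals(n_signals=1000, n_periods=40, top_k=10, seed=0):
    """Pick the top_k noise signals by in-sample fit and report both halves."""
    rng = random.Random(seed)
    # Hypothetical factor returns: pure noise, so no signal can truly predict them.
    returns = [rng.gauss(0, 1) for _ in range(n_periods)]
    half = n_periods // 2
    results = []
    for _ in range(n_signals):
        signal = [rng.gauss(0, 1) for _ in range(n_periods)]
        in_sample = correlation(signal[:half], returns[:half])
        out_of_sample = correlation(signal[half:], returns[half:])
        results.append((in_sample, out_of_sample))
    # Keep the signals that looked most predictive in the first half.
    best = sorted(results, key=lambda r: abs(r[0]), reverse=True)[:top_k]
    return best, results


best, _ = mine_signals()
print("avg |corr| in sample:    ", statistics.fmean(abs(i) for i, _ in best))
print("avg |corr| out of sample:", statistics.fmean(abs(o) for _, o in best))
```

The selected signals look strong in the first half only because they were chosen for it; nothing carries that fit into the second half.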

It's also important to note that economic cycles are slow-moving, so there aren't many full cycles to look at in the back-tests to infer a robust relationship between the stage of the cycle and factor performance. And every cycle is different.

The world is a different place than it used to be. Business has become increasingly global. So, it probably isn't appropriate to look only at U.S. economic data. Even if there was a strong relationship between the U.S. business cycle in the past and factor performance, it may not be as strong now. That doesn't mean that business cycle factor-timing will fail, just that more evidence is needed to build confidence in its efficacy.

Valuations

It is well-established that valuations can predict long-term asset returns (lower valuations are associated with higher future returns). This is true of asset classes, individual securities, and portfolios of securities. But that does not necessarily mean that valuations are an effective timing signal. For example, the U.S. stock market has been trading well above its historical average cyclically adjusted price/earnings ratio, or CAPE (based on data from 1880), since 2010. However, anyone who acted on that information--trimming or liquidating their U.S. stock allocation--probably regretted it, as the market delivered strong performance from January 2010 through August 2018 despite its seemingly high valuation.

If using valuations to time the market is hard, using them to time factors might be even harder, as Cliff Asness and his colleagues at AQR argue in their paper, "Contrarian Factor Timing is Deceptively Difficult."[3] That's because turnover in these portfolios reduces the predictive power of their valuations, as many of the current holdings may not stay in the portfolio long. Portfolio-level valuations are particularly unreliable for high-turnover strategies like momentum.

The relationship between valuations and factor performance is probably modest at best. In a previous article, I found a moderate positive relationship between the valuation spreads separating value from growth stocks and small- from large-cap stocks and the performance of the value and small-cap factors over the next five years, based on data from June 1987 through August 2016. However, much of this effect can be attributed to extreme valuation spreads and subsequent reversals in 1999 and 2000. Additionally, the results were mixed for quality, and I did not find a significant relationship between the valuation spreads for the low-volatility and momentum factors over the market and their future performance.

Other approaches to valuation-timing can lead to different results. BlackRock found that valuation-timing worked by tilting toward factors that were trading the cheapest relative to their own history over the past three years. So, if quality was trading at a significantly lower valuation than in the recent past and value was only a little cheaper than normal, this timing strategy would favor quality. But the fact that the efficacy of valuation-timing depends on how it is defined suggests that it is not particularly robust. Ideally, we'd like to see similar results with different versions of the same idea because that suggests the metric presented wasn't cherry-picked, providing greater confidence that it may work out of sample.
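One way to express "cheap relative to its own recent history" is a z-score of the latest valuation against the earlier observations in a trailing window. The sketch below assumes that definition; the paper's exact construction is not described here, and the factor names and valuation figures are made up for illustration.

```python
import statistics


def cheapness_score(valuation_history):
    """Z-score of the latest valuation versus the earlier values in the window.

    A more negative score means the factor is cheaper relative to its own history.
    Using a z-score here is an assumption, not a documented BlackRock method.
    """
    *past, current = valuation_history
    return (current - statistics.fmean(past)) / statistics.stdev(past)


def rank_by_cheapness(histories):
    """Order factor names from cheapest to richest versus their own history."""
    scores = {name: cheapness_score(h) for name, h in histories.items()}
    return sorted(scores, key=scores.get)


# Hypothetical valuation multiples over a trailing window (oldest first).
histories = {
    "quality": [20.0, 21.0, 22.0, 21.0, 20.0, 16.0],  # well below its recent norm
    "value": [12.0, 13.0, 12.0, 13.0, 12.0, 11.8],    # only slightly below
}
print(rank_by_cheapness(histories))  # ['quality', 'value']
```

Note that value still has the lower absolute multiple in this example; the ranking favors quality only because the comparison is made against each factor's own history, which is the distinction the paragraph above draws.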

In practice, the two factor-timing ETFs on the market that incorporate a valuation-timing component use contrarian performance signals rather than traditional value signals to time their factor tilts. The idea is that investors may overreact to a stretch of poor performance, giving up on styles after they have become cheap. Nobel Prize winner Richard Thaler and his colleague Werner De Bondt documented such long-term performance reversals among stocks in their 1985 paper, "Does the Stock Market Overreact?"[4]

A similar effect seems to hold at the portfolio level. To test this, I developed a strategy that targets three of the five factor indexes with the worst performance over the previous five years, weights them equally, and rebalances once a year using data from November 2003 through August 2018. It modestly outperformed the static equal-weighted portfolio of the five indexes by 92 basis points annually. So, there may be something to this approach. It is less subject to data-mining risk than economic data and has been more extensively tested out of sample.
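The selection rule at each annual rebalance can be sketched in a few lines of Python. This is only an illustration of the rule as described above; the factor labels and trailing returns are hypothetical and are not the indexes or data used in the test.

```python
def contrarian_weights(trailing_returns, n_select=3):
    """Equal-weight the n_select factor indexes with the lowest trailing return."""
    laggards = sorted(trailing_returns, key=trailing_returns.get)[:n_select]
    return {
        name: (1 / n_select if name in laggards else 0.0)
        for name in trailing_returns
    }


# Hypothetical trailing five-year cumulative returns at a rebalance date.
trailing_five_year = {
    "value": 0.20,
    "momentum": 0.65,
    "size": 0.30,
    "quality": 0.55,
    "low_volatility": 0.40,
}
print(contrarian_weights(trailing_five_year))
```

Here the portfolio would hold value, size, and low volatility at one third each until the next annual rebalance.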

Momentum

While relative performance tends to revert to the mean in the long term, it tends to persist in the short term. This short-term persistence, known as momentum, is found nearly everywhere in financial markets, just like value. Given its well-documented ability to predict short-term performance, I would expect it to be one of the more-promising candidates for use as a factor-timing signal. But while BlackRock found evidence that momentum-driven factor-timing works, I did not.

Using the same five factor indexes from the contrarian strategy, I tested a strategy that targets the three factor indexes with the best performance over the past 12 months from November 1999 through August 2018. That strategy lagged the static equal-weighted basket of factor indexes. The results were similar when I targeted only the top-performing index or the two best performers. This suggests that momentum isn't a robust factor-timing signal on its own.
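The comparison against the static basket, and the check across different numbers of winners, might look like the following for a single period. The returns are invented purely to show the mechanics and do not represent the results described above; a real test would repeat this at every rebalance date.

```python
def timed_minus_static(trailing_12m, next_period, top_n):
    """Return of a top_n momentum-timed basket minus the equal-weighted basket."""
    winners = sorted(trailing_12m, key=trailing_12m.get, reverse=True)[:top_n]
    timed = sum(next_period[name] for name in winners) / top_n
    static = sum(next_period.values()) / len(next_period)
    return timed - static


# Hypothetical trailing 12-month and subsequent-period returns (not real data).
trailing_12m = {"value": 0.05, "momentum": 0.18, "size": 0.02,
                "quality": 0.12, "low_volatility": 0.09}
next_period = {"value": 0.03, "momentum": -0.01, "size": 0.04,
               "quality": 0.02, "low_volatility": 0.02}

# Vary the number of winners held to see whether the conclusion depends on it.
for top_n in (1, 2, 3):
    print(top_n, round(timed_minus_static(trailing_12m, next_period, top_n), 4))
```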

Dispersion

The argument for using dispersion as a timing signal is that the return to each factor should be greater when there is greater separation among stocks in the starting universe on the metrics used to construct the factor portfolio. For example, if highly profitable stocks are more profitable than usual relative to stocks with weak profitability, the profitability/quality factor should do better.

BlackRock found that dispersion was the weakest standalone timing signal and that it worked better for value and quality than it did for other factors. The firm's study looked at tilting toward factors with the widest dispersion relative to their own history over the past three years, which had some predictive power.

To test the robustness of dispersion as a timing signal, I looked at how correlated each factor's returns were with the spread in the metric used to construct it, using annual data for the value, size, momentum, low volatility, and profitability portfolios from the French Data Library from 1964 through 2018. This is a less-sophisticated approach than the one BlackRock used, but if dispersion is a good predictor of performance, the results should directionally line up.

However, I found the expected relationship only for the value factor, and the correlation was only moderate, consistent with my findings for valuation-timing. There was virtually no correlation between the size, profitability, and past momentum spreads and the performance of those factors. And, contrary to expectations, the low-volatility factor did worse when past volatility spreads were wider. This suggests that dispersion is at best a weak factor-timing signal.
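A simple version of this kind of check correlates each year's spread with the factor's return over the following year. The sketch below assumes a one-year lag between the spread and the return, which is an illustrative choice rather than a documented detail of the test, and the series shown are hypothetical rather than French Data Library figures.

```python
import statistics


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def lagged_correlation(spreads, factor_returns, lag=1):
    """Correlate the spread in year t with the factor return in year t + lag."""
    if len(spreads) != len(factor_returns):
        raise ValueError("series must have the same length")
    if lag < 0 or lag >= len(spreads) - 1:
        raise ValueError("lag leaves too few observations")
    return pearson(spreads[: len(spreads) - lag], factor_returns[lag:])


# Hypothetical annual spread in the sorting metric and annual factor returns.
spreads = [1.2, 1.8, 1.5, 2.4, 2.0, 1.1]
factor_returns = [0.02, 0.01, 0.05, 0.03, 0.08, 0.06]
print(lagged_correlation(spreads, factor_returns, lag=1))
```

A value near zero would match the findings for size, profitability, and momentum, while a negative value would match the low-volatility result.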

The Jury Is Still Out

Given the complexity of factor-timing strategies, potential data-mining issues, limited research on this topic, and results that don't appear to be robust, more out-of-sample testing and live performance are necessary to build confidence in their efficacy. While it isn't prudent to write factor-timing off just yet, it's important not to lose sight of one of the main goals of multifactor investing: diversification. Tilting toward certain factors at different times reduces diversification and can increase risk if the timing model gets the call wrong. There is also a risk that timing models that rely on valuations or momentum might effectively double down on those factors, potentially causing the portfolio to behave as if it had greater exposure to value or momentum stocks.

If factor-timing has any place at all in a portfolio, it will be important to keep factor tilts modest to maintain diversification. It is also probably best to diversify across multiple signals that tend to work to limit pain when they don't.

[1] Hodges, P., Hogan, K., Pederson, J., & Ang, A. 2016. "Factor Timing with Cross-Sectional and Time-Series Predictors." BlackRock. https://www.blackrock.com/institutions/en-nl/literature/whitepaper/factor-timing-global-12-16.pdf

[2] Bender, J., Sun, X., Thomas, R., & Zdorovtsov, V. 2017. "The Promises and Pitfalls of Factor Timing." Univ. Pennsylvania, Wharton School of Business. https://jacobslevycenter.wharton.upenn.edu/wp-content/uploads/2017/08/The-Promises-and-Pitfalls-of-Factor-Timing-1.pdf

[3] Asness, C.S., Chandra, S., Ilmanen, A., & Israel, R. 2017. "Contrarian Factor Timing is Deceptively Difficult." SSRN. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2928945

[4] De Bondt, W.F.M., & Thaler, R. 1985. "Does the Stock Market Overreact?" J. Finance, Vol. 40, No. 3, p. 793. http://breesefine7110.tulane.edu/wp-content/uploads/sites/110/2015/10/Debondt-and-Thaler.pdf

Disclosure: Morningstar, Inc. licenses indexes to financial institutions as the tracking indexes for investable products, such as exchange-traded funds, sponsored by the financial institution. The license fee for such use is paid by the sponsoring financial institution based mainly on the total assets of the investable product. Neither Morningstar, Inc. nor its investment management division markets, sells, or makes any representations regarding the advisability of investing in any investable product that tracks a Morningstar index.
