ETF Specialist

Performance Evaluation Tool Kit

The appropriate tools may help investors distinguish luck from skill.

Most investors pay lip service to the idea that past performance is not indicative of future results. But it's tempting to use past performance to gauge manager skill and form expectations for the future. Unfortunately, raw performance doesn't say much about skill. An unskilled manager can outperform if he is lucky (or vice versa) or takes more risk to boost returns, which may not continue to pay off in the future. While raw performance does not tell the whole story, it is possible to uncover useful information from past performance with the appropriate tools.

Selecting the appropriate benchmark is one of the most crucial steps of performance evaluation. The benchmark should be transparent, investable, and representative of the fund's investment style. The most appropriate benchmark is not necessarily the one listed in the fund's prospectus. For example, Dodge & Cox Income's (DODIX) primary prospectus benchmark, the Barclays U.S. Aggregate Bond Index, skews more heavily toward government bonds and, as a result, is not truly representative of the fund's investment style. In fact, the return pattern of the Barclays U.S. Credit Index more closely fits the fund's over the past decade. Because this fund is taking more credit risk than the Barclays U.S. Aggregate Bond Index, it should earn higher returns as compensation, but that does not mean the manager is doing a good job. It is easy to take more credit risk at lower cost through a corporate-bond index fund. While it may not be possible to find a perfect benchmark, it is important to identify and control for differences in risk between the benchmark and the fund.
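One rough way to check which benchmark's return pattern fits a fund most closely is to compare correlations. The sketch below is only an illustration: the function names, the candidate labels, and the return figures are all hypothetical, and correlation alone does not capture differences in the level of risk, which still need to be controlled for separately.

```python
from math import sqrt


def correlation(a, b):
    """Pearson correlation between two equal-length return series."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def best_fit_benchmark(fund_returns, candidates):
    """Name of the candidate whose returns correlate most closely with the fund's."""
    return max(candidates, key=lambda name: correlation(fund_returns, candidates[name]))


# Hypothetical periodic returns (as decimals), made up for illustration only.
fund = [0.010, -0.004, 0.006, -0.012, 0.008]
candidates = {
    "government_heavy": [0.006, 0.002, 0.003, -0.002, 0.004],
    "credit_heavy": [0.011, -0.005, 0.007, -0.014, 0.009],
}
print(best_fit_benchmark(fund, candidates))
```

In practice an investor would run this on several years of monthly returns rather than a handful of observations.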

That still leaves the challenge of disentangling luck from skill. Because a broad index is simply the weighted average of all active investors' bets, it might be reasonable to expect close to half of all managers to outperform a representative benchmark by chance in any given year. Over longer periods, fewer managers should outperform by luck alone. But even a long record of outperformance is not sufficient evidence of skill. A fund's 10-year performance record could look great due to a handful of good--or lucky--calls in a concentrated period. For example, Parnassus Core Equity Investor's (PRBLX) 10.4% return over the trailing 10 years through August 2014 looks pretty good compared with the S&P 500's 8.4% return. But most of that superior performance was concentrated in 2008, owing to the fund's limited exposure to the financial-services industry, better stock selection in that sector, and meaningful cash balance. That positioning may have been the result of shrewd management, or merely luck.
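To put rough numbers on the role of chance, the sketch below treats each year as an independent coin flip in which an unskilled manager has a 50% chance of beating the benchmark. That is a simplifying assumption (it ignores fees and any persistence in returns), and the function name is hypothetical.

```python
from math import comb


def prob_outperform_at_least(k, n, p=0.5):
    """Chance an unskilled manager beats the benchmark in at least k of n years.

    Assumes each year is an independent trial with success probability p.
    """
    return sum(comb(n, j) * p ** j * (1 - p) ** (n - j) for j in range(k, n + 1))


if __name__ == "__main__":
    # Chance of winning every single year shrinks quickly as the record lengthens.
    for years in (1, 3, 5, 10):
        print(years, prob_outperform_at_least(years, years))
    # Winning most years by luck is far more common than winning all of them.
    print(prob_outperform_at_least(7, 10))
```

Under these assumptions, luck alone rarely produces a perfect 10-year record, but it produces a merely good one fairly often, which is why a long record is not sufficient evidence of skill.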

The more consistent a fund's outperformance is, the less likely that it is due to luck. A relative wealth chart offers an effective way to gauge consistency. It shows the timing and magnitude of a fund's outperformance relative to the benchmark. An investor can create such a chart by dividing the growth of $1 invested in a fund by the growth of $1 invested in its benchmark and plotting that ratio against time. When the line slopes upward, the fund is outperforming; when it slopes downward, the fund is underperforming. For consistent outperformers, the line should trend upward over time.
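The calculation behind a relative wealth line can be sketched in a few lines. The function name and the return figures below are hypothetical and serve only to show the arithmetic; the resulting list is what an investor would plot against time.

```python
def relative_wealth(fund_returns, benchmark_returns):
    """Growth of $1 in the fund divided by growth of $1 in the benchmark, per period."""
    if len(fund_returns) != len(benchmark_returns):
        raise ValueError("return series must have the same length")
    fund_growth = 1.0
    benchmark_growth = 1.0
    ratios = []
    for fund_r, bench_r in zip(fund_returns, benchmark_returns):
        fund_growth *= 1.0 + fund_r          # compound the fund's return
        benchmark_growth *= 1.0 + bench_r    # compound the benchmark's return
        ratios.append(fund_growth / benchmark_growth)
    return ratios


# Hypothetical monthly returns (as decimals), made up for illustration only.
fund = [0.02, -0.01, 0.03, -0.05, 0.01]
benchmark = [0.02, -0.02, 0.03, -0.08, 0.02]
line = relative_wealth(fund, benchmark)
print(line)
```

The line rises in periods when the fund's return exceeds the benchmark's and falls when it lags, regardless of whether either return is positive.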

The chart below clearly illustrates that nearly all of Parnassus Core Equity's outperformance relative to the S&P 500 over the past decade occurred from October 2007 through February 2009, the period leading up to and including the financial crisis.

Factor Analysis
Consistent performance is a good sign, but there are plenty of good managers who may lag for years because it can take a long time for their investment theses to play out. It may be more useful to study the underlying drivers of a fund's performance to more accurately assess skill and understand how a fund will likely behave in the future. A fund's returns can be broken into its component parts with a powerful tool known as factor analysis.

Let's start with the basics. A small portion of every fund's returns is simply compensation for the time value of money, which investors can approximate with the risk-free rate or return on short-term Treasuries. We subtract this risk-free rate from the fund's returns to account for this source of return. The fund's sensitivity to the market risk premium (the return on a broad index portfolio less the risk-free rate) can usually explain most of the rest. For example, there is a fairly strong relationship between the market risk premium and the returns of Parnassus Core Equity over Treasuries, as the chart below illustrates.

Drawing a best-fit line through the data allows us to estimate the fund's sensitivity to the market risk premium (the slope of the line), and whether it earned any returns above what we would expect for the level of market risk it took (the intercept). This is known as a single factor regression. The table below shows an abbreviated regression output from Microsoft Excel.

In this case, the slope of the line (also called beta) was 0.83. This means that, on average, the fund's return over Treasuries rose by 0.83% for each 1% increase in the market risk premium and fell by 0.83% for each 1% decline. In other words, the fund took less market risk than the broad market. In this light, it is impressive that the fund was able to keep up with the market during the better times. This simple model attributes this feat to skill, which we infer from the positive intercept. In financial speak, this is called alpha. The p-value next to each coefficient tells us how likely it is that a value at least that far from zero would arise by chance if the true value were zero. For example, the 0.05 p-value next to the intercept indicates that a manager with no skill would have only a 5% chance of producing an alpha this far from zero through luck alone. Generally, any coefficient with a p-value of 0.05 or less is worth paying attention to. The Adjusted R-squared indicates how well the model fits the data. This simple model explained 91% of the variance of the fund's returns.
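The article's figures come from Microsoft Excel, but the same single factor regression can be sketched by hand. The version below is a minimal illustration, not the author's method: the function name and the return data are hypothetical, and the p-values use a normal approximation to the t distribution, which is an assumption that is only reasonable for long return histories.

```python
from math import sqrt
from statistics import NormalDist


def single_factor_regression(fund_excess, market_excess):
    """Least-squares fit of fund excess returns on market excess returns."""
    n = len(fund_excess)
    if n != len(market_excess) or n < 3:
        raise ValueError("need two equal-length series with at least 3 observations")
    x_bar = sum(market_excess) / n
    y_bar = sum(fund_excess) / n
    sxx = sum((x - x_bar) ** 2 for x in market_excess)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(market_excess, fund_excess))
    beta = sxy / sxx                  # slope: sensitivity to the market risk premium
    alpha = y_bar - beta * x_bar      # intercept: return not explained by market risk
    sse = sum((y - alpha - beta * x) ** 2 for x, y in zip(market_excess, fund_excess))
    sst = sum((y - y_bar) ** 2 for y in fund_excess)
    r_squared = 1.0 - sse / sst
    adj_r_squared = 1.0 - (1.0 - r_squared) * (n - 1) / (n - 2)
    residual_var = sse / (n - 2)
    se_beta = sqrt(residual_var / sxx)
    se_alpha = sqrt(residual_var * (1.0 / n + x_bar ** 2 / sxx))

    def p_value(coef, std_err):
        # Two-sided p-value; normal approximation to the t distribution (assumption).
        return 2.0 * (1.0 - NormalDist().cdf(abs(coef / std_err)))

    return {
        "alpha": alpha, "beta": beta,
        "p_alpha": p_value(alpha, se_alpha), "p_beta": p_value(beta, se_beta),
        "adj_r_squared": adj_r_squared,
    }


# Hypothetical monthly excess returns (fund and market, each less the risk-free rate).
market = [-0.02, -0.01, 0.00, 0.01, 0.02]
fund = [-0.013, -0.008, 0.004, 0.008, 0.019]
print(single_factor_regression(fund, market))
```

With only five made-up observations the output means nothing on its own; a real analysis would use many years of monthly data, as the article does.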

While the market risk premium is usually the most important factor, a few investment styles can also help explain asset returns. For instance, value managers may outperform the market when value stocks are in favor, even if they are not skilled. Investors can get exposure to value stocks more cheaply through an index fund, like Vanguard Value ETF (VTV). It does not make sense to give a manager credit for the returns to investment styles that investors can replicate through mechanical rules. Historically, value, small-cap, momentum, and quality (companies with high and stable profits) stocks have outpaced their counterparts with the opposite characteristics over the long run. To study these effects, researchers have constructed portfolios that go long stocks with each of these characteristics and short stocks with the opposing characteristics. We can add these as additional factors to the simple model described above. This allows us to create a custom best-fit benchmark that controls for a fund's style tilts. The table below illustrates the results of this multifactor regression for Parnassus Core Equity.

Controlling for its exposure to quality, small-cap, value, and momentum stocks, Parnassus Core Equity has a market beta of 0.91, which indicates that it is still taking less market risk than average. The fund did not have appreciable exposure to the small-cap (SMB), value (HML), or momentum (UMD) factors, as the p-values next to these coefficients are greater than 0.05. However, it did have meaningful exposure to quality stocks (QMJ). Holding everything else constant, the fund tended to increase in value by 0.20% when high-quality stocks outperformed their lower-quality counterparts by 1%. But investors who simply want exposure to quality stocks could get it more cheaply and consistently through iShares MSCI USA Quality Factor (QUAL).
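A multifactor regression extends the same least-squares idea to several factors at once. The sketch below solves the normal equations with plain Gaussian elimination and returns only the coefficients (no p-values); the function names and all return figures are hypothetical, and only two factors are shown to keep the example short. The SMB, HML, and UMD series could be added to the dictionary in the same way.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-14:
            raise ValueError("factors are collinear; loadings cannot be estimated")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            scale = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= scale * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def multifactor_regression(fund_excess, factors):
    """Least-squares alpha and factor loadings.

    factors maps a factor name to its return series (same length as fund_excess).
    """
    names = list(factors)
    n = len(fund_excess)
    if any(len(factors[name]) != n for name in names) or n <= len(names) + 1:
        raise ValueError("need equal-length series and more observations than coefficients")
    rows = [[1.0] + [factors[name][t] for name in names] for t in range(n)]
    k = len(names) + 1
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, fund_excess)) for i in range(k)]
    return dict(zip(["alpha"] + names, solve_linear_system(xtx, xty)))


# Hypothetical monthly figures (as decimals), made up for illustration only.
market = [0.010, -0.020, 0.030, 0.000, 0.020, -0.010]
quality = [0.005, 0.010, -0.010, 0.000, 0.002, 0.004]
fund = [0.011, -0.015, 0.026, 0.001, 0.020, -0.007]
print(multifactor_regression(fund, {"MKT": market, "QMJ": quality}))
```

The "alpha" entry is the intercept left over after the style tilts are accounted for, which is the sense in which the fitted factors act as a custom benchmark.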

This quality tilt explained away some of the manager's outperformance. The intercept--the monthly return attributable to skill--has declined from 0.21% in the single factor model to 0.12%, and it is no longer statistically significant. A manager with no skill would have roughly a 24% chance of showing an intercept at least this far from zero through luck alone. It is of course still economically significant, representing outperformance of 1.4% annualized.
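The step from a monthly intercept to an annualized figure is simple compounding. The sketch below shows one common convention, compounding the monthly figure over 12 months; the function name is hypothetical, and because the published monthly figures are rounded, the results only approximately match the annualized number quoted above.

```python
def annualize_monthly(monthly_return):
    """Compound a monthly return (as a decimal) over 12 months."""
    return (1.0 + monthly_return) ** 12 - 1.0


single_factor_alpha = annualize_monthly(0.0021)  # 0.21% per month
multifactor_alpha = annualize_monthly(0.0012)    # 0.12% per month
print(f"{single_factor_alpha:.2%} {multifactor_alpha:.2%}")
```

Simply multiplying the monthly figure by 12 gives nearly the same answer at these magnitudes.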

Evidence of skill and style tilts can change over time. A manager who was able to skillfully avert the worst of the financial crisis may or may not be able to do the same during the next downturn. Controlling for value, size, momentum, and quality, very few managers seem to consistently outperform. Sometimes it might make sense to get exposure to these styles through an index fund, which is likely to be cheaper. But it could still be worth hiring managers who don't show statistically significant evidence of skill, if they offer desirable style tilts that are hard to replicate.



Disclosure: Morningstar, Inc.'s Investment Management division licenses indexes to financial institutions as the tracking indexes for investable products, such as exchange-traded funds, sponsored by the financial institution. The license fee for such use is paid by the sponsoring financial institution based mainly on the total assets of the investable product. Please click here for a list of investable products that track or have tracked a Morningstar index. Neither Morningstar, Inc. nor its investment management division markets, sells, or makes any representations regarding the advisability of investing in any investable product that tracks a Morningstar index.
