Fund Spy

How Fund Managers Are Using Big Data

With big data come big questions.

“Big data” has real potential, but the challenge of using it well shouldn’t be underestimated. That’s the conclusion drawn from Morningstar’s deep dive into the topic. The rest of this Fund Spy explains what big data is, who’s leading the charge in its collection, and what investors should expect from it.

Big Data Basics
“Big data” is a catch-all term that covers a wide range of data sets that are too large for traditional data-processing software to handle. These include everything from data that's not in a numerical format--like text, audio, or images--to newer data sources like the location-tracking information captured by some of the world's favorite mobile phones and applications.

As companies look to learn more about their own businesses and industries, the availability of this type of data has exploded. Meanwhile, technology has evolved to make collecting and analyzing it possible. The size and variety of this information make it impractical to evaluate the data manually or with traditional processing software. Companies using these newer inputs must incorporate more-advanced methods like machine learning, a branch of artificial intelligence that looks for patterns and trends in data. It would take an analyst hundreds of hours to do what a computer can do in minutes.

The Adoption, Evolution, and Promise of Big Data
The appeal of exploiting new data sources has drawn some fund companies to incorporate elements of big data into their own processes. However, two firms in the ’40 Act space stand out for how far along they are in their adoption. Both BlackRock's Systematic Active Equity team and Goldman Sachs' Quantitative Investment Strategy team have been using alternative data to pick stocks for the better part of a decade. Both teams started out in the hedge fund units at their respective firms, which helps explain why they were early adopters. Quantitative managers seeking high fees from hedge fund investors need to stay near the cutting edge to compete. Those higher fees also give these businesses a bigger budget to spend on staff and technology.

Of course, how these teams use unstructured data and how prevalent it is in their models have evolved significantly over time. For example, today about one-third of the signals used to forecast stock prices in their models are based on nontraditional data; 10 years ago, the models comprised only a handful of signals, none of which drew on alternative data sources.

Other heavyweight firms like Vanguard and T. Rowe Price have also begun exploring ways to use some of these new techniques, but these are not currently a significant component of their processes. Traditional fundamental shops have also explored nontraditional data sources to supplement their bottom-up, fundamental research efforts.

These alternative data sources offer information that can't be found in traditional company filings, conversations with management, technical analysis, or popular investment factors. The data is generally updated more frequently than traditional data. The idea is that with more useful and timely information, the models that quantitative managers use to select stocks can make better forecasts about future prices, and fundamental managers can supplement existing research and build out a more robust mosaic of information as they search for attractive investments. While more information can add more value, that is not a given.

As with any sounds-too-good-to-be-true financial innovation, people are right to be skeptical. Using alternative data is not necessarily going to lead to a sudden reversal in active managers' fortunes and turn them into alpha-generating machines. But it would also be naïve to dismiss big data's potential to incrementally improve a manager's process simply because it's new.

As with any input into an investment process, there are complexities. For one, a large quantity of data is informative, but it also introduces additional noise. While the increased availability of data can be a plus, quantity does not ensure quality, so it is key that managers can sift through the data and capture insights without overcomplicating their decision-making process. Further, since the emergence of big data and nontraditional data sources is a relatively new phenomenon, it remains to be seen if some of these inputs have staying power and can add value over a full market cycle.

Having access to a large and unique data set isn't like finding a bag full of magic market-beating beans. The ability to access, test, validate, and incorporate this unstructured data into an investment process is critical to making the data useful. As a result, firms now have to attract not only strong investment talent but also quantitative researchers with programming experience and data science backgrounds to aid in this effort.

Big data and data science are additional tools that firms are increasingly adopting as another means to get an informational edge. However, as with any input, the data is not infallible. Crowding and signal decay are ever-present risks to this style of investing, as is data mining: identifying patterns and relationships in the data that are not sustainable or informative. Further, relationships can change or even disappear over time as more investors pile into the same factors or exposures.
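
To make the data-mining risk concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a description of how any firm mentioned here builds its models: both the "signals" and the "returns" are pure random noise, and the function name, the number of candidate signals, and the sample length are all hypothetical choices. The sketch screens many meaningless signals, keeps the one that best fits past returns, and then checks that same signal against a fresh period.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def best_spurious_signal(n_signals=500, n_periods=60, seed=0):
    """Screen random signals against random returns (hypothetical setup).

    Returns (in_sample_corr, out_of_sample_corr) for the signal with the
    largest absolute in-sample correlation. Everything here is noise by
    construction, so any apparent relationship is spurious.
    """
    rng = random.Random(seed)
    returns_in = [rng.gauss(0, 1) for _ in range(n_periods)]
    returns_out = [rng.gauss(0, 1) for _ in range(n_periods)]

    best = None
    for _ in range(n_signals):
        signal_in = [rng.gauss(0, 1) for _ in range(n_periods)]
        signal_out = [rng.gauss(0, 1) for _ in range(n_periods)]
        corr_in = pearson(signal_in, returns_in)
        # Keep whichever signal looks most predictive on past data.
        if best is None or abs(corr_in) > abs(best[0]):
            best = (corr_in, pearson(signal_out, returns_out))
    return best


in_corr, out_corr = best_spurious_signal()
print(f"Best in-sample correlation:  {in_corr:+.2f}")
print(f"Same signal, out of sample: {out_corr:+.2f}")
```

Because the winning signal was chosen only for its fit to past data, its in-sample correlation overstates what it can be expected to deliver later. That is why the testing and validation steps described earlier matter as much as access to the data itself.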

While the use of new data and tools is exciting, it is also untested over a full market cycle, and while we are optimistic about some of the benefits, we also maintain a healthy level of skepticism. Specifically, what matters is not just the availability of the data but what is done with it. As with managers who use traditional inputs, some will be more successful than others.

But the potential for new alpha sources is real. The rationale is that if information is widely available and easily obtained, it likely will not serve as a sustainable source of alpha. While the firms exploiting alternative data and delivering strategies in '40 Act vehicles do not rely on proprietary data sources, they do have unique ways of combining, parsing, and analyzing the data, which leads to differentiation in performance patterns and exposures. The early adoption of the techniques discussed here is only the beginning of the widespread use of nontraditional data and data science in the mutual fund space. Further, the systematic processes are scalable, and thus large investment firms such as those discussed here can offer actively managed, systematic strategies at low costs relative to traditional actively managed peers.