IMR asks Ben Siscovick, a General Partner of New York City-based IA Ventures (www.iaventures.com), about the firm’s investments in early-stage companies with the potential to create competitive advantage from “big data.”
Since 2010, IA Ventures has invested in 25 companies that develop tools and technologies for managing and extracting value from massive data sets. What is it about these companies that makes you confident they can turn passive data into active, money-making assets? What’s the common denominator?
There are some common themes even though each one is definitely doing something different. At the lowest part of the stack are what I call infrastructure groups. They are creating technology to manage larger and more complex data sets, act on and analyze those data sets with speed and precision, and then facilitate distribution of the data across large clusters of computers. Another bunch of companies are doing things more at the application level. They are leveraging insight derived from the data and creating proprietary data assets from it.
Our newest investment is a company called MemSQL. MemSQL is an “in-memory” SQL database that uses a traditional relational model for structuring and storing the data and traditional SQL for querying that data. By optimizing for in-memory operation, the company is making tremendous advances in applying SQL in ways that allow it to be extremely, extremely fast.
There are a bunch of different industries in which the decision windows are shrinking so that you have to be able to intelligently sift through data-based decisions extremely quickly. If you’re a high-frequency trader, for example, or you’re an online ad server and you need to choose the right ad to serve to the right person at the right time with the right content, you have to be able to sift through a tremendous amount of data extremely quickly and in a structured way. MemSQL would be an example of an infrastructure-type company.
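As a rough illustration of the in-memory SQL idea described above, the sketch below uses Python’s built-in sqlite3 module as a stand-in. It is not MemSQL, and it says nothing about how MemSQL is built. The `ads` table, its columns, the sample rows and the `pick_ad` helper are all hypothetical, chosen only to mirror the ad-serving example.

```python
import sqlite3


def pick_ad(conn, segment):
    """Return the id of the highest-bidding ad for a segment, or None."""
    row = conn.execute(
        "SELECT ad_id FROM ads WHERE segment = ? ORDER BY bid DESC LIMIT 1",
        (segment,),
    ).fetchone()
    return row[0] if row else None


# ":memory:" keeps the whole database in RAM; nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Hypothetical schema: a plain relational table queried with ordinary SQL.
conn.execute("CREATE TABLE ads (ad_id TEXT, segment TEXT, bid REAL)")
conn.executemany(
    "INSERT INTO ads VALUES (?, ?, ?)",
    [("a1", "sports", 0.40), ("a2", "sports", 0.75), ("a3", "music", 0.55)],
)

best = pick_ad(conn, "sports")
print(best)
```

The point of the sketch is only that the data model and the query language stay conventional while the storage lives in memory.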
When you say a tremendous amount of data, are you talking about terabytes, petabytes?
It depends. We have companies that are focused on massive-scale data, petabytes, but MemSQL is not. This company is dealing with more medium-size data but in a way that is extraordinarily fast while simultaneously analyzing it in real time. Generally when we talk about large-scale data we are talking about a data set too large to fit on a single hard drive.
But when we say big data we don’t only mean massive scale. We also mean the complexity, whether it’s structured or unstructured, and the real-time nature of the data. It’s some combination of these three: scale, unstructured, real-time.
What’s an example from your applications group of leveraging insight?
Place IQ is really fascinating. They aggregate geo data from as many sources as they possibly can. They aggregate it, they analyze it and they extrapolate context of location from that analysis.
Place IQ describes itself as “turning location into context—extracting intelligence and meaning from the ever-increasing mass of location-specific data being created.”
They’ll deal with anything that has geo data, whether it’s marketing data, weather data, crime data, or events: any data they can get their hands on that has a geo tag so they know its location. They’re able to look in a very fine-tuned way at what type of people can be found where and when. For instance, they’re able to look at an individual city block and tell you that such-and-such type of people might be found there between 9 and 5 Monday through Friday, while on nights and weekends you might find a different type.
That data is used for things like store placement—where should I put my next store, where should I put my billboard advertisement? Of course, the Holy Grail is mobile advertising—being able to target the right ads to the right person at the right time in the right context.
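The block-by-block, time-of-day profiling described above can be pictured with a small sketch. This is an assumption-laden toy, not Place IQ’s method: the record layout `(block_id, timestamp, audience)`, the two dayparts, the audience labels and the sample rows are all invented for illustration.

```python
from collections import Counter, defaultdict
from datetime import datetime


def daypart(ts):
    """Label a timestamp as weekday 9-to-5 or nights/weekends."""
    is_weekday = ts.weekday() < 5  # Monday is 0, Sunday is 6
    if is_weekday and 9 <= ts.hour < 17:
        return "weekday_9_to_5"
    return "nights_weekends"


def profile_blocks(records):
    """Count audience labels per (block, daypart) pair."""
    profile = defaultdict(Counter)
    for block_id, ts, audience in records:
        profile[(block_id, daypart(ts))][audience] += 1
    return profile


# Hypothetical geo-tagged observations, already mapped to a city block.
records = [
    ("block_17", datetime(2012, 6, 4, 10, 30), "office_worker"),  # Monday
    ("block_17", datetime(2012, 6, 4, 14, 0), "office_worker"),   # Monday
    ("block_17", datetime(2012, 6, 9, 22, 0), "nightlife"),       # Saturday
]

profile = profile_blocks(records)
for key, counts in sorted(profile.items()):
    print(key, counts.most_common(1))
```

A real system would combine many sources and far finer time windows; the sketch only shows the grouping step.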
Jumping a bit, is the key to gaining competitive advantage from big data what you call “productizing” the data? You’ve written that entrepreneurs need to be laser-focused on productizing data.
That is one key. The product of infrastructure companies, for example, is not the data but the thing that stores the data, that does something with the data, that manages the data. That’s a little different. When I talk about productizing data, I’m saying that data alone is very, very rarely a product and the key is what you do with the data that makes it important. There are a lot of companies in our portfolio that have data assets, and the key question is what product they are creating as the outgrowth of those data assets.
Tell us about another in this group.
We have a company in our portfolio called Next Big Sound. It serves the music industry. They started by aggregating social data on musicians and then they layered in event data like, for example, an artist going on the David Letterman show or putting on a concert. Then they started showing this to big music labels who said it was super interesting and important data but what would be really cool and impactful would be to take that data and overlay it against their transactional (i.e., consumption) data, their sales data, their iTunes purchase data, and Spotify listens data. That would be really, really insightful.
So the labels started pushing the transactional data into Next Big Sound. But here’s the point about productizing your data. So they have all this data, what do they do with it? The first product of Next Big Sound is a visualization dashboard that allows people to explore the data in a productive way—that allows you to discover correlations in the data that you would otherwise never have been able to see if you only had all this data on an Excel spreadsheet somewhere. So that’s a product, it’s productized as an exploratory dashboard.
The data product they’re working on now, the next suite of products, takes all the data they have and unlocks predictive or prescriptive correlations in the data. Right now I would describe their product as descriptive. What they’re working on is finding prescriptive, or predictive, insight by analyzing data and trying to understand correlations between social engagement and transactions, meaning purchases. So that’s an example of a company that has its data assets and is in the process of productizing them in multiple ways.
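To make “correlations between social engagement and transactions” concrete, here is a minimal sketch that computes a Pearson correlation coefficient between two series. It is an illustration only, not Next Big Sound’s analysis: the `pearson` helper is written for this example, and the weekly `mentions` and `purchases` figures are made up.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Invented weekly figures for one artist (illustrative only).
mentions = [120, 150, 300, 280, 500]   # social engagement
purchases = [10, 14, 25, 27, 41]       # transactions

r = pearson(mentions, purchases)
print(round(r, 2))
```

A coefficient near 1 says the two series move together; it does not by itself show that engagement drives purchases, which is part of why moving from descriptive to predictive products is harder.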
You’ve also observed that data almost always requires some sort of “packaging” in order to become productized.
Correct. Data alone is very rarely a product. It’s what you do with it that makes a product useful.
Have any of your portfolio companies become profitable yet?
Yes, but profitability is not important at this stage. It’s all very early. We actually don’t necessarily want them to be profitable yet—they’re trying to invest in the business right now, so profitability is not a key metric now. These companies are all less than three years old.
Is there one that you’ve made a significantly bigger investment in than others?
There’s a handful that we’ve made pretty large investments in over time: multimillion-dollar investments. Most of our investments are between a million and a couple million, but there are some that are a couple hundred thousand.
Roger Ehrenberg, IA Ventures’ founder and managing partner, has blogged about data scientists vs. data entrepreneurs, whom he thinks of as more focused on application and competitive advantage. Is that your feeling?
They’re different but they’re also related. I think of a data entrepreneur as somebody who is extremely thoughtful about leveraging data to disrupt industries, somebody who has the vision to take this data and do great things with it. A data scientist is somebody who’s doing exploratory work on the data. But they’re not in conflict with each other. They can be the same person; they can be two different people.
You would agree these companies need both talents.
The second IA Ventures fund was launched earlier this year, so I presume that means you see many more opportunities in this field. Any guess about how big the big data market could be?
I don’t have a guess on the size of the market except that it’s massive. We think that every industry on the planet is going to have to figure out how to leverage their data to become better at what they do, whether it’s in healthcare, financial services, retail, marketing, insurance, you name it. Data is going to be a major, major, major factor in the products and offerings of these industries.
So there’s going to be big money in big data?
I think so. We hope so. That’s the thesis.