Big data is a catch-all term, but today's trends, from sensor data and wearable computing to the internet of things and predictive analytics, all revolve around analyzing large amounts of data. Marc Andreessen, the famous Silicon Valley VC, calls Data Scientists the new rockstars and expects the startups he invests in to have a world-class plan to collect and analyze data. Greylock Partners, another VC firm, has a Data Scientist in Residence to assess the proposition and skills of startups. More and more, derivatives of data ARE the product. A good example is Distimo, one of the more successful UtrechtInc alumni.
Data by Design
Most Google services are free because your data IS the product. Target, a US retailer, is venturing into mobile payments to get more data on its customers. Gild scrapes Stack Overflow and GitHub to help HR departments find talented coders. I encourage startups to review their platforms and apps for hidden opportunities. Startups and large corporations alike are not leveraging their data enough. Quality software engineering is often the weapon of choice to beat the incumbent, but data is ignored or abused. The data that you collect is a function of your app or platform: the design of your product determines what data you’ll be aggregating. As a result, unique data-driven features and even complete business models are often missed. Many applications require longitudinal data, which you can only gather over time, so you want to plan ahead, just like with most other matters in life.
You wouldn’t accept it if Google just showed you a bar chart of the e-mails you marked as spam. You expect them to learn from everyone’s data what spam is and to act on it. People who can extract predictive value from large amounts of data can make a big impact on the bottom line and bring a competitive edge. Netflix was one of the first to run a competition to improve the algorithm behind its recommender system. The business case is simple: the quality of the recommender system is the main driver of customer retention and up-selling.
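To make the spam example concrete, here is a minimal sketch of the difference between describing data and learning from it. It is a toy word-count classifier (naive Bayes with add-one smoothing) in plain Python, not how Google or any real mail provider does it. The function names `train` and `spam_score` and the four sample reports are invented for illustration.

```python
import math
from collections import Counter


def train(reports):
    """Count words per label from (text, is_spam) pairs reported by users."""
    word_counts = {True: Counter(), False: Counter()}
    label_counts = Counter()
    for text, is_spam in reports:
        label_counts[is_spam] += 1
        word_counts[is_spam].update(text.lower().split())
    return word_counts, label_counts


def spam_score(text, model):
    """Return log-odds that `text` is spam; above zero leans towards spam.

    Assumes the training data contains at least one spam and one
    non-spam message, otherwise the prior below divides by zero.
    """
    word_counts, label_counts = model
    vocab_size = len(set(word_counts[True]) | set(word_counts[False]))
    spam_total = sum(word_counts[True].values())
    ham_total = sum(word_counts[False].values())

    # Start from the prior: how common spam is overall.
    score = math.log(label_counts[True] / label_counts[False])
    for word in text.lower().split():
        # Add-one smoothing so unseen words do not zero out the estimate.
        p_spam = (word_counts[True][word] + 1) / (spam_total + vocab_size)
        p_ham = (word_counts[False][word] + 1) / (ham_total + vocab_size)
        score += math.log(p_spam / p_ham)
    return score


# Hypothetical user reports, made up for this example.
reports = [
    ("win a free prize now", True),
    ("free money claim your prize", True),
    ("meeting notes for monday", False),
    ("lunch on monday at noon", False),
]
model = train(reports)
```

Calling `spam_score("claim your free prize", model)` gives a positive score and `spam_score("notes for lunch monday", model)` a negative one, even though neither message appears in the reports. That is the step from a bar chart to a product feature: the model generalizes from what everyone flagged to messages nobody has seen yet.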
Kaggle for Startups
One way to get your hands on a data wizard for your startup is through Kaggle. Kaggle is a platform where organizations like Facebook and GE publish large data sets and launch predictive modeling contests. The person or team that comes up with the best solution can sometimes win up to $3M. For a small fee, Kaggle offers startups the possibility to crowdsource ideas and solutions.
Guest post by Koen Havlik, Partner & Data Scientist at Algoritmica. He periodically organizes the Data Science NL meetup, hosted at UtrechtInc.
Image credit: Barbu doru