In this ai.nl series, we feature the most promising AI companies in the world: where they come from, what they have achieved and what their plans are for the future. In this episode, we are looking at Kaggle, a San Francisco, California-based platform for predictive modelling and analytics competitions and consulting.
Kaggle entered the data science field in 2010 by offering machine learning competitions. These competitions were like a spelling bee for ML enthusiasts, testing skill and talent in data science.
Since its founding in 2010, the platform has evolved considerably and has been acquired by search giant Google. Headquartered in San Francisco, Kaggle now operates as an online community of data scientists and machine learning practitioners. As a private company, Kaggle raised around $16M and reshaped machine learning in unprecedented ways. Here is a look at the history and future of the Google subsidiary that turned ML into a competitive landscape.
A gaggle of data scientists
When Kaggle got its start in 2010, it was mostly described as a gaggle of data scientists. Kaggle was started by Anthony Goldbloom and Ben Hamner. The service got an early start and managed to stay ahead of its competitors by focusing on its specific niche. Within a few years, Kaggle became a household name for data science and machine learning competitions.
The key personnel at Kaggle included Goldbloom and Jeremy Howard. Nicholas Gruen was the founding chairperson of the company before being succeeded by Max Levchin. In 2011, Kaggle raised equity valuing the company at $25M, and in 2017, the platform announced that it surpassed 1 million registered users.
Kaggle calls its registered users Kagglers and is now estimated to have over 8 million of them, spanning 194 countries. The biggest moment in Kaggle’s history came in March 2017, when Google announced its plan to acquire the platform at the Google Next event. Google’s acquisition was all about getting access to that devoted community, and it capped off Kaggle’s success as a platform dedicated to data science and machine learning.
Kaggle: key members
- Anthony Goldbloom (Co-founder and CEO)
- Ben Hamner (Co-founder and CTO)
- Jeff Moser (Chief Architect)
- William Cukierski (Head of Competitions and Data Scientist)
Kaggle: timeline of major events since its founding
- April 2010: Kaggle gets its official start
- November 2011: Kaggle raises funding through Series A
- March 2017: Google acquires Kaggle
- March 2017: Two Sigma Investments fund runs a competition to code a trading algorithm
- June 2017: Kaggle announces that it passes 1 million registered users
How does a competition work on Kaggle?
- A competition hosted on Kaggle begins with the host preparing the data and a description of the problem. The participants then experiment with different techniques and compete against each other to produce the best models. Submissions are made through Kaggle Kernels, through manual upload, or using the Kaggle API.
- For most competitions, submissions are scored immediately on their predictive accuracy relative to a hidden solution file. The scores are also summarised on a live leaderboard.
- Once the deadline passes, the competition host pays the prize money to the winner in exchange for a “worldwide, perpetual, irrevocable and royalty-free licence […] to use the winning Entry.” This entitles the host to use the algorithm, software, and related IP developed.
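The scoring and leaderboard steps described above can be illustrated with a minimal sketch. This is not Kaggle's actual implementation: the function names, the team names, the toy data and the choice of plain classification accuracy as the metric are all assumptions made for illustration, and real competitions each define their own evaluation metric.

```python
from typing import Dict, List, Tuple

# Hypothetical hidden solution file: row id -> true label.
# Participants never see this; only the scoring code does.
HIDDEN_SOLUTION: Dict[int, int] = {1: 0, 2: 1, 3: 1, 4: 0}


def accuracy(submission: Dict[int, int], solution: Dict[int, int]) -> float:
    """Return the fraction of solution rows predicted correctly.

    Rows missing from the submission count as wrong (an assumption).
    """
    correct = sum(
        1 for row_id, truth in solution.items()
        if submission.get(row_id) == truth
    )
    return correct / len(solution)


def leaderboard(
    submissions: Dict[str, Dict[int, int]],
    solution: Dict[int, int],
) -> List[Tuple[str, float]]:
    """Score every team and rank by score (highest first), then by name."""
    scores = {team: accuracy(pred, solution) for team, pred in submissions.items()}
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical submissions from three teams.
example_submissions: Dict[str, Dict[int, int]] = {
    "team_a": {1: 0, 2: 1, 3: 0, 4: 0},  # one wrong prediction
    "team_b": {1: 0, 2: 1, 3: 1, 4: 0},  # all correct
    "team_c": {1: 1, 2: 1},              # incomplete submission
}

for rank, (team, score) in enumerate(
    leaderboard(example_submissions, HIDDEN_SOLUTION), start=1
):
    print(f"{rank}. {team}: {score:.2f}")
```

Keeping the solution file on the host side is what lets the leaderboard update as soon as a submission arrives, without revealing the true labels to participants.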
Kaggle: impact on data science and the future
The impact of Kaggle on data science in general, and on machine learning in particular, is hard to overstate. The competitions on the platform have ranged from enhancing gesture recognition on Microsoft Kinect to improving CERN’s ability to search for a Higgs particle.
The biggest impact of Kaggle competitions has come in the form of furthering the state of the art in HIV research, chess ratings, and traffic forecasting. In a competition hosted by Merck, Geoffrey Hinton and George Dahl used deep neural networks to win. Vlad Mnih, a student of Hinton, used deep neural networks to win a competition hosted by Adzuna.
Kaggle competitions have also paved the way for a number of academic papers based on participants' findings, and the live leaderboard has often pushed participants to improve their models. Google made its intent abundantly clear when Fei-Fei Li, the then chief scientist of AI/ML at Google Cloud, announced the acquisition on the Google Cloud Platform blog.
With the acquisition of Kaggle, Google has lowered the barrier to entry for future AI professionals. Li also mentioned the intent to make AI open to a growing network of application developers.
It was widely speculated that Google acquired Kaggle for its user base but in reality, the search giant gained access to the data generated by the competitions held on its platform. It also has access to Kaggle Kernels, which are environments used to store input, output, and code needed for each analysis.
The entire suite of AI tools offered by Google Cloud for developers and data scientists is essentially the future of Kaggle. Google learnt from Kaggle what data is needed to build effective ML models and then built tools that make it easier for developers and data scientists to build their own models.
For Google, it was clear from the start that it wanted to be the provider of the tools and infrastructure needed by AI developers. With Kaggle and Google Cloud, the search giant has realised that ambition, and its AI tools are considered to be among the most robust and able to accelerate deployment and learning.