Born from the need for data science tools: a look at Databricks’s history, key members, major achievements, and future

In this series, we feature the most promising AI companies in the world: where they come from, what they have achieved, and what their plans are for the future. In this episode, we look at Databricks, a San Francisco, California-based data and AI company that helps customers unify their analytics.

Data analytics and data management are now essential to businesses. The demand for big data technology continues to grow, and it is not expected to slow anytime soon. The two stalwarts in this space are data cloud company Snowflake and data analytics company Databricks.

Databricks was valued at $38B when it raised $1.6B last year, and the two companies are increasingly locking horns with cloud platform providers like Amazon, Google, and Microsoft. The story of Databricks is especially interesting as it expands its product portfolio and prepares to challenge Snowflake.

Born out of UC Berkeley

Databricks was started in 2013 by Ali Ghodsi, Andy Konwinski, Arsalan Tavakoli-Shiraji, Ion Stoica, Matei Zaharia, Patrick Wendell, and Reynold Xin as a web-based platform for working with Spark. The platform, made by the creators of Apache Spark (an open-source unified analytics engine for large-scale data processing), provided automated cluster management and IPython-style notebooks.

The interesting thing about Databricks is that it grew out of the AMPLab project at the University of California, Berkeley. Delta Lake, developed by Databricks, is an open-source project aimed at bringing reliability to data lakes for machine learning and other data science use cases.

In 2017, Microsoft announced Databricks as a first-party service on Microsoft Azure via the integration called Azure Databricks. In February 2021, Databricks announced working with Google Cloud to provide integration with the Google Kubernetes Engine and the BigQuery platform.

Databricks has said its cloud data analytics platform is used by more than 5,000 organisations. The platform is proof of how the database war has moved from an on-premise battle to a battle among cloud-based service platforms.

Databricks: key members

  • Ali Ghodsi (Co-founder and Chief Executive Officer)
  • Andy Kofoid (President, Global Field Operations)
  • David Conte (Chief Financial Officer)
  • Amy Reichanadter (Chief People Officer)
  • Trâm Phi (SVP and General Counsel)
  • Ron Gabrisko (Chief Revenue Officer)
  • Rick Schultz (Chief Marketing Officer)
  • Hatim Shafique (Chief Customer Officer)
  • Fermín Serna (Chief Security Officer)
  • Naveen Zutshi (Chief Information Officer)
  • Vinod Marur (SVP of Engineering)
  • David Meyer (SVP of Products)
  • Adam Conway (SVP of Products)

What products does Databricks offer?

While Snowflake offers a data warehouse as a service, Databricks develops and sells a cloud data platform called the “lakehouse.” The clever marketing name is a combination of data warehouse and data lake. The lakehouse built by Databricks is based on the open-source Apache Spark framework and allows analytical queries against semi-structured data.

In June 2020, Databricks introduced Delta Engine, a new query engine that layers on top of Delta Lake to boost query performance. It was followed by the introduction of SQL Analytics, now called Databricks SQL, which is used for running business intelligence and analytics reporting on top of data lakes.

In addition to these core products, Databricks also offers a platform for other workloads, including machine learning, data storage and processing, streaming analytics, and business intelligence. It has also engaged in community building in the form of online courses about Spark and a conference called Data + AI Summit.

Databricks: timeline of key events

  1. 2013: Founded by the original creators of Apache Spark
  2. 2013: Raises $13M in a Series A funding round led by Andreessen Horowitz
  3. 2016: Raises $60M in Series C from New Enterprise Associates following a $33M Series B in 2014
  4. 2017: Announces partnership with Microsoft
  5. 2018: Hits $100M annual recurring revenue
  6. 2019: Raises $250M in Series E led by Andreessen Horowitz with participation from Microsoft
  7. 2021: Raises $1B in Series G led by Franklin Templeton and $1.6B in Series H led by Morgan Stanley

Databricks and the future of cloud big-data technology

If you think of Amazon, Google, and Microsoft when you hear the word cloud, you are correct. However, if you think of Databricks or Snowflake, you are definitely in tune with the big data technology world. The future of Databricks is one that will essentially shape the cloud-based data warehousing and data analytics industry.

Gartner estimates that the database management market has more than doubled to $65B from $25B between 2011 and 2020. The technology research and consulting firm also says that this growth is led by cloud databases. In the cloud database market, the competition is clearly between Snowflake and Databricks.

Snowflake has reported year-over-year revenue growth of 102 per cent, while Databricks has raised large sums to challenge Snowflake and other big data startups. With IDC projecting that the database market will become a $100B industry by 2025, the cloud database market will become one of the largest and fastest growing in tech.

The central story of this growth will not be the competition between Oracle and Microsoft; instead, it will be the competition between Databricks and Snowflake. The upstarts are increasingly trying to offer products that were previously the forte of others.

With technologies like data warehouses, data lakes, and artificial intelligence taking prominence, the race is on between Snowflake and Databricks to increase their market share. One of the predicted trends is that Oracle, IBM, and SAP will lose their stronghold to specialised data management companies like MongoDB, Databricks, Snowflake, Yugabyte, and others.

The success of Databricks can be attributed to its original focus on fulfilling the need for data science tools. Its lakehouse, which combines elements of a data warehouse and a data lake to analyse data, was radically different and designed to drive digital transformation. With Snowflake encroaching on its business, Databricks will need to challenge not only another upstart but also big tech giants like Amazon and Google.
