Analytics provider Databricks has raised $33 million in Series B funding. New Enterprise Associates led the round with participation from Andreessen Horowitz.
Berkeley, Calif. (June 30, 2014): Databricks, the company founded by the creators of Apache Spark, the powerful open-source processing engine that provides blazingly fast and sophisticated analytics, announced today the launch of Databricks Cloud, a cloud platform built around Apache Spark. In addition to this launch, the company is announcing the close of $33 million in Series B funding led by New Enterprise Associates (NEA) with follow-on investment from Andreessen Horowitz.
“Getting the full value out of their Big Data investments is still very difficult for organizations. Clusters are difficult to set up and manage, and extracting value from your data requires you to integrate a hodgepodge of disjointed tools, which are themselves hard to use,” said Ion Stoica, CEO of Databricks. “Our vision when founding Databricks was to free users to focus on turning data into value, instead of struggling with existing tools and systems. Databricks Cloud delivers on this vision by combining the power of Spark with a zero-management hosted platform and an initial set of applications built around common workflows.”
Databricks Cloud is powered by Spark, a unified processing engine that eliminates the need to stitch together a disjointed set of tools. Spark natively supports interactive queries (Spark SQL), streaming data (Spark Streaming), machine learning (MLlib), and graph computation (GraphX) with a single API across the entire pipeline. Additionally, Databricks Cloud reaps the benefit of the rapid pace of innovation in Spark, driven by the 200+ contributors who have made it the most active project in the Hadoop ecosystem.
The hosted platform also dramatically simplifies provisioning a Spark cluster. Users simply specify the desired capacity of a new cluster, and the platform handles all the details: provisioning servers on the fly; streamlining import and caching of data; handling all elements of security; and continually patching and updating Spark. This frees users from the typical headaches and lets them explore and harness the power of Spark. The platform is currently available on Amazon Web Services, though expanding to additional cloud providers is on the roadmap.
Databricks Cloud comes with a set of built-in applications for those eager to immediately begin using Spark to access and analyze data to better compete in the marketplace:
o Notebooks. Provide a rich interface that lets users perform data discovery and exploration, plot the results interactively, execute entire workflows as scripts, and use advanced collaboration features.
o Dashboards. Create and host dashboards quickly and easily. Users can pick any outputs from previously created notebooks, assemble these outputs in a one-page dashboard with a WYSIWYG editor, and publish the dashboard to a broad audience. The data and queries underpinning these dashboards can be regularly updated and refreshed.
o Job Launcher. Enables anyone to run arbitrary Apache Spark jobs and trigger their execution, simplifying the process of building data products.
“One of the common complaints we heard from enterprise users was that Big Data is not a single analysis; a true pipeline needs to combine data storage, ETL, data exploration, dashboards & reporting, advanced analytics and creation of data products. Doing that with today’s technology is incredibly difficult,” continued Stoica. “We built Databricks Cloud to enable the creation of end-to-end pipelines out of the box while supporting the full spectrum of Spark applications for enhanced and additional functionality. It was designed to appeal to a whole new class of users who will adopt Spark now that many of the complexities of setting up and using Spark have been alleviated.”
Beyond the built-in applications, Databricks Cloud enables users to seamlessly deploy and leverage the rapidly growing ecosystem of third-party Spark applications. Databricks Cloud is powered by the 100 percent open source Apache Spark, meaning that it will support all current and future “Certified on Spark” applications out of the box, and that all applications developed on Databricks Cloud will work across any of the “Certified Spark Distributions.”
“Databricks remains committed to supporting the open-source vitality that made Spark a rich and vital big data processing tool and a valuable commodity to big data users,” said Matei Zaharia, CTO of Databricks. “We will continue to commit significant resources to drive open-source innovation in Spark alongside the community. Furthermore, we look forward to enabling a whole new set of users and developers to experience and leverage the power of Spark to drive enterprise value.”
Databricks Cloud is currently in limited availability with several beta users. Databricks is gradually opening up more capacity, so visit www.databricks.com to learn more about the platform and to join the waiting list for access to the platform that is redefining how enterprises utilize Big Data.
Databricks (databricks.com) was founded by the creators of Apache Spark, and is using technology based on years of research to build an advanced platform for analyzing and extracting value from Big Data. The company believes Big Data is a tremendous opportunity that is still largely untapped, and is working to revolutionize what enterprises can do with it. Databricks is venture-backed by Andreessen Horowitz and NEA.