Data Science made easy, from ingest to production. Powered by Apache Spark™.
The complete solution for data scientists and engineers.
Effortlessly manage large-scale Spark clusters - Spin up and scale out clusters to hundreds of nodes and beyond with just a few clicks, without IT or DevOps. Easily harness the power of Spark for streaming, machine learning, graph processing, and more.
Accelerate your work with an interactive workspace - Work interactively while automatically documenting your progress in notebooks — in R, Python, Scala, or SQL. Visualize data in just a few clicks, and use familiar tools like matplotlib, ggplot or d3.
Run your production jobs at scale - Put new applications in production with one click by scheduling either notebooks or JARs. Monitor the progress of production jobs and set up automated alerts to notify you of changes.
Collaborate interactively - Seamlessly share notebooks, collaborate in the same code base, comment on each other’s work, and track activities.
Publish your analysis with customized dashboards - Build dashboards that present your findings in a few clicks. Set up dashboards to update automatically through jobs.
Connect your favorite apps - Run your favorite BI tools or sophisticated third-party applications on Databricks.
Cluster Manager - Fully managed Spark clusters in the cloud help you focus on your data — not your operations.
Notebooks - An interactive workspace for exploration and visualization so you can learn, work, and collaborate in a single, easy-to-use environment.
Jobs - A production pipeline scheduler that helps you get from prototype to production without re-engineering. Proactive monitoring keeps your data pipeline running reliably.
3rd Party Apps - Run your favorite BI tools or sophisticated third-party applications on Databricks.