Stardog, the leading Enterprise Knowledge Graph platform, is excited to work with Databricks and its customers to create the ultimate semantic data layer for the modern data and analytics stack, including, critically, data lake and lakehouse architectures, all to accelerate time to insight. To that end, we’ve been working closely with Databricks on a number of strategic product initiatives, about which we will have more to say in a few weeks.
This initiative led us to sponsor and attend the recent Data & AI Summit in San Francisco in late June, where we brought an excited Stardog team to spread the word about the semantic layer.
Here is a trip report of the highlights of a very busy week.
To Democratize Insight, Move the Enterprise from Columns to Concepts
Over 6k technologists, every one of them vaxxed and pumped, gathered at the Moscone Center in San Francisco from 27 to 30 June. Pandemic be damned! The Summit kicked off with Databricks CEO Ali Ghodsi telling a strategic product story featuring the Databricks Lakehouse: its strong adoption across the industry, along with TPC benchmarks showing performance and cost benefits that put it far ahead of several unnamed competitors, whose identities we will leave as an exercise for the reader!
That’s important traction, and it shows how Stardog’s “golden triangle” product strategy (briefly, unifying data storage, governance, and analytics with a semantic layer) is aligned with a big wave in the industry. Ali also stressed that 80% of Databricks customers have adopted a multicloud strategy, which aligns with our goal to integrate and unify data at the computation layer of the modern data stack rather than at the storage layer only. Stardog believes that widespread adoption of the hybrid multicloud increasingly makes data location less important than data meaning, and data integration platforms need to follow suit.
Ali welcomed a series of Databricks founders and leaders to the main stage in a blitzkrieg of product announcements, including:
Spark Connect: With Spark Connect, users can access Spark from any device. The client and server are decoupled in Spark Connect, allowing developers to embed Spark into any application and expose it through a thin client. This client is programming language-agnostic, works even on devices with low computational power, and improves stability and connectivity.
Delta Lake 2.0 is now fully open sourced. The announcement drew huge applause from the audience; the move itself is presumably a response to the growing popularity of Apache Iceberg.
Databricks Unity Data Catalog went GA, providing a unified governance layer for all data and AI assets. It creates a single interface to manage permissions for all assets, including centralized auditing and lineage. This solidifies Databricks as a serious enterprise data platform for organizations to manage their most important data assets securely.
This is one that Stardog is especially excited about since standalone or “pure play” data governance solutions are less attractive than data catalogs that live, at least conceptually, very close to the data.
Unity also provides better insight into an otherwise vast resource of tables and columns across the Lakehouse.
We were very excited that Stardog was one of a handful of early partners announced at the summit supporting integration with Unity as part of a modern data stack. We’re embracing Unity Data Catalog and aligning it with Stardog’s own recently launched Knowledge Graph Catalog because doing so benefits our mutual customers in several ways:
- As we help Databricks users build a semantic data layer, we move their view of enterprise data from tables (only) to tables unified by a graph, that is, from “Columns to Concepts.” For that to work, the Stardog platform must remain in sync with the Lakehouse, and we keep it in sync by consuming and acting on Unity Data Catalog metadata.
- Our Virtual Graph capability — which lets Stardog users query and join across data lake, Lakehouse, and non-Databricks enterprise data — will consume Unity Data Catalog metadata to enable automated knowledge graph mappings, which will accelerate time to insight and increase user and customer joy.
- Our golden triangle strategy, to unify data storage, governance, and analytics, becomes operationally simpler and more elegant for our users since now Databricks has added a second point of the triangle with Unity.
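To make the “Columns to Concepts” idea a little more concrete, here is a minimal Python sketch of how catalog metadata could drive an automated mapping proposal. Everything in it is an assumption made for illustration: the `Table` and `Column` classes, the `propose_mapping` function, and the naming rules are hypothetical, and the sketch uses neither the Unity Data Catalog API nor Stardog’s actual mapping syntax.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    """Hypothetical stand-in for one column entry in catalog metadata."""
    name: str
    data_type: str


@dataclass(frozen=True)
class Table:
    """Hypothetical stand-in for one table entry in catalog metadata."""
    catalog: str
    schema: str
    name: str
    columns: tuple


def to_concept_name(identifier: str) -> str:
    """Turn a snake_case identifier such as 'customer_orders' into 'CustomerOrders'."""
    return "".join(part.capitalize() for part in identifier.split("_") if part)


def to_property_name(identifier: str) -> str:
    """Turn a snake_case identifier such as 'first_name' into 'firstName'."""
    concept = to_concept_name(identifier)
    return concept[:1].lower() + concept[1:]


def propose_mapping(table: Table) -> dict:
    """Propose a table-to-concept mapping from metadata alone (illustrative only)."""
    return {
        # Fully qualified three-part name of the source table.
        "source": f"{table.catalog}.{table.schema}.{table.name}",
        # The table becomes a business concept (a class in the graph).
        "concept": to_concept_name(table.name),
        # Each column becomes a property of that concept.
        "properties": {c.name: to_property_name(c.name) for c in table.columns},
    }


if __name__ == "__main__":
    orders = Table(
        catalog="main",
        schema="sales",
        name="customer_orders",
        columns=(Column("order_id", "bigint"), Column("first_name", "string")),
    )
    print(propose_mapping(orders))
```

A real integration would refine such proposals with an existing business ontology and human review; the point of the sketch is only that table and column metadata already carries enough structure to bootstrap a first draft of a mapping.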
Better together indeed! Convergence across vendor partners driven by deep and innovative product strategy will put the Databricks ecosystem, with Stardog as a critical player, ahead of the game, and our mutual users and customers will be the ultimate beneficiaries. That leads me to the next announcement at the summit.
Partner Connect went GA as well. With a partner-first approach, Databricks is rapidly expanding its ecosystem, bringing new and exciting capabilities that operate seamlessly with the Lakehouse and enabling customers to deploy a complete solution that fits their needs and delivers fast time to value. Stardog is in the process of becoming a certified partner in the coming month.
The convergence of Data and AI as the new paradigm was certainly well received and applauded by the 6k attendees in the Moscone Center. Databricks has mapped out a path to the future, and we are well aligned with our partner’s worldview. The only way to truly democratize data access and, thus, drive insights across the modern enterprise is to re-present tabular data at the storage layer in terms of a graph of business meaning at the computation layer. To democratize insight, we have to move the enterprise from columns to concepts. That’s what a real semantic data layer is, and that’s the Stardog value proposition in the Databricks Data Lake and Lakehouse ecosystem.
That Databricks ecosystem is growing quickly, and we’re very happy to be part of it. The world of data and AI is just getting hotter, and we can’t wait to see what’s next.