Data and analytics teams face challenges across enterprise ecosystems, from monetizing data to fast-tracking AI to managing privacy. Gartner has long touted Data Fabric as the tech architecture of choice for achieving these complex data and analytics goals; the topic received over 3,000 inquiries through Gartner in the past year alone.
Perhaps it should come as no surprise that at this year’s Gartner Data & Analytics (D&A) Summit, the “secret ingredient” to a modern Data Fabric was underscored as none other than a knowledge graph.
Beyond its role in Data Fabric, knowledge graph technology has been powering modern data and analytics data architectures in enterprise organizations, serving as a competitive advantage, for some time now.
What is a “Metadata-driven Data Orchestration Powered by Knowledge Graph”?
So what is a knowledge graph, and why is it such a key piece of a “metadata-driven data orchestration” in an enterprise tech stack or Data Fabric? A knowledge graph is a semantic data layer that sits across your existing organizational data assets and connects them with meaning, allowing data consumers to easily search and discover related concepts that are not necessarily directly linked, and thus wouldn’t be readily identified in a relational data format. Knowledge graphs have been a competitive advantage, kept mostly under wraps, across industries from Life Sciences to Finance to Manufacturing, and more.
The Gartner D&A Summit marked the emergence of knowledge graph from under the surface, revealing a key part of the digital transformation iceberg. As enterprises complete composable data and analytics applications, including the trifecta builds of data storage, business intelligence (BI) tools, and data catalogs, they have come to realize that these pillars of data management require a semantic layer that both connects and ascribes business meaning to data: context that is relevant to data consumers and enables adaptive, intelligent decision making about the business. And as enterprises move to create proactive data solutions, data products, and data monetization, and to future-proof their data for disruptive times, a flexible semantic data layer powered by a knowledge graph is breaking through the noise as the technology of choice for modernizing data and analytics.
Knowledge Graph Closes the Gap Between Data and Decision, Empowering Data Consumers
An enterprise knowledge graph connects data based on business meaning rather than storage location in order to democratize data access and insight to everyone in the organization, not just to the people with SQL and data notebook skills, all while protecting the integrity of the underlying data. This accelerates insights across an enterprise, closing the gap between data and decision.
Stardog’s Enterprise Knowledge Graph platform connects, maps, and models data based on business meaning rather than storage location. No-code interfaces help data consumers view inferences and apply business logic. Governance tools help IT manage security, privacy, and compliance.
Why is Knowledge Graph Purpose-Built to Deliver Semantic Meaning vs. Graph Database?
By calling out knowledge graph specifically (vs. graph database), and underscoring business meaning as the driving force, Gartner may be signaling the key role semantic data layers will play in modern data and analytics architectures.
Historically, the term “graph” has referred to two different types of solutions:
- LPG (Labeled Property Graphs) - good for storage; many-to-one relationships; relationships need to be assigned to each data point; only allows for single schema; business logic has to be explicitly added using code
- RDF (Resource Description Framework) - connects data across silos; captures many-to-many relationships; relationships can be inferred; allows for multiple schemas/views over the same data; can easily accept new business rules/logic without changing underlying data
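To make the contrast concrete, here is a minimal sketch in plain Python with hypothetical identifiers (`ex:BWI`, etc.). Real systems would use a graph database or triple store rather than dictionaries; this only illustrates the structural difference the list above describes.

```python
# Hypothetical illustration: the same fact modeled two ways.

# LPG style: labels and properties are attached to each node individually.
lpg_node = {
    "id": "bwi",
    "labels": ["Airport"],           # labels are flat; no hierarchy between them
    "properties": {"name": "BWI"},
}

# RDF style: everything is a (subject, predicate, object) triple,
# and schema statements are triples too, stored alongside the data.
rdf_triples = {
    ("ex:BWI", "rdf:type", "ex:Airport"),
    ("ex:BWI", "rdfs:label", "BWI"),
    ("ex:Airport", "rdfs:subClassOf", "ex:TransportationHub"),  # schema as data
}

# Because the schema is itself data, a generic query can walk it:
schema_facts = [t for t in rdf_triples if t[1] == "rdfs:subClassOf"]
print(schema_facts)  # the class hierarchy is queryable like any other fact
```

The key point: in the RDF shape, adding a new business rule means adding a triple, not rewriting application code or re-indexing nodes.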
Knowledge graphs are built on RDF for good reason. Knowledge is messy, and any given concept can mean different things to different people, carry layers of associations, and be connected to a multitude of other concepts. Given these complexities, capturing knowledge in a machine-readable format can be nearly impossible without the right tool. Knowledge graphs are purpose-built to achieve this goal.
This is why Stardog was built on RDF open standards, developed to represent large-scale information systems and noted for their expressiveness in capturing relationships. Data is linked rather than summarized in edge properties, allowing for multiple data definitions to serve different applications. In this way, RDF serves as a lingua franca over all the data, agnostically connecting data from various locations and formats and enabling powerful inference capabilities that promote discovery of relationships previously hidden, all without moving or changing the underlying data.
Here are some top differentiators of knowledge graphs:
- Inference capabilities - Knowledge graphs are built on RDF standards, linking data in triples to allow for discovery of related concepts previously not explicitly connected in the data.
- Multiple schemas - Flexible, multi-tenant schemas allow for multiple definitions of the same data, so applications can access the same data through different lenses without requiring new copies of it, and data consumers can map, explore, and model their data through the business concepts that are meaningful to their specific use cases or goals.
- Business logic - In LPGs, business logic has to be explicitly added using code. Thus the structure of the graph is tightly coupled to the application (i.e., the specific use case) it serves: if you change the physical structure of the graph, you have to update your application. Compare this to RDF (i.e., knowledge graph), where business logic is defined declaratively rather than procedurally in the codebase, which leads to a less tightly coupled infrastructure.
A mature and hardened knowledge graph platform like Stardog provides all the functional benefits highlighted above, but is built to operate seamlessly within modern data architectures to support enterprise requirements like:
- Data Integration and Connectivity to 150+ source systems - from structured (relational, application) to semi-structured (document stores) and unstructured (text)
- Authentication and Authorization with open identity frameworks like OAuth2, Kerberos, LDAP, MS Active Directory, Keycloak, and PrivateLink, with fine-grained security to adhere to an organization’s attribute-based access control (ABAC) policies
- Scalability and Latency support with clustering, caching, read replicas and geo-replication architecture strategies
- Deployment and Integration across multi-cloud, hybrid, or on-prem infrastructure with modern DevOps support for Dockerization and Kubernetes
- Low-code, No-code user interfaces designed for specific personas from Business Analysts to Data Modelers and Data Engineers
- Data Access Options for SQL users, Python users, or application builders via REST and/or GraphQL
Data vs. Metadata - A Simple Exercise
A layer of abstraction between the data and the metadata can make the difference between finding a relationship across various data points and missing it. To see why, consider how data is structured in different graph types.
As an example, “BWI” is an airport, but it’s also a train station. There might be a third use case where you don’t care about the distinction between the two and simply want to see both as transportation hubs.
In LPG, you have to index BWI as all three things. Each time you label it, it’s another write to a separate index. There’s no hierarchy. You can’t say “all airports are transportation hubs.” You have to indicate that on every single airport. There’s no layer of abstraction between the data and metadata. While the data in LPG is a graph, the metadata is flat. In RDF (i.e., knowledge graph), inference allows the graph to interpret that both airports and train stations are types of transportation hubs via business logic.
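The BWI example can be sketched in plain Python (hypothetical identifiers; a real knowledge graph would evaluate this with a reasoner over SPARQL queries, not dictionaries). BWI is asserted to be only an airport and a train station; the “transportation hub” fact is entailed from two declarative subclass rules, so no node is ever re-labeled.

```python
# Asserted data: BWI carries exactly two types; nothing is indexed three times.
data = {
    ("ex:BWI", "rdf:type", "ex:Airport"),
    ("ex:BWI", "rdf:type", "ex:TrainStation"),
}

# Declarative business logic, stored as triples rather than application code.
schema = {
    ("ex:Airport", "rdfs:subClassOf", "ex:TransportationHub"),
    ("ex:TrainStation", "rdfs:subClassOf", "ex:TransportationHub"),
}

def infer_types(entity):
    """Entailed types: asserted types plus everything reachable via subClassOf."""
    types = {o for s, p, o in data if s == entity and p == "rdf:type"}
    changed = True
    while changed:  # transitive closure over the class hierarchy
        new = {o for s, p, o in schema if p == "rdfs:subClassOf" and s in types}
        changed = not new <= types
        types |= new
    return types

print(infer_types("ex:BWI"))
# BWI is entailed to be a TransportationHub without re-labeling the node
```

Adding a fourth category later (say, “critical infrastructure”) would be one new schema triple, and every airport and train station would be entailed into it automatically.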
Now replace “transportation hubs” in this exercise with your data. This powerful inference capability is how pharmaceutical companies are getting through R&D faster, how manufacturers are proactively addressing supply chain issues in real time, and how banks are preventing fraud before it happens.
Knowledge graph is the secret ingredient in a modern data architecture because it’s purpose-built for the future of data and analytics, no matter how innovative or disruptive that comes to be.
Learn more about knowledge graphs:
- Check out our Head of Products’ take on things in this blog: Turn these Five Friction Areas into Data Sharing Opportunities
- Read up on strategies to upgrade your analytics and AI: Analytics Modernization White Paper
- Try Stardog Cloud for free today using one of our starter kits