Data Virtualization

The only graph-based virtualization solution on the market

Stardog allows for faster answers to iterative question cycles

What is data virtualization?

Data virtualization is a cost-effective data integration technique because it eliminates the expense of replicating, moving, and storing data multiple times. Virtualization connects to source data directly, reducing the need for a complex and cumbersome ETL system that migrates data from dozens or even hundreds of systems and external vendors into a single repository. Copying data for each new analysis leads to human error and data drift, and to uncertainty about which data sources are trusted, current, or canonical. Because data virtualization provides access to live source data, you get the most up-to-date data every time you ask a question.

  • Organizations utilizing data virtualization as a data delivery style will spend 45% less than those who do not on building and managing data integration processes for connecting distributed data assets.

    - Gartner Market Guide for Data Virtualization, Ehtisham Zaidi, et al., 16 November 2018
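
The difference between copying data and querying it live can be illustrated with a minimal Python sketch. This is not Stardog code: an in-memory SQLite database stands in for a source system, and the `customers` table and the `copy_snapshot` and `virtual_lookup` helpers are hypothetical names chosen for illustration.

```python
import sqlite3

# Hypothetical source system: an in-memory SQLite database stands in
# for a live operational store.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, city TEXT)")
source.execute("INSERT INTO customers VALUES (1, 'Boston')")


def copy_snapshot(conn):
    """ETL-style approach: copy the rows once into a separate store."""
    return dict(conn.execute("SELECT id, city FROM customers").fetchall())


def virtual_lookup(conn, customer_id):
    """Virtualized approach: answer each question from the live source."""
    row = conn.execute(
        "SELECT city FROM customers WHERE id = ?", (customer_id,)
    ).fetchone()
    return row[0] if row else None


snapshot = copy_snapshot(source)

# The source changes after the copy was taken.
source.execute("UPDATE customers SET city = 'Denver' WHERE id = 1")

stale_city = snapshot[1]               # value frozen at copy time
live_city = virtual_lookup(source, 1)  # value read from the source now
print(stale_city, live_city)
```

The copied snapshot keeps the old value after the source changes, which is the data drift described above, while the live lookup reflects the update.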

How does Stardog compare to traditional data virtualization vendors?

Data virtualization has skyrocketed in popularity in recent years, but every standalone data virtualization platform is based on a relational data model. More flexible solutions that incorporate virtualization offer even greater benefits.

Traditional data virtualization platforms are only as powerful as the relational model itself, which means they cannot easily connect semistructured or unstructured data. They can only virtualize data that can be neatly fitted into tables, rows, and columns.

Because these data virtualization platforms lack the power of a semantic graph, they suffer from the same rigidity as other relational systems. While they can protect data lakes from accidental edits, they cannot integrate data that is diversely structured, externally sourced, subject to frequently changing schemas, defined in conflicting ways, or uneven in its properties.

Stardog’s Virtual Graph capability is the most mature and powerful graph-based virtualization solution on the market. Virtual Graphs connect data across silos without copying that data into Stardog. They also provide a direct line of access to external data sources and offer a reliable scale-out mechanism. Stardog can virtualize other Stardog instances as well as other graph systems, including SPARQL endpoints. This gives users the ability to scale out their data fabric across multiple Stardog installations, with each clustered instance storing up to 150 billion data points. See all of Stardog’s Virtual Graph Connectors here.
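
As a rough illustration of what querying a remote SPARQL endpoint in place can look like, the Python sketch below builds a federated query using the standard SPARQL 1.1 `SERVICE` keyword and encodes it as a SPARQL 1.1 Protocol GET request. It does not show Stardog's own Virtual Graph syntax or client API. The endpoint URLs, the `ex:` vocabulary, and the helper names are all assumptions made for this example.

```python
from urllib.parse import urlencode

# Hypothetical endpoints and vocabulary, for illustration only.
ENDPOINT = "https://example.org/sparql"        # endpoint that receives the query
REMOTE_SERVICE = "https://example.com/sparql"  # remote endpoint queried in place


def federated_query(remote_service: str, limit: int = 10) -> str:
    """Build a SPARQL 1.1 federated query that joins local data with
    data left in a remote endpoint (SERVICE clause)."""
    return f"""
PREFIX ex: <http://example.org/schema#>
SELECT ?customer ?order WHERE {{
  ?customer a ex:Customer .
  SERVICE <{remote_service}> {{
    ?order ex:placedBy ?customer .
  }}
}}
LIMIT {limit}
""".strip()


def request_url(endpoint: str, query: str) -> str:
    """SPARQL 1.1 Protocol 'query via GET': the query text travels
    URL-encoded in the `query` parameter."""
    return f"{endpoint}?{urlencode({'query': query})}"


query = federated_query(REMOTE_SERVICE)
url = request_url(ENDPOINT, query)
print(query)
```

The orders stay in the remote system; only the matching results cross the wire when the query runs.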

Since not all data can be virtualized, whether due to regulation or internal policy, Stardog offers both graph virtualization and graph storage in a completely seamless blend. Use both in combination to support the needs of different data owners while still feeding your data fabric with all relevant enterprise data.

Can I use my existing data virtualization platform with Stardog?

Yes! Many of our customers connect their existing data virtualization platform to Stardog just like any other relational data source.

Data Fabric: The Next Generation of Data Management

Build a data fabric to power collaborative, cross-functional projects and products. Escape reactive workflows with a resilient digital foundation.
