Property Graphs meet Stardog
You can attach properties to edges in your RDF graph with Stardog’s new edge property capability that bridges the gap between RDF graphs and property graphs.
Stardog allows users to connect and query disparate data sources using a unified graph data model that can be enriched with logical definitions and user-defined rules to encode business logic. The graph model used by Stardog is based on the Resource Description Framework (RDF) standardized by the World Wide Web Consortium (W3C).
RDF is a generic graph data model not much different from the property graph model popularized by graph systems such as Neo4j and Apache TinkerPop. RDF's use of unique identifiers (IRIs), reusable data models (RDF schemas and OWL ontologies), and a standardized mapping language (R2RML) for bringing external data sources (e.g. relational databases) into RDF sets it apart from property graphs as the unifying access layer for disconnected data sources. Despite these advantages, one feature of property graphs has not been directly representable in RDF graphs: edge properties.
One advantage of the RDF data model is its simplicity. The RDF graph can be defined as a set of edges (or triples in the RDF terminology). In RDF graphs there is no distinction between the attributes of a node or the relationships between nodes as in property graphs; everything is represented as edges. So the fact that a person works at a company is a simple edge between two nodes in the graph showing this relationship:
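As a minimal sketch of this "everything is an edge" idea, the snippet below models a graph as a plain Python set of (subject, predicate, object) tuples. The strings stand in for IRIs and literals, the `:name` attribute is an assumption added for illustration, and none of this is Stardog's API.

```python
# A toy in-memory RDF graph: a set of (subject, predicate, object) tuples.
# The ":name" strings stand in for IRIs; this is not a real RDF library.
graph = {
    (":Alice", ":worksFor", ":ACME"),  # relationship between two nodes
    (":Alice", ":name", '"Alice"'),    # attribute (hypothetical), also just an edge
}


def objects(graph, subject, predicate):
    """Return every object reachable from `subject` via `predicate`."""
    return {o for (s, p, o) in graph if s == subject and p == predicate}


print(objects(graph, ":Alice", ":worksFor"))
```

Attributes and relationships are looked up the same way, which is the simplicity the triple model buys.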
But, as we know, the real world is more complicated and there are many nuances to any given relationship. For example, relationships have temporal aspects; we might want to record that Alice started working for ACME in 2010. There is no direct way to attach this information to an RDF triple. There are several well-known workarounds, but they introduce additional nodes in the graph, which can be unintuitive and can cause the graph size to increase dramatically.
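To make the size cost concrete, here is a rough sketch of one such workaround, standard RDF reification, which describes a triple through an extra statement node. The post does not say which workarounds it has in mind, so treating reification as the example is an assumption; the `reify` helper and the `_:stmt1` identifier are hypothetical.

```python
RDF = "rdf:"  # prefix for the standard RDF reification vocabulary


def reify(statement_id, triple, properties):
    """Describe `triple` via an extra statement node, then hang properties off it."""
    s, p, o = triple
    triples = {
        (statement_id, RDF + "type", RDF + "Statement"),
        (statement_id, RDF + "subject", s),
        (statement_id, RDF + "predicate", p),
        (statement_id, RDF + "object", o),
    }
    # Each edge property becomes one more triple on the statement node.
    triples |= {(statement_id, prop, value) for prop, value in properties.items()}
    return triples


edge = (":Alice", ":worksFor", ":ACME")
graph = {edge} | reify("_:stmt1", edge, {":since": "2010"})
print(len(graph))  # 1 original edge + 4 reification triples + 1 property = 6
```

A single annotated edge turns into six triples and one extra node, and every query that wants the start year has to navigate through that node.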
The recent RDF*/SPARQL* proposals extend the RDF model to allow properties to be attached to triples. This way we can attach the temporal information to the worksFor edge in our graph:
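One way to picture this in code, as a minimal sketch rather than a description of how Stardog stores data: let a triple itself appear in the subject position of another triple. The `edge_properties` helper is hypothetical.

```python
# The base edge is an ordinary triple.
edge = (":Alice", ":worksFor", ":ACME")

graph = {
    edge,
    (edge, ":since", 2010),  # the triple itself is the subject of another triple
}


def edge_properties(graph, edge):
    """Collect the properties attached directly to `edge`."""
    return {p: o for (s, p, o) in graph if s == edge}


print(edge_properties(graph, edge))
```

Compared with the workarounds mentioned earlier, no intermediate node is needed: the annotation costs exactly one additional triple.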
There are many different qualifications for relationships that could take advantage of edge properties. For example, in an earlier blog post we described how an organization extracted knowledge from unstructured news articles. There is inherent uncertainty associated with the ML and NLP algorithms involved in such a process, and we might want to record that uncertainty on the relationships extracted from text documents. We can include the source document as another edge property. Alice's role in the company would be another property we can attach to the worksFor relationship:
It is also possible to use named graphs in RDF to attach this kind of metadata to a set of RDF triples, but edge properties make it possible to do this at a finer granularity.
One key difference between RDF edge properties and edge attributes in property graphs is that the value of an RDF edge property is another node in the graph. In property graphs there is a clear distinction between edges and attributes, and attribute values are just strings (or typed primitive values), not things, which runs contrary to the main principle of knowledge graphs.
Because the values of RDF edge properties are nodes in the graph, they can be either literals (e.g. strings) or IRIs that have further connections. For example, the CEO role in the above example would be an IRI in the graph that could be linked to additional context like a human-readable description, what the abbreviation stands for, different spellings of the term and maybe even a taxonomy of organizational roles. Similarly, there would be another node in the graph for the document referenced in the edge property, and that document could be linked to its publisher, which would have its own properties. The links between nodes in the graph are what make it possible to contextualize the information in a knowledge graph, so there is no reason for edge properties to be treated any differently.
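The following sketch illustrates that point with a toy graph in which edge property values are themselves nodes that can be followed further. The `:label`, `:broader` and `:publisher` properties, the `:Executive` and `:ExampleNews` nodes, and the `describe` helper are all assumptions made up for this example.

```python
edge = (":Alice", ":worksFor", ":ACME")
news = "<http://example.com/news>"

graph = {
    edge,
    (edge, ":role", ":CEO"),
    (edge, ":source", news),
    # The edge property values are ordinary nodes with their own outgoing edges.
    (":CEO", ":label", '"Chief Executive Officer"'),
    (":CEO", ":broader", ":Executive"),
    (news, ":publisher", ":ExampleNews"),
}


def describe(graph, node):
    """Return the (predicate, object) pairs leaving `node`."""
    return {(p, o) for (s, p, o) in graph if s == node}


# Follow each edge property value one hop further into the graph.
for _, prop, value in sorted(t for t in graph if t[0] == edge):
    print(prop, value, sorted(describe(graph, value)))
```

With plain string attributes, the second hop from `:CEO` to its label or from the article to its publisher would not be possible.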
When writing and querying edge properties, an extended RDF and SPARQL syntax is needed. The RDF*/SPARQL* proposal defines an extension to Turtle syntax and the above example would look as follows:
<< :Alice :worksFor :ACME >> :role :CEO ;
    :since 2010 ;
    :probability 0.8 ;
    :source <http://example.com/news> .
This syntax works well most of the time, but various Turtle shortcuts such as implicit bnodes ([]), predicate lists (;), and object lists (,) cannot be used inside the << >> patterns. This makes the syntax more verbose when the same subject has multiple outgoing edges, some of which have properties and some of which do not. To address this issue we support an alternative syntax in Stardog that puts edge properties next to the predicates.
The following example shows how additional edges and edge properties can be written for the Alice node using the Stardog syntax:
:Alice :worksFor {:role :CEO ;
                  :since 2010 ;
                  :probability 0.8 ;
                  :source <http://example.com/news>} :ACME ;
       :birthDate {:probability 0.2} "1972-01-01"^^xsd:date ;
       :nationality {:source <http://example.org/Alice>} :USA .
Both syntaxes have the same expressive power and can be translated to each other. We will continue contributing to the standardization efforts for RDF*/SPARQL* and update our syntax support as the specification evolves over time.
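To illustrate how the two notations relate, here is a small sketch that renders the same in-memory edge list in both styles. It only formats strings and does not parse or validate Turtle. The function names and the `(predicate, properties, object)` layout are assumptions, and the `:knows :Bob` edge is hypothetical. Edges with properties are written only in the annotated form, following the examples above.

```python
def to_rdf_star(subject, edges):
    """Render edges in the << >> style, one statement per line."""
    lines = []
    for predicate, props, obj in edges:
        if not props:
            lines.append(f"{subject} {predicate} {obj} .")
        for prop, value in props.items():
            lines.append(f"<< {subject} {predicate} {obj} >> {prop} {value} .")
    return "\n".join(lines)


def to_stardog(subject, edges):
    """Render edges with properties in braces next to the predicate."""
    parts = []
    for predicate, props, obj in edges:
        if props:
            inner = " ; ".join(f"{k} {v}" for k, v in props.items())
            parts.append(f"{predicate} {{{inner}}} {obj}")
        else:
            parts.append(f"{predicate} {obj}")
    return f"{subject} " + " ;\n    ".join(parts) + " ."


edges = [
    (":worksFor", {":role": ":CEO", ":since": "2010"}, ":ACME"),
    (":knows", {}, ":Bob"),  # hypothetical edge without properties
]
print(to_rdf_star(":Alice", edges))
print(to_stardog(":Alice", edges))
```

The braces form keeps a single predicate list for `:Alice`, whereas the << >> form has to repeat the full triple for every annotated statement.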
We will continue to improve our edge property support and plan to include support for edge properties in the context of inferencing, validation, and virtual graphs. I hope you try out the latest version of Stardog and let us know what you think about this new capability in our community forums.