GraphQL and Paths


Stardog 5.1 adds support for GraphQL, an expressive path query SPARQL extension, and stored functions.

Benefits Before Features

Before we dive into the technical details of the new GraphQL and path query support, let’s look at the business value first.

GraphQL Benefit

More developers know and are learning GraphQL than all the graph query languages combined, which is ironic since GraphQL isn’t really a graph query language. Yes, we know, software is weird! GraphQL in Stardog offers the gentlest learning curve to the Enterprise Knowledge Graph and all the attendant data unification benefits it provides.

This will benefit our enterprise customers both now and in the future in several ways:

  1. Gentler learning curve means faster to production with existing developers and technical staff
  2. More developer tooling that just works with Stardog means increased productivity of developers and technical staff
  3. No need to couple some au courant JavaScript framework directly to Stardog; GraphQL acts as a loose coupling between the front-end and all the enterprise data
  4. Loose coupling is important but so are short sight lines, and GraphQL fronting Stardog’s Knowledge Graph capabilities means shorter sight lines between the concerns of enterprise and data architects and UX/front-end developers. Everybody wins.

Path Queries Benefit

As detailed below, we’ve significantly beefed up the graph path capabilities of SPARQL in Stardog. What benefits accrue from that evolution? Essentially this means that pure graph problems which naturally have path-shaped answers are now directly expressible in SPARQL in Stardog, without having to jump into some other tool, API, or language.

This means that Stardog’s SPARQL is the most complete graph query language available, and it’s tied to the most complete data unification platform, too. So business problems that are naturally path-shaped (supply chain, as in NASA’s use of Stardog, as well as many regulatory compliance and related problems in finserv) are part of the Stardog Knowledge Graph.

Sometimes the answer you need is expressible as a simple fact: What is the social security number of person X? What is the risk factor of the investment bank’s Euros position? Stardog Knowledge Graph delivers those answers across all enterprise data silos. But in other cases the answer is not a simple fact but a collection of nodes and edges from the graph, that is, it is a path. For example, what does the supply chain look like for this batch of troublesome parts? Or how are these two financial organizations related to these five transactions and these three shadowy political figures?

Stardog is now the best platform available for answering both kinds of questions over all the distributed data silos.

And now that we’ve addressed the why, we can turn to the how.

Graph Queries and Languages

Stardog 5.1 includes an implementation of a SPARQL extension which we call path queries, whose design we wrote about previously. SPARQL 1.1 added property paths, which Stardog has supported since version 2.0. Property paths allow traversing the graph using regex-like patterns of predicates. However, property paths cannot express some useful queries, nor can they return the paths between pairs of nodes.

We decided to support GraphQL queries over RDF in Stardog, too. While it’s not as expressive as SPARQL, it provides an interface for cleanly describing the structure of graph patterns using a JSON-style template. And Facebook is going to make sure that everyone learns it. We like that. It can be learned quickly and provides access to applications that may otherwise not be able to easily interact with Stardog.

Path Queries

When faced with an analysis task, it’s sometimes necessary to be able to find paths between pairs of nodes. We can do this using SPARQL property paths but there are two significant requirements that property paths can’t satisfy: returning paths between nodes and constraining the edges in a path arbitrarily.

For example, given a database with :knows and :worksWith predicates, we can easily search for paths of people connected by either a “knows” or “works with” relationship. This can be expressed using a SPARQL property path:

?x (:knows|:worksWith)+ ?y

Given a set of results, we don’t know how ?x and ?y are connected. Is it a series of :worksWith relationships indicating that ?x and ?y potentially work together? Or is it a series of :knows relationships indicating that the two are potentially in the same social circles? How many edges are between the two and what if we wanted paths of limited length?

Stardog’s new path queries give us the machinery to do this and much more. We can express that as a path query:

PATHS START ?x END ?y VIA :knows|:worksWith

This query reads very naturally. We want to find paths starting on nodes bound to ?x and ending on nodes bound to ?y. In the VIA clause we give a path expression (as used in property paths) or a graph pattern that describes which edges to traverse. Unlike in property paths, the VIA clause can also be a single variable, which indicates that we want to traverse all edges between nodes and bind each edge’s predicate to that variable.

The interesting thing is that Stardog returns paths, in contrast to the pairs of bindings returned by the property path query:

$ stardog query -f text exampleDB "PATHS START ?x END ?y VIA :knows|:worksWith"

(:Alice)->(:Bob)

(:Alice)->(:Bob)->(:Charlie)

(:Alice)->(:Bob)->(:David)

(:Bob)->(:Charlie)

(:Bob)->(:David)

Here we see the intermediate nodes between each pair of connected nodes. We can also get a binding for each edge in the path:

$ stardog query -f text exampleDB "PATHS START ?x END ?y VIA ?p"

(:Alice)-[p=:knows]->(:Bob)

(:Alice)-[p=:knows]->(:Bob)-[p=:knows]->(:David)

(:Alice)-[p=:knows]->(:Bob)-[p=:worksWith]->(:Charlie)

(:Bob)-[p=:knows]->(:David)

(:Bob)-[p=:worksWith]->(:Charlie)
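To make the difference between pair bindings and paths concrete, here is a minimal Python sketch. It is not how Stardog evaluates path queries; it is a plain breadth-first search over a hypothetical in-memory edge list (EDGES, reconstructed from the output above), and all function names are our own. It returns one shortest path per reachable node, along with the predicate of every hop.

```python
from collections import deque

# Hypothetical edge list reconstructed from the example output above.
EDGES = [
    ("Alice", "knows", "Bob"),
    ("Bob", "knows", "David"),
    ("Bob", "worksWith", "Charlie"),
]


def shortest_paths_from(start, edges):
    """Path-query style answer: breadth-first search that remembers each hop.

    Returns one shortest path per reachable node. Each path is a list of
    (predicate, node) steps leading away from start.
    """
    adjacency = {}
    for subject, predicate, obj in edges:
        adjacency.setdefault(subject, []).append((predicate, obj))

    paths = []
    visited = {start}
    queue = deque([(start, [])])
    while queue:
        node, steps = queue.popleft()
        for predicate, neighbor in adjacency.get(node, []):
            if neighbor in visited:
                continue
            visited.add(neighbor)
            new_steps = steps + [(predicate, neighbor)]
            paths.append(new_steps)
            queue.append((neighbor, new_steps))
    return paths


def reachable_pairs(edges):
    """Property-path style answer: only (start, end) pairs, the route is lost."""
    pairs = set()
    for start in {subject for subject, _, _ in edges}:
        for steps in shortest_paths_from(start, edges):
            pairs.add((start, steps[-1][1]))
    return pairs


def render(start, steps):
    """Format a path in the same style as the text output above."""
    text = f"(:{start})"
    for predicate, node in steps:
        text += f"-[p=:{predicate}]->(:{node})"
    return text
```

Calling render on each result of shortest_paths_from("Alice", EDGES) reproduces the three Alice paths shown above, while reachable_pairs(EDGES) keeps only the endpoints, which is all a property path can tell us.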

Path queries also support arbitrary graph patterns in the VIA clause. This means we can bind variables at intermediate parts of the path and constrain the traversed edges using graph patterns. Let’s examine an example from the documentation:

PATHS START ?x = :Kevin_Bacon END ?y = :Robert_Redford
VIA { ?film a :Film ; :starring ?x , ?y  }

We can recognize the general shape of this query. It’s similar to the previous path query examples. However, here we connect nodes using a SPARQL graph pattern. We want to find actors which are connected by having acted in the same film.

It’s possible to find these pairs using rules and Stardog reasoning but we still don’t know which movies connect the pairs of actors. The path query allows us to do both at once. The graph pattern in the VIA clause supports all SPARQL constructs including FILTER, OPTIONAL, BIND, etc.

Any variables bound in the graph pattern will be included in the edge between two nodes. In this case, we bind the film which connects two actors:

$ stardog query -f text exampleDB "PATHS START ?x = :Kevin_Bacon END ?y = :Robert_Redford \
                                   VIA { ?film a :Film ; :starring ?x , ?y  }"

(:Kevin_Bacon)-[film=:Apollo_13]->(:Gary_Sinise)-[film=:Captain_America]->(:Robert_Redford)

(:Kevin_Bacon)-[film=:Sleepers]->(:Brad_Pitt)-[film=:Spy_Game]->(:Robert_Redford)

(:Kevin_Bacon)-[film=:A_Few_Good_Men]->(:Tom_Cruise)-[film=:Lions_for_Lambs]->(:Robert_Redford)
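One way to picture a graph pattern in the VIA clause is that every match of the pattern acts as one edge, labeled with the variables it binds. The Python sketch below illustrates that idea under stated assumptions: CASTS is a hypothetical dataset reduced to the films and actors in the output above, the function names are ours, and the level-by-level search is only an illustration, not Stardog’s evaluation strategy.

```python
from itertools import permutations

# Hypothetical film casts, limited to what the example output mentions.
CASTS = {
    "Apollo_13": ["Kevin_Bacon", "Gary_Sinise"],
    "Captain_America": ["Gary_Sinise", "Robert_Redford"],
    "Sleepers": ["Kevin_Bacon", "Brad_Pitt"],
    "Spy_Game": ["Brad_Pitt", "Robert_Redford"],
    "A_Few_Good_Men": ["Kevin_Bacon", "Tom_Cruise"],
    "Lions_for_Lambs": ["Tom_Cruise", "Robert_Redford"],
}


def costar_edges(casts):
    """Each pair of actors sharing a film becomes an edge labeled by that film."""
    edges = {}
    for film, actors in casts.items():
        for x, y in permutations(actors, 2):
            edges.setdefault(x, []).append((film, y))
    return edges


def shortest_paths(casts, start, end):
    """Collect every shortest start-to-end path, expanding one level at a time."""
    edges = costar_edges(casts)
    found = []
    seen = {start}
    frontier = [(start, [])]
    while frontier and not found:
        next_frontier = []
        reached = set()
        for actor, steps in frontier:
            for film, costar in edges.get(actor, []):
                if costar in seen:
                    continue
                new_steps = steps + [(film, costar)]
                if costar == end:
                    found.append(new_steps)
                else:
                    next_frontier.append((costar, new_steps))
                reached.add(costar)
        # Mark nodes as seen only after the whole level is processed, so that
        # several equally short paths into the same node are all kept.
        seen |= reached
        frontier = next_frontier
    return found


def render(start, steps):
    """Format a path in the same style as the text output above."""
    text = f"(:{start})"
    for film, actor in steps:
        text += f"-[film=:{film}]->(:{actor})"
    return text
```

With this data, shortest_paths(CASTS, "Kevin_Bacon", "Robert_Redford") yields three two-hop paths, each carrying the film that links the actors at every step.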

Path queries allow much more powerful traversal patterns, extending SPARQL in a significant way. We also gain the ability to use the path between nodes in applications as it’s part of the result set. Path queries are quite versatile and support more features than illustrated here. Check out the complete documentation.

GraphQL Queries FTW

Switching gears, let’s take a look at a simpler language, GraphQL, which was first designed and implemented at Facebook to build generic backends that serve applications. It’s not a powerful graph traversal language, but it is a concise way to represent hierarchical queries. GraphQL returns JSON objects as results, which means they are structured, unlike the flat solutions or sets of triples returned by SPARQL SELECT or CONSTRUCT queries.

GraphQL defines its own data model in terms of a schema. This contrasts with Stardog’s native RDF data model which is not necessarily governed by a schema. Stardog allows but does not require you to define a GraphQL schema. If you do, Stardog only uses it to validate queries and affect the translation between RDF and GraphQL values. Either way, Stardog automagically provides a GraphQL endpoint over your existing RDF graphs.

Let’s jump right in and look at some GraphQL queries using the example data provided in the documentation:

{
   Human(name: "Luke Skywalker") {
     id
     friends {
       id
       name
       homePlanet @optional
     }
   }
}

GraphQL uses a hierarchical structure, like JSON, to represent the query. Requested fields, and constraints on them, are given by what are called selection sets. This query specifies Human(name: "Luke Skywalker") as the outermost selection set, something that might be written in SPARQL as ?x a :Human ; :name "Luke Skywalker".

We’re looking for all humans with the name “Luke Skywalker”. In this case, there should be only one. There are nested selection sets here which say that, for all matching objects, we want the id and friends and for all friends we want their id, name, and, optionally, homePlanet. The @optional syntax is called a directive. In this query, we know not all friends have a homePlanet defined so we mark the field as optional. The result is a nicely structured JSON object:

{
  "data": [
    {
      "id": 1000,
      "friends": [
        { "name": "Han Solo", "id": 1002 },
        { "name": "Leia Organa", "id": 1003, "homePlanet": "Alderaan" },
        { "name": "C-3PO", "id": 2000 },
        { "name": "R2-D2", "id": 2001 }
      ]
    }
  ]
}

We can see that the result matches the structure of the input query. The resulting JSON object contains a data key, the value being an array of results from the query. If there were an error encountered during query evaluation, we would instead receive a JSON object with an errors key and the value would describe the errors and possibly their sources as locations within the query string.
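A client typically branches on those two keys. Here is a minimal Python sketch of that handling; the function and exception names are hypothetical, and it assumes only the response shape described above (a data key on success, an errors key on failure).

```python
import json


class GraphQLQueryError(Exception):
    """Raised when the response body carries an "errors" key."""


def unwrap_graphql_response(body):
    """Return the value under "data", or raise if the server reported errors."""
    payload = json.loads(body)
    if "errors" in payload:
        messages = []
        for error in payload["errors"]:
            # Error entries are usually objects with a "message" field;
            # fall back to the raw value if that field is missing.
            if isinstance(error, dict) and "message" in error:
                messages.append(str(error["message"]))
            else:
                messages.append(str(error))
        raise GraphQLQueryError("; ".join(messages))
    return payload["data"]
```

Fed the JSON result shown above, unwrap_graphql_response returns the list with Luke’s id and his four friends, ready to be used as ordinary Python dictionaries and lists.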

Stardog’s GraphQL implementation uses the standard protocols and is compatible with the plethora of GraphQL clients, including GraphiQL, client-side JavaScript (XHR), and the many language-specific client libraries available to application developers. Check out the comprehensive documentation for more details.
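Because the protocol is plain GraphQL over HTTP, even a standard-library client will do. The Python sketch below only builds the request and does not send it. Treat the specifics as assumptions to check against the documentation: the endpoint path /{database}/graphql, the port in the usage note, and HTTP basic authentication. The function name is hypothetical.

```python
import base64
import json
import urllib.request


def build_graphql_request(base_url, database, query, username, password):
    """Build a POST request carrying a GraphQL query as a JSON body.

    Assumptions: the endpoint lives at <base_url>/<database>/graphql and the
    server accepts HTTP basic authentication.
    """
    url = f"{base_url.rstrip('/')}/{database}/graphql"
    body = json.dumps({"query": query}).encode("utf-8")
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )
```

Against a running server, something like urllib.request.urlopen(build_graphql_request("http://localhost:5820", "exampleDB", query, user, password)) would send the query; the host, port, and credentials there are placeholders for your own deployment.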

Stored Functions

The last thing I want to mention is a new feature we added that allows re-using expression logic, as in FILTER and BIND operators, across queries. We call this feature stored functions. As the name implies, it allows you to define functions that are stored in Stardog and can be used in SPARQL queries, rules, and path queries.

As an example, let’s imagine that we have a pricing formula for holiday discounts on products. We want to offer an early bird discount to loyal customers or those who shop before December 10th. We can define a stored function to compute our discounted price:

$ stardog-admin function add \
  'function discountPrice(?price, ?isLoyal)
  { if(xsd:date(now()) < "2017-12-10"^^xsd:date || ?isLoyal, ?price * 0.8, ?price) }'

This function is now available for use in SPARQL queries and can be replaced without requiring any changes to queries. This helps standardize and centralize application logic and build higher-level queries against the Knowledge Graph. Here’s how we might use our function:

select (?price as ?originalPrice) ?discountPrice where {
  :Widget :price ?price.
  ?cust a :Customer ; :name "John Doe"
  BIND(EXISTS { ?cust a :LoyalCustomer } as ?isLoyal)
  BIND(discountPrice(?price, ?isLoyal) as ?discountPrice)
}

And our result:

+---------------|---------------+
| originalPrice | discountPrice |
+---------------|---------------+
| 100           | 80.0          |
+---------------|---------------+

We can see that the purchase from this customer meets the requirements for the discounted price.
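For readers who want to check the arithmetic outside of Stardog, the same rule can be sketched in a few lines of Python. This mirrors the logic of the stored function above; the constant names and the explicit today parameter (standing in for now()) are our own additions for illustration.

```python
from datetime import date

# Values taken from the stored function definition above.
EARLY_BIRD_CUTOFF = date(2017, 12, 10)
DISCOUNT_FACTOR = 0.8


def discount_price(price, is_loyal, today):
    """Apply the 20% discount to loyal customers or purchases before the cutoff."""
    if today < EARLY_BIRD_CUTOFF or is_loyal:
        return price * DISCOUNT_FACTOR
    return price
```

A loyal customer buying a 100-unit widget after the cutoff gets 80.0, matching the query result, while a non-loyal customer on the same day pays the full 100.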

Stored functions are a small piece of the puzzle, but can significantly reduce code maintenance across many queries using the same building blocks. Stored functions are further described in the documentation.

Onward and Upward

As demonstrated here, we won’t stop till you all rock a Knowledge Graph. We’re building out Stardog 5 every day, bringing new features and scalability improvements in every release. Stardog 5.1 is available now.

