How to Debug Reasoning

Jul 11, 2017, 6 minute read

Reasoning in Stardog is a powerful tool. Here are some hints to make it easier.

Intro

I'm Stardog’s lead support engineer, so when an end user has a problem with the software, it crosses my desk at some point during its lifetime. Suffice it to say I have seen all varieties of issues come in: from simple misunderstandings to mundane one-line fixes to major bugs.

In this post I want to cover one of the most common topics that I see, and that’s reasoning. Which makes sense: reasoning is a very big component of a knowledge graph platform, and one with a lot of specifications, twiddly knobs, and gotchas. I’ll be summarizing the most common types of questions that people have and the most common answers, too.

Note that Stardog includes both logical and statistical reasoning capabilities. In this post I’m focusing only on logical reasoning.

Background

Chances are pretty good you know what reasoning is, what it does, and why it’s important. But just in case, here’s a very quick refresher:

• Reasoning is a declarative way to derive new nodes and edges in a graph, to specify integrity constraints on a (possibly distributed) graph, or to do both at the same time.
• Reasoning can replace arbitrary amounts of complex code and queries.
• Reasoning transforms a graph data structure into a knowledge graph.
• Reasoning in Stardog is fully integrated into SPARQL query evaluation.

A Motivating Example

Take, for example, the following trivial graph in Turtle format:

```
:Square rdfs:subClassOf :Shape .    # This says that All Squares are Shapes
:MySquare a :Square .
```

Any plain graph database can store these 3 nodes (`:Square`, `:Shape`, and `:MySquare`) and 2 edges (`rdfs:subClassOf` and `a`).

Reasoning is the software service that lets Stardog infer some new (implicit, i.e., unstated) information from the known (i.e., explicit) data. In this case, the inference is just one new edge between two existing nodes:

```
:MySquare a :Shape .
```

Stardog doesn’t store inferences by default. Rather, Stardog infers them on the fly as needed when answering queries. That matters when different parts of the graph are distributed over enterprise data silos and need to stay there.
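To make the idea of inferring at query time concrete, here is a toy Python sketch over the two triples from the example. It is only an illustration of the concept: the function names are made up for this post, and it says nothing about how Stardog's reasoner is actually implemented.

```python
# Toy illustration only: this is NOT Stardog's reasoner or its API.
RDF_TYPE = "a"
SUBCLASS_OF = "rdfs:subClassOf"

# The two explicit edges from the example graph.
EXPLICIT = {
    (":Square", SUBCLASS_OF, ":Shape"),
    (":MySquare", RDF_TYPE, ":Square"),
}


def superclasses(cls, triples):
    """Return every class reachable from cls via rdfs:subClassOf edges."""
    found = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for s, p, o in triples:
            if s == current and p == SUBCLASS_OF and o not in found:
                found.add(o)
                stack.append(o)
    return found


def types_of(node, triples):
    """Compute explicit plus inferred types on demand; nothing is stored."""
    explicit = {o for s, p, o in triples if s == node and p == RDF_TYPE}
    inferred = set()
    for cls in explicit:
        inferred |= superclasses(cls, triples)
    return explicit | inferred


print(sorted(types_of(":MySquare", EXPLICIT)))
```

The set of stored triples never changes: `:MySquare a :Shape` only exists in the answer to the question, which is the point of not materializing inferences.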

The Usual Suspects

You can find repeated types of questions in the Stardog Community forums where users aren’t seeing expected query results. These often come down to a reasoning setting or misunderstanding of how it works. Here are a few of the most common questions we have seen.

“I’m not seeing any results!”

The simplest problem to fix, and by extension the easiest thing to check when things aren’t working, is the case where reasoning isn’t enabled at all. “But wait a minute,” I hear you ask, “If reasoning is so important, why would it ever NOT be enabled?”

One answer is that it can be expensive. The other answer is that you should use reasoning in a way that makes sense for your use case. Stardog does not materialize (i.e., explicitly calculate and store) inferences; instead it finds them as needed at query time. So if a query doesn’t need reasoning to get the required results, it makes no sense to make that query pay the cost of computing inferences.

If your query results only contain information that is explicit, this could be the cause. The method of enabling reasoning for your queries depends on how they’re being run:

• CLI: Ensure that `stardog query` is passed either the `-r` or `--reasoning` flag.
• Java: When creating a `Connection` object via `ConnectionConfiguration`, ensure that the `reasoning()` method is called with a `true` value.
• HTTP: Ensure that the `reasoning` query parameter is present in your URL, or form body, with the value `true`.
• Web Console: Ensure that the Reasoning toggle to the upper-right of the query textbox is set to `ON`.
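For the HTTP case, the sketch below builds a query URL that carries the `reasoning` parameter. Only the `reasoning=true` parameter comes from the list above; the host, port, database name `myDb`, the `/{database}/query` path, and the helper name `build_query_url` are assumptions for illustration, so check them against your own server.

```python
from urllib.parse import urlencode


def build_query_url(base_url, database, sparql, reasoning=True):
    """Build a SPARQL query URL, adding reasoning=true when requested.

    The URL layout here is an assumption, not taken from this post.
    """
    params = {"query": sparql}
    if reasoning:
        # The parameter name and value described in the list above.
        params["reasoning"] = "true"
    return f"{base_url}/{database}/query?{urlencode(params)}"


# Hypothetical server address and database name.
url = build_query_url(
    "http://localhost:5820",
    "myDb",
    "SELECT ?type WHERE { ?s a ?type }",
)
print(url)
```

If a query returns only explicit data, printing the final URL like this is a quick way to confirm the parameter is really being sent.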

If this works, then congratulations! If not, read on.

“I’m not seeing the right results!”

Okay, so reasoning is enabled, but what if you’re still not seeing the results that you know you should be seeing? It could be related to reasoning level or to the schema location.

Reasoning Level

You may not see expected results because the wrong reasoning level is being used. A profile or “reasoning level” is a bundle or family of data modeling features (called, for historical reasons, “axioms”) that are often used together. Some levels are more expressive (and thus more expensive) than others, so you want to choose the cheapest one that works. Stardog supports the following reasoning levels: RDFS, QL, RL, EL, DL, SL, and NONE. If you are missing results that you know should be there, check the `stardog.log` file.

Often when we receive issues like this, the log file will contain lines that look like this: `Not a valid SL axiom: Range(...)`.

Typically this means that the reasoning level is set to SL (the default), but the user has included OWL DL axioms, which are not covered by SL. When `stardog.log` shows lines like this, the implication is that the axiom(s) in question will be ignored completely, which is often the reason for the “missing” results, as they depended on the axiom.
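If the log is long, a few lines of Python can pull out the ignored axioms. This sketch relies only on the message text quoted above; the helper name is made up, and the sample line is the one from this post rather than real log output.

```python
# The message fragment quoted in this post.
MARKER = "Not a valid SL axiom:"


def ignored_axioms(log_lines):
    """Return the axiom text from every line that contains the marker."""
    axioms = []
    for line in log_lines:
        _, found, rest = line.partition(MARKER)
        if found:
            axioms.append(rest.strip())
    return axioms


# Sample input: the example line from the post, plus an unrelated line.
sample = [
    "Not a valid SL axiom: Range(...)",
    "some unrelated log line",
]
print(ignored_axioms(sample))
```

To run it against a real file, pass the open file object instead of the sample list, for example `ignored_axioms(open(path_to_log))`, where `path_to_log` points at your own `stardog.log`.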

By default Stardog uses the SL level because it’s the most expressive level that can be computed efficiently over large graphs. You can use the `reasoning schema` CLI command to see which axioms are included during reasoning.

The easiest solution may be to enable the database configuration option `reasoning.approximate`, which will, when possible, split the troublesome axiom(s) into two and use the axiom that fits into the SL level. You can also try using Stardog Rules. Then you can look at rule gotchas to see if there are any issues with how you’re using rules. If you have a very small amount of data, you may try using the DL reasoning level.

Schema Location

Another cause of missing results that we’ve seen is connected to where the Stardog schema is in the graph. The schema here is just the set of axioms you want to use in reasoning. But, as mentioned above, those can be distributed (physically) so Stardog will work hard to find them.

Practically this means that Stardog needs to know which named graph(s) contain the schema. So you may need to check that the `reasoning.named.graphs` property in `stardog.properties` is set to the correct value.

Our documentation has a detailed discussion of other reasons you might not be seeing the results you want. It’s a good read.

“My schema NEEDS axiom X and axiom Y!”

Maybe? But maybe not. Stardog Rules are very powerful and are only getting easier to write. Think of Stardog Rules as Datalog in the graph because Stardog Rules are (basically) Datalog in the graph. Like Datalog, it was never just for Datomic bro!

We’re always on the lookout for ways to improve them and their syntax, so if you find some axioms that can’t be expressed through rules, let us know!

Conclusion

Reasoning is part of what lets Stardog transform standalone data into a true Enterprise Knowledge Graph.

But with great power come lots of questions from eager users. I hope this post helps you jump over some of the most common hurdles that users come across when using reasoning. Either way, I look forward to hearing from you about it!