Mastiff Storage Engine


Stardog has a new storage engine called Mastiff. Let’s say hi.

A while ago we talked about how we’re moving to a new storage engine for Stardog. Then we proceeded to bury our heads for several months to work through the details of changing a storage engine in a database, so you didn’t hear from us. Now we’re emerging because the new engine is almost ready and will be coming soon to your favorite Knowledge Graph.

We’ve spent 18 months on this project. There was a lot to do. Major architectural changes to the underlying storage implementation impacted all the layers built on top of it. At least that’s what happened here. Next time we’ll know better. Here are some of the major items we’ve taken on in order to deliver our cool new storage engine.

Before the details of what, let’s look super quickly at why: historically Stardog was heavily biased in favor of read performance. Writes weren’t slow but they were definitely subordinate to read performance, especially concurrent writes. After being in the market for a while we realized that we should make reads and writes more closely balanced and that we wanted a new, more performant storage engine, too.

And one more thing: the new storage engine, which is based on RocksDB, opens up a new development phase over the next 18 months, during which we will look at new features like

  1. incremental backup
  2. geo-replication
  3. statement identifiers
  4. bitemporal queries
  5. options to tune Stardog for flash storage
  6. options to tune Stardog for read, write, or balanced priority
  7. a generally lock-free system

Rebalancing Reads and Writes

First, we’ve moved from a B-tree storage engine to an LSM tree. This has resulted in considerably better performance for small writes. The largest gains are seen with batches of fewer than a few thousand tuples. We also see significant performance improvements for concurrent writes. Improving write performance was a large motivating factor for adopting a design based on an LSM tree.
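For readers who haven’t met LSM trees before, here is a toy sketch of the general idea in Python. It is not Stardog’s or RocksDB’s implementation; the class name TinyLSM and its memtable_limit parameter are made up for illustration. Writes go into an in-memory table and are flushed as immutable sorted runs, which is roughly why small writes tend to be cheap: nothing already written gets updated in place.

```python
import bisect


class TinyLSM:
    """Toy LSM tree: an in-memory memtable plus immutable sorted runs.

    Illustrative only; real engines add a write-ahead log, on-disk files,
    bloom filters, and background compaction.
    """

    def __init__(self, memtable_limit=4):
        self.memtable_limit = memtable_limit  # hypothetical tuning knob
        self.memtable = {}
        self.runs = []  # oldest run first; each run is a sorted list of (key, value)

    def put(self, key, value):
        # A write only touches the in-memory table.
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Freeze the memtable into a new sorted, immutable run.
        if self.memtable:
            self.runs.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        # Newest data wins: check the memtable, then runs from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for run in reversed(self.runs):
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None

    def compact(self):
        # Merge all runs into one, keeping only the newest value per key.
        merged = {}
        for run in self.runs:  # oldest first, so newer values overwrite older ones
            merged.update(run)
        self.runs = [sorted(merged.items())] if merged else []
```

The flip side shows up in get: a read may have to look in several runs, which is why real LSM engines lean on compaction and caching to keep reads fast.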

We’ll have more to say about write performance as we move from closed alpha to open beta availability.

MVCC For the People

Second, we’ve moved to a multi-version concurrency control (MVCC) model. We will be providing full support for Snapshot Isolation-style transactions in the same manner as other MVCC databases. The move to an MVCC architecture provides a fully lock-free transaction system. Okay, there are still some latches lying around (concurrency is hard!), but there are no transaction locks to worry about. This has an enormous impact throughout the system, but the main benefits are pretty obvious: multi-writer concurrency is considerably improved and the chance of bugs is reduced, leading to more stability.
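To make snapshot isolation a bit more concrete, here is a minimal single-threaded sketch in Python. It is not how Stardog implements MVCC; the names MVCCStore, Transaction, and ConflictError are hypothetical, and the first-committer-wins rule for write-write conflicts is an assumption borrowed from the textbook description of snapshot isolation. Each transaction reads the versions that were committed when it began, and no reader ever blocks a writer.

```python
class ConflictError(Exception):
    """Raised when another transaction committed a conflicting write first."""


class MVCCStore:
    def __init__(self):
        self.versions = {}  # key -> list of (commit_ts, value), oldest first
        self.last_commit_ts = 0

    def begin(self):
        # A transaction's snapshot is everything committed so far.
        return Transaction(self, self.last_commit_ts)


class Transaction:
    def __init__(self, store, snapshot_ts):
        self.store = store
        self.snapshot_ts = snapshot_ts
        self.writes = {}  # private buffer, invisible to others until commit

    def read(self, key):
        if key in self.writes:
            return self.writes[key]
        # Return the newest version that is visible in this snapshot.
        for commit_ts, value in reversed(self.store.versions.get(key, [])):
            if commit_ts <= self.snapshot_ts:
                return value
        return None

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        # Assumed rule: first committer wins on write-write conflicts.
        for key in self.writes:
            history = self.store.versions.get(key, [])
            if history and history[-1][0] > self.snapshot_ts:
                raise ConflictError(key)
        # A real engine would make this step atomic (this is where latches live).
        self.store.last_commit_ts += 1
        ts = self.store.last_commit_ts
        for key, value in self.writes.items():
            self.store.versions.setdefault(key, []).append((ts, value))
        return ts
```

In this sketch, a reader that began before a concurrent commit keeps seeing the old value for as long as it runs, while a transaction started afterwards sees the new one.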

In fact, because of the lock-free transactions, we are able to dramatically reduce the cost of commits in Stardog Cluster. By eliminating a contention point in the cluster, we were able to leverage the new storage system to increase both raw speed and aggregate throughput. If you are writing a large amount of data, you can be comfortable knowing the amount of time spent blocking other writers is reduced substantially.

Those two changes would be big enough on their own to justify a lot of the effort, but we don’t like to stop while we’re ahead. So we did some other cool stuff, too.

Caching Logic

We also moved our storage caching logic out of the JVM. We’ve always made heavy use of data caching for performance, but we’ve also always had the fundamental problems that come with caching data inside the JVM. The JVM’s garbage collection algorithms make it difficult to increase the heap size and, thus, the cache size without hurting performance. In the worst cases this meant that Stardog wasn’t able to make full use of the underlying hardware. We had put in place several clever solutions that let us move data off-heap during computation, but we were still constrained by the JVM’s heap limits when it came to data storage.

No more.

With the new storage engine, we use off-JVM memory to store cached data from the LSM tree. This means that we are now able to make full use of larger-memory hardware at the storage layer. Bring on those big-RAM machines; we can take it.

Conclusion

Overall, there are huge changes in our new storage engine. We’ll be rolling that engine out over the next few months, including a closed alpha for customers in July, and we’ll be bringing some new features and information along with it. Stay tuned.


