25 Years in Semantics: Building the Cognitive Backbone of the Enterprise
I started in this field with a simple question. How can we build smart agents?
That was the question driving the AI lab at the University of Maryland when I got there, and it’s the same question I’m still working on 25 years later. We had data. We had logic. We had compute. What we didn’t have was meaning. Machines could process information, but they couldn’t interpret it, which is what a smart agent actually has to do.
The uncomfortable truth is that the semantic community has spent decades building things that never quite make it into production. We’ve become very good at modeling, but less effective at putting those models into operation. And that gap is where most enterprise knowledge graph projects go to die.
If your knowledge graph isn’t connected to live enterprise data, it isn’t infrastructure. It’s documentation that never made it into the business.
The vocabulary keeps changing. Expert systems. The semantic web. Knowledge graphs. Now context layers, semantic layers, knowledge layers. Every few years, a new term arrives carrying the same promise. This time, machines will finally understand what you mean.
The underlying gap doesn’t change. Your data is scattered across systems that define things differently. Your teams interpret the same numbers in incompatible ways. And your models, when you build them, get built in isolation and never wired into the work.
The skills gap made it worse. For most of those 25 years, the honest answer to “how do I model my enterprise domain rigorously enough to build on” was “hire a PhD ontologist.” That was true, and it killed adoption. It kept semantic technology in the hands of specialists and out of operational systems. The organizations that wanted the benefits did not have the expertise in house, and the people who could build it were too few to scale.
The problem itself hasn’t changed. It’s the same one we had in 2001. What’s different is the urgency. AI agents understand language remarkably well now. Better than any system we’ve ever had. What they don’t understand is what your business specifically means by “customer,” “commitment,” or “active account.” General language understanding isn’t organizational meaning. When the system generating answers picks the wrong interpretation, it stops being a theoretical issue. It becomes an operational problem, and it shows up in production.
The failures I see in the field look different on the surface. A knowledge graph project that never went live. A team that can’t agree on definitions. An AI deployment that gives confident, wrong answers. They appear to be separate problems, but they all stem from the same gap in shared understanding.
Face 1: The knowledge graph that was never mapped to real data.
Face 2: The organization where teams use the same word and mean different things.
Face 3: The AI agent that sounds confident and gets the business wrong.
The rest of this article is about each of those and what they share.
Face 1
You’ve seen this pattern. A team invests in modeling. They define classes and relationships. They produce a clean ontology, often a genuinely good one. Significant effort goes into it, but then it just sits there.
There’s no connection to the operational systems where the data actually lives. No integration with the applications people use to make decisions. The graph exists in a tool somewhere, accurate and useless. Teams go back to dashboards, reports, and manual interpretation. Whatever insight the graph was supposed to enable gets recreated by hand, poorly, in spreadsheets.
There’s a deeper version of this problem. Knowledge isn’t static. The business changes, the data changes, and the definitions drift. A graph that isn’t wired to live operational systems doesn’t just sit isolated. It slowly becomes wrong. By the time anyone asks it a real question, it’s describing an enterprise that no longer exists.
This keeps happening because modeling gets treated as the deliverable. It isn’t. Modeling is the starting point. The deliverable is operational, which means the graph has to be connected to live enterprise data, integrated with the systems people use, and exposed through interfaces that fit how the work actually gets done. None of it is glamorous, but all of it is necessary.
If your knowledge graph isn’t running within the flow of the business and changing with it, it remains a diagram rather than something the business actually uses.
Face 2
Here’s a small example that contains a large problem.
Sales says “commitment.” Legal says “commitment.” Finance says “commitment.” None of them means the same thing. To sales, it’s a verbal handshake at the end of a quarter. To legal, it’s a signed obligation enforceable in court. To finance, it’s revenue that can be recognized under specific accounting rules.
The word is identical. What it refers to is not.
Multiply that across hundreds of terms in a real organization, and you have the operational reality of most enterprises. Your teams aren’t disagreeing on what to do. They’re using the same words with different meanings, and the gap only shows up after decisions are made on incompatible interpretations.
This is where ontology projects fall into the perfect-modeling trap. The instinct is to model harder, define every term precisely, and build the complete domain model. I’ve watched teams spend weeks debating when a “lead” becomes “qualified.” You can refine a definition forever and deliver no business value.
The real risk isn’t that a knowledge graph is wrong. It’s that invisible misalignment becomes operational: inconsistent decisions across functions, conflicting answers to the same question, erosion of trust in data, which is the slowest and most expensive thing to recover. Hallucination is the wrong thing to worry about. The wrong decision, made confidently, repeated across systems that all think they’re right. That’s the problem.
The instinct, when teams see this, is to get everyone in a room and force agreement on what “commitment” really means. That’s a coordination project, and coordination at enterprise scale is roughly impossible. Sales isn’t giving up their working meaning, and they shouldn’t have to.
The win isn’t coordination. It’s cooperation without coordination: a substrate where each function’s meaning is made explicit, the differences are visible, and systems can reason across them without anyone having to agree first. The model doesn’t settle the disagreement. It makes the disagreement legible, so downstream decisions stop pretending it isn’t there.
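To make that concrete, here’s a minimal sketch of what legible disagreement can look like, using Python with rdflib. The namespace and definitions are illustrative, not a real enterprise model; the point is that one label resolves to three distinct, inspectable meanings.

```python
# A minimal sketch with rdflib: one label, three explicit meanings.
# The namespace and definitions are illustrative, not a real model.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDFS, SKOS

EX = Namespace("https://example.org/enterprise/")
g = Graph()
g.bind("skos", SKOS)

# Each function keeps its own concept; nobody has to agree first.
meanings = {
    "sales":   "A verbal agreement to buy, typically near quarter close.",
    "legal":   "A signed obligation enforceable in court.",
    "finance": "Revenue recognizable under specific accounting rules.",
}
for function, definition in meanings.items():
    concept = EX[function + "/Commitment"]
    g.add((concept, RDFS.label, Literal("commitment")))
    g.add((concept, SKOS.definition, Literal(definition)))

# The disagreement is now legible: same label, three distinct meanings.
for concept, _, definition in g.triples((None, SKOS.definition, None)):
    print(concept, "=>", definition)
```

Nothing here forces agreement. Each function keeps its concept, and the difference becomes something a system can query rather than something a meeting has to resolve.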
Face 3
Most enterprise AI today is built on text. Documents, PDFs, emails, unstructured content. Retrieval pulls passages from those sources, the model reads them, and an answer comes back. This works up to a point, but that point is reached quickly in any serious enterprise context.
Text is ambiguous by design. Meaning lives with the reader, not the page. When your agent reads a document about “active accounts,” it inherits whatever implicit assumption the author had in mind. If your finance team and your customer success team write about active accounts differently, your agent will produce answers that depend on which document it happens to retrieve. The answers will sound consistent. They won’t be.
Hallucinations get blamed for AI failures because they’re easy to identify. The agent gives a wrong answer, someone notices, and the issue looks simple. The harder problem is that the reasoning behind the answer is often unclear.
Even when the answer is technically correct, you don’t know which interpretation of which term it relied on. Ask for “Q3 revenue from active accounts,” and your finance system might define active as a paid subscription, while customer success defines it as a recent login. The agent picks one. The number is real. It just isn’t the one you needed. It can’t be audited or reproduced, so it can’t be trusted for anything that matters.
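A toy version of this, in plain Python with made-up records, shows how both numbers can be real while answering different questions:

```python
# A toy illustration of the ambiguity; the account records are made up.
from datetime import date

accounts = [
    {"id": "a1", "paid_subscription": True,  "last_login": date(2024, 4, 2),  "q3_revenue": 120_000},
    {"id": "a2", "paid_subscription": True,  "last_login": date(2024, 9, 15), "q3_revenue": 80_000},
    {"id": "a3", "paid_subscription": False, "last_login": date(2024, 9, 20), "q3_revenue": 15_000},
]

# Finance's definition: active means a paid subscription.
def finance_active(acct):
    return acct["paid_subscription"]

# Customer success's definition: active means a login this quarter.
def cs_active(acct):
    return acct["last_login"] >= date(2024, 7, 1)

print(sum(a["q3_revenue"] for a in accounts if finance_active(a)))  # 200000
print(sum(a["q3_revenue"] for a in accounts if cs_active(a)))       # 95000
```

Both results are defensible. Unless the definition travels with the answer, nobody downstream can tell which question was actually answered.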
A quick note on the current vocabulary. Everyone has a “context layer” or “semantic layer” now. Most of what’s marketed under those terms is a curated view over data, often built on top of a BI tool. That’s useful, but it isn’t infrastructure that encodes meaning across the enterprise.
A view over a clean dataset assumes the meaning has already been settled. The problem is that it often hasn’t, and most context layer products don’t help you settle it. They help you query it after someone else has.
The semantic layer I’m talking about is not a knowledge graph. It includes one, but it goes beyond it. It’s the infrastructure that surrounds and activates the graph. Three things have to be true for it to do real work.
First, it has to connect to live data where the data lives. Not a copy, not a curated subset, but the actual operational systems that run the business. If the semantic layer is detached from the data, it stays a model, and we’re back to Face 1.
Second, it has to explicitly encode definitions and relationships. Not implicitly through dashboards or documentation. Explicitly, in a form that systems can reason over. This is what gives sales, legal, and finance a way to discover that they mean different things by “commitment” before that disagreement turns into a contract dispute.
Third, it has to expose that context to humans and agents through the same interface. Whatever an analyst can ask, an agent should be able to ask. Whatever an agent reasons over, a human should be able to inspect. Shared understanding only works if the understanding is genuinely shared, which means the access surface has to be common.
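Here’s a small sketch of the second and third requirements together, again using Python with rdflib. The graph content is illustrative, and a production semantic layer sits over live systems rather than an in-memory graph, but the shape is the point: the definition is encoded explicitly, and the same query surface serves a human at a console and an agent making a tool call.

```python
# A sketch of one access surface: the same SPARQL query serves a human
# analyst and an agent. The graph content here is illustrative.
from rdflib import Graph

g = Graph()
g.parse(data="""
    @prefix ex:   <https://example.org/enterprise/> .
    @prefix skos: <http://www.w3.org/2004/02/skos/core#> .

    ex:ActiveAccount skos:definition
        "An account with a paid subscription in the current quarter." .
""", format="turtle")

QUERY = """
    PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
    SELECT ?term ?definition
    WHERE { ?term skos:definition ?definition . }
"""

# An analyst runs this in a console; an agent runs it as a tool call.
# Either way, the answer is grounded in the same inspectable triples.
for row in g.query(QUERY):
    print(row.term, "=>", row.definition)
```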
The analogy I keep returning to is renovation versus wiring. A lot of what’s sold as a semantic layer today is renovation. It makes the room look better. The semantic layer I’m describing is the wiring behind the walls. It’s what makes the lights work in every room.
For most of the last two decades, the answer to “how do I build a knowledge graph” was “hire someone who can.” That bottleneck is finally breaking, and not because more people are getting PhDs in description logic.
Automated ontology generation from natural-language descriptions is now a real workflow. AI-assisted mapping between data sources and models has gone from research to practice. Subject matter experts can describe what they know, and tools can scaffold the formal model from there. The work still requires judgment, but it no longer requires a specialist degree to start.
Let’s be direct about what’s happening. Knowledge graph construction is being democratized. A subject matter expert with the right tool can now produce a working ontology, and the discipline isn’t being bypassed; it’s being encoded. Our automated ontology generation workflow builds in standard techniques like competency questions, the structured questions an ontology must be able to answer. Competency questions are the escape hatch from the perfect-modeling trap: instead of asking “what’s the perfect model?” you ask “what must the system answer?” What used to live in an ontologist’s head as methodology now lives in the tool as scaffolding. The bottleneck that kept this technology in the hands of a few is breaking. The rigor isn’t going with it.
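As a sketch of how that scaffolding can work, imagine pairing each competency question with the query that must answer it, and treating the pairs as a test suite for the model. The questions, vocabulary, and data below are illustrative, and rdflib is assumed; the technique is the point, not the terms.

```python
# A sketch of competency questions as a test suite for the model.
# The questions, graph, and queries are all illustrative.
from rdflib import Graph

g = Graph()
g.parse(data="""
    @prefix ex: <https://example.org/enterprise/> .
    ex:acct1 ex:hasCommitment ex:c1 .
    ex:c1 ex:committedBy ex:sales ; ex:amount 50000 .
""", format="turtle")

# "Good enough" means every question can be answered, not "perfect model."
competency_questions = {
    "Which accounts have an open commitment?": """
        PREFIX ex: <https://example.org/enterprise/>
        SELECT ?acct WHERE { ?acct ex:hasCommitment ?c . }
    """,
    "Who made each commitment, and for how much?": """
        PREFIX ex: <https://example.org/enterprise/>
        SELECT ?who ?amount WHERE { ?c ex:committedBy ?who ; ex:amount ?amount . }
    """,
}

for question, query in competency_questions.items():
    status = "PASS" if len(g.query(query)) > 0 else "FAIL"
    print(status, question)
```

When a question fails, you extend the model just far enough to answer it. The model is done when the questions are answered, not when the debate ends.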
The role itself is changing, not just who can do it. The new center of gravity isn’t modeling nouns. It’s modeling situations: how an agent interprets what it sees, when it escalates, what it approves, what it prioritizes. The work isn’t maintaining structure. It’s maintaining coherence between what the business means and what the systems decide. PhD ontologists move into the background, designing schemas and enforcing logical consistency. Domain experts move into the foreground as meaning stewards, working alongside agents to keep the shared reality intact.
The industrial era optimized labor. The information era optimized data. The agentic era, as far as I can tell, will be defined by shared understanding.
In practice, humans and agents reason over the same context. Decisions stay consistent across functions because the underlying definitions are consistent. AI systems can explain their answers because those answers are grounded in something that can be inspected.
The organizations that build this cognitive backbone now will set the terms for what comes next. The ones that don’t will keep rebuilding the same diagrams under new names, every few years, with a new vocabulary and the same gap.
25 years in, the lesson fits in one line. The point was never the graph. It was shared meaning. The graph is one way to encode it, and the encoding only matters if it connects to the business.
This is what we’ve been building toward at Stardog. A semantic AI platform that unifies enterprise data where it lives and adds the context AI needs to understand it, without locking the organization into a proprietary environment. The relevant point here is that the architecture exists. The constraint is no longer technical.
Here’s the reality check. Most organizations still can’t connect their data or align their definitions on the basics. There’s a lot of talk about fully autonomous agents making consequential decisions on their own, without humans in the loop. We’re trying to go to Mars before we’ve landed on the moon. The semantic layer is the moon. It’s the infrastructure that lets humans and agents work over the same shared context. Land there first.
Because the organizations that solve the meaning problem first are the ones that will get to define what comes next.