Using RAG for Question Answering in Regulated Industries is a Bad Idea
Using RAG as the sole basis of serious GenAI apps that run inside enterprises in regulated industries is a very, very dumb idea. Maybe it’s impolite to use the word “dumb” but I mean it quite seriously. Also, for the record, ideas are dumb; people are un- or misinformed but never dumb. And since I’m defining terms:
There are lots of fun RAG uses in B2C:
But I care about B2B and, more specifically, the impact and use of GenAI and KG on heavily regulated industries and companies with $1B or more in revenue in financial services, life sciences, and manufacturing.
TLDR: The current rage for RAG in LLMs and GenAI is a very dumb idea for regulated enterprises. DO NOT DO THIS.
There are three reasons why RAG is a dumb idea in regulated industries.
In every endeavor, regulated, commercial, or otherwise, we use data to manage or eliminate uncertainty. Our uses of data ought not increase net uncertainty, and that’s the first reason not to RAG in regulated industries.
Regulated RAG means a pernicious choice for users and the org itself:
Either you suffer the problems of subtle hallucinations misleading people in high-stakes contexts, or you do that for a while with the result that you’ve systematically undercut the epistemic value of GenAI by creating nearly undetectable confabulations, thereby increasing net uncertainty about the epistemic value of a strategic investment in GenAI.
The technical term for this situation is an unholy shit show. Hey, I just work here, I don’t coin the technical terms! The real problem is threefold:
On balance it would not be prudent to use GenAI-from-RAG in these environments. And about prudential standards…
What’s the big deal, right? So occasionally the GenAI box of magic beans gets an answer wrong. People get answers wrong often. Yes, but in a regulated industry the Sovereign obligates people to exhibit standards of care. That is, regulated industries are ones in which the Sovereign demands people do their very best to get answers right, or be able to show they’ve taken all reasonable precautions to get the answer right.
That is, there has to be a showing that the regulated biz spent $$ to mitigate downside risk.
The result is that most regulated industries overspend on getting things right. They’re not required to overspend, but they do because of (1) uncertainty, (2) complexity, and (3) externally imposed high standards of care. For example, in a globally significant bank, the regulated portion overspends to prudently follow standards of care, and that effectively bids up the price of everything the bank does. All of which is horribly unproductive.
RAG makes it all worse. If the box of magic beans that’s intended to make all of this cheaper and simultaneously better sometimes just randomly makes up very plausible nonsense, and you know that this happens but not when, then it’s not clear you’ve met standards of care by using the box of magic beans. Nor is it clear where you’ve gotten things wrong because of it, and that tends to lead to bad outcomes when the Sovereign’s agents come to hold you to account for what you’ve done to satisfy their requirements.
It’s easy to be cynical about regulation. But forget politics and the Sovereign for a minute. What if you’re a drug scientist and you just really want to help children who have eye cancer not… to have eye cancer? A more laudable goal I cannot imagine and I am down to help them eradicate childhood eye cancer! Or maybe the question is about whether you should pull that jet engine now or after the next trip and then that unseasonably aggressive jet stream causes the metal fatigue to…you get the idea.
That is, what if you have the courage in our strange world to want to get the answer right because it matters? RAG is bad because regulated industries are the ones where getting it right really matters to human flourishing and all of us non-sociopaths should care about that. A lot.
Okay so what’s the alternative?
So let’s push GenAI out of the enterprise? No way, I really believe in the power of this stuff, but like all boxes of magic beans you’ve got to be smart about how you use them.
We can speculate about why RAG is all the rage (IMO, it’s one part A16Z being wrong early on and cementing RAG and vector databases in the GenAI reference architecture, and two parts “the relational data model is still dominant but very wrong for GenAI”), but happily there is a perfectly sane alternative to RAG in regulated industries.
It’s called Semantic Parsing: rather than trusting anything an LLM says, or showing a user anything an LLM says directly, you center a Knowledge Graph and use an LLM to (1) figure out what the user is asking of the data; (2) turn that algorithmically derived expression of human intent into a structured query against the KG; (3) take the KG’s answer, which is based on (a) known, (b) trusted, (c) lineaged, (d) timely data sources, and decorate or embellish it with other material; and, finally, (4) put this dialogue into a big context window with robust working memory so that the user has context-rich interactions with database-resident enterprise data.
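The four steps above can be sketched in a few lines of Python. Everything here is an assumption for illustration: `parse_intent` stands in for the LLM call (stubbed with keyword matching so the sketch actually runs), and the “knowledge graph” is a tiny in-memory triple map rather than a real graph database.

```python
# Hypothetical Semantic Parsing loop. The KG holds trusted, lineaged
# enterprise facts; the LLM's only job is translating intent to a query.
KG = {
    ("AcmeCorp", "riskRating"): "BBB",
    ("AcmeCorp", "regulator"): "SEC",
}

def parse_intent(user_input: str) -> dict:
    """Steps 1-2: an LLM would map free text to a structured query.
    Stubbed with keyword matching to keep the sketch self-contained."""
    if "risk" in user_input.lower():
        return {"subject": "AcmeCorp", "predicate": "riskRating"}
    return {}

def query_kg(query: dict):
    """Step 3: the answer comes from the KG, never from the LLM itself."""
    if not query:
        return None
    return KG.get((query["subject"], query["predicate"]))

def answer(user_input: str, history: list) -> str:
    query = parse_intent(user_input)
    fact = query_kg(query)
    if fact is None:
        reply = "I could not map that question to the data."
    else:
        reply = f"{query['subject']} {query['predicate']} = {fact}"
    history.append((user_input, reply))  # step 4: working memory / context
    return reply
```

The essential property: the reply shown to the user is assembled from KG facts, so the LLM never gets the opportunity to invent an answer; at worst it fails to produce a query at all.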
That’s just better in every way including in the failure mode. You see “standards of care” is really all about managing the failure mode. Let’s compare RAG’s failure mode to SP’s:
As explained above, if the hallucinations were grotesque this failure mode would be annoying, but it wouldn’t be pernicious. “Oh look,” the user would think, “the box of magic beans is being incredibly stupid again, let’s just agree to ignore it for a minute.”
SP fails at step #1: either the LLM cannot turn user input into a structured query, in which case nothing bad happens, or the LLM turns user input into the wrong query relative to the person’s intent, but the answer to that ‘wrong’ query is still correct with respect to the data. That failure mode is exactly the same as what happens every day when some BI tool or dashboard app gives the right answer to the wrong query. In short, someone notices and fixes it.
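One reason this failure mode is manageable can be shown in a short sketch: the structured query is an inspectable artifact that can be validated against the KG’s schema and logged for audit before it ever executes, which is exactly what you cannot do with free-form RAG output. The schema and predicate names below are assumptions for illustration.

```python
# Hypothetical guardrail: reject any generated query that falls outside
# the KG schema, so a bad LLM translation fails loudly and auditably.
ALLOWED_PREDICATES = {"riskRating", "regulator"}  # assumed KG schema

def validate_query(query: dict) -> bool:
    """True only for non-empty queries whose predicate the KG knows."""
    return bool(query) and query.get("predicate") in ALLOWED_PREDICATES
```

A rejected query can be logged and shown to the user as “I couldn’t answer that,” which is annoying but never a confabulation.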
Semantic Parsing dominates RAG in the GenAI solution space inside regulated industries in the game-theoretic sense: it is always a better choice in every game. The smart money (Databricks, Snowflake, Salesforce, Microsoft, etc.) is on SP and is only using RAG for low-stakes contexts (customer support of various types) where an occasional confabulation is kinda expected or banal.
Be smart and place your GenAI regulated industry bets against RAG and for Semantic Parsing.