Ed Navas - Founder, Three Cliffs AI
Last month, we introduced Smart Search, a new feature on the Three Cliffs AI platform that's set to transform how legal professionals navigate case documents. The response has been phenomenal, and we're thrilled to hear about the diverse success stories from our customers.
One question that has come up frequently is, "Why didn’t you use RAG (Retrieval-Augmented Generation) for your search functionality?" It’s a great question and one worth exploring.
(If you're not familiar with RAG, don't worry—this isn't a deep dive into technical jargon. Simply put, RAG is a method that applies large language models (LLMs) to specific data, like your internal case documents. You can ask questions, and it will generate a response based on those documents, using its training to understand the content and context.)
At Three Cliffs, our focus has always been on solving real user problems and making thoughtful technology choices. Through our conversations with litigation attorneys and support staff, we identified three key needs:
Supporting precise, targeted searches
Ensuring complete and accurate information retrieval
Preserving efficient, familiar workflows
It became clear that while RAG is a popular choice, it wouldn't fully meet the needs of the community we serve.
In this post, I'll walk you through our decision-making process and explain why we chose hybrid semantic search over RAG. We’ll delve into what RAG is, its benefits and limitations in legal search, and the unique advantages of a hybrid approach. Ultimately, this decision shapes the future of Three Cliffs AI and the broader legal tech landscape.
"One of the most common mistakes I see companies make is to jump straight into building an AI system without first figuring out what problem they're trying to solve".
— Andrew Ng
Understanding RAG - The AI Darling
Before we dive into why we chose hybrid semantic search, it's crucial to understand what RAG is and why it has become such a hot topic in the tech world (feel free to skip to the next section if you’re familiar with RAG).
What is Retrieval-Augmented Generation (RAG)?
RAG, or Retrieval-Augmented Generation, is a technique that combines the power of large language models (like GPT) with information retrieval systems. Here's how it works in simple terms:
Retrieval: When a user inputs a query, RAG first searches through a large database of information to find relevant documents or passages.
Augmentation: The retrieved information is then used to "augment" or enhance the AI's knowledge base for this specific query.
Generation: Finally, the AI generates a response based on both its pre-trained knowledge and the newly retrieved information.
Think of RAG as a super-smart assistant that can quickly look up relevant information and then craft a response that seamlessly integrates this new knowledge with its existing understanding.
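To make those three steps concrete, here's a minimal sketch of a RAG loop in Python. Everything here is a stand-in: the toy bag-of-words retriever substitutes for a real embedding model and vector database, and the final LLM call is stubbed out to show what the model would see.

```python
# Minimal RAG loop: retrieve -> augment -> generate.
# Illustrative only; a production system would use embedding vectors,
# a vector database, and a real LLM API in place of these stand-ins.
import math
from collections import Counter

DOCUMENTS = [
    "Deposition of Dr. Smith, taken March 3, 2021.",
    "Exhibit 14: radiology report referencing the acronym MRA.",
    "Order of Judge Alvarez denying the motion to compel.",
]

def vectorize(text: str) -> Counter:
    """Toy stand-in for an embedding model: bag-of-words counts."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query: str, k: int = 2) -> list[str]:
    """Step 1 -- Retrieval: rank stored chunks by similarity to the query."""
    q = vectorize(query)
    return sorted(DOCUMENTS, key=lambda d: cosine(q, vectorize(d)), reverse=True)[:k]

def answer(query: str) -> str:
    """Steps 2 and 3 -- Augmentation and Generation."""
    context = "\n".join(retrieve(query))  # augment the prompt with retrieved chunks
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
    return prompt  # a real system would send this prompt to an LLM here

print(answer("Which judge denied the motion to compel?"))
```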
The Allure of RAG: Why It's Captivating the Tech World
RAG has become incredibly popular in recent years, and for good reason:
Dynamic Knowledge Base: Unlike traditional AI models that are limited to the data they were trained on, RAG can access and utilize up-to-date information, making it more flexible and current.
Improved Accuracy: By grounding its responses in retrieved information, RAG can potentially provide more accurate and relevant answers compared to pure language models.
Transparency: RAG systems can often provide citations or references to the sources of information used in generating a response, increasing trust and verifiability.
Customizability: Organizations can use their own databases with RAG, allowing for highly specialized and proprietary knowledge bases.
Natural Language Interaction: RAG enables more natural, conversational interactions with information systems, potentially making complex data more accessible to non-technical users.
Reduced Hallucination: By anchoring responses in retrieved information, RAG aims to reduce the problem of AI "hallucination" - the generation of plausible-sounding but incorrect information.
The potential applications of RAG are vast, from powering more intelligent search engines and chatbots to assisting in research and data analysis across various fields. In the legal tech world, the promise of RAG to quickly sift through vast amounts of case law, statutes, and legal documents and provide coherent, relevant responses is undeniably attractive.
However, as we'll explore in the next section, the very features that make RAG so appealing in many contexts also present significant challenges in certain legal search contexts.
Supporting ‘Small Searches’
As mentioned in the primer above, RAG can feel like magic. It's a sophisticated tool that uses complex algorithms to match your search query with stored chunks of information, and then runs these matches through a language model to generate a coherent response. But this very process—where RAG excels in finding patterns and creating context—can also be its downfall.
Imagine you're searching for something extremely specific: the name of a judge, a Bates number from a set of documents, or a medical acronym from a malpractice case. These are what we call 'small searches,' or low context searches. Here, the goal isn't to generate a well-rounded response or create a narrative but simply to locate a piece of information—quickly and accurately.
RAG, however, struggles with these small searches. Why? Because its strength lies in finding broader patterns and constructing responses based on them. When tasked with finding a single word or phrase, RAG may miss the mark because it’s not designed to zero in on isolated terms with minimal context. Instead, it attempts to construct meaning around the term, which can dilute the accuracy of the result.
Think of RAG as a master storyteller. It’s fantastic at weaving together threads of information to create a compelling narrative. But if you ask it for a specific fact, like a date or a name, it might give you a beautifully told story—but with the wrong details. In contrast, a hybrid search system is more like a skilled archivist—capable of understanding long and complex searches while also pulling the exact information you need from across a set of documents without embellishment.
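To make the archivist analogy concrete, here's a rough sketch of how a hybrid scorer can blend an exact keyword signal with a semantic one. The weights and scoring below are illustrative, not our production ranking: the point is that an exact identifier like a Bates number still ranks highly even when it carries almost no semantic context.

```python
# Hybrid scoring sketch for a 'small search' such as a Bates number.
# Illustrative weights and scoring, not our production ranking logic.

def keyword_score(query: str, doc: str) -> float:
    """Exact lexical signal: 1.0 if every query token appears verbatim."""
    tokens = doc.lower().split()
    return float(all(t in tokens for t in query.lower().split()))

def hybrid_score(query: str, doc: str, semantic: float, alpha: float = 0.5) -> float:
    # A purely semantic retriever relies on `semantic` alone and can rank
    # the exact match poorly; the keyword component rescues it.
    return alpha * keyword_score(query, doc) + (1 - alpha) * semantic

# The identifier matches verbatim, so the page ranks well even though the
# embedding sees almost no contextual signal (semantic = 0.1).
print(hybrid_score("SMITH-004172", "Produced as SMITH-004172 on 4/1/21.", semantic=0.1))
```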
While our customers often need to conduct complex, content-rich searches, they also require the ability to make these ‘small searches.’ They need to find that one word or number hidden in a sea of data without the risk of RAG’s creative process altering or omitting critical information. This was a significant factor in our decision not to implement RAG, as it simply wasn’t reliable enough for how customers intended to interact with their documents.
Information Completeness
For our customers, confidence in the completeness of returned information is non-negotiable. They need to trust that when they conduct a search, every relevant piece of information is surfaced—not just the highlights. It's not enough for a system to return some of what they're looking for if it risks missing other critical details. Our research revealed that users are more willing to sift through some irrelevant information if it means ensuring nothing important is overlooked. In technical terms, they prioritize recall (how much of the relevant information is found) over precision (how much of what is returned is actually relevant).
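A quick worked example of that trade-off, with made-up numbers:

```python
# Recall vs. precision, with made-up numbers. Suppose the case file holds
# 10 passages truly relevant to a fact, and a search returns 8 results,
# 6 of which are relevant.
relevant_in_corpus = 10
returned = 8
returned_and_relevant = 6

recall = returned_and_relevant / relevant_in_corpus  # 0.60 -- 4 relevant passages missed
precision = returned_and_relevant / returned         # 0.75 -- 2 returned results are noise

print(f"recall={recall:.2f}, precision={precision:.2f}")
# Our users would rather tolerate the 2 noisy results than miss the 4 passages.
```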
RAG (Retrieval-Augmented Generation) presents a fundamental challenge in this regard. The process works by identifying the most relevant ‘chunks’ of data from a stored dataset and then generating an answer based on those selected chunks. While this can be effective, the reliability of the results hinges on several factors: the chunking strategy (how data is divided and stored), the definition of similarity between chunks, and how well the system anticipates user search behavior (Ben Hoyle has a good write-up on this topic).
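Here's a simplified illustration of why the chunking strategy matters for recall. Real systems use overlap and smarter boundaries, but the failure mode is the same: a fixed-size splitter can cut a key fact in two so that neither chunk matches the query strongly.

```python
# Naive fixed-size chunking can split a fact across a boundary.
def chunk(text: str, size: int = 8) -> list[str]:
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

passage = ("The witness testified that the defendant signed "
           "the indemnity agreement on June 4, 2019.")
for i, piece in enumerate(chunk(passage)):
    print(i, "|", piece)
# 0 | The witness testified that the defendant signed the
# 1 | indemnity agreement on June 4, 2019.
# The signing lands in chunk 0 while the agreement name and date land in
# chunk 1, so a query like "when was the indemnity agreement signed" may
# not retrieve either chunk with high confidence.
```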
Even with a well-considered chunking strategy and optimized search processes, this approach has inherent limitations. Poor recall—where crucial information is missed—remains a significant risk, and while RAG can reduce hallucinations, it by no means eliminates them. The system can still generate content that appears plausible but is not grounded in the actual data. The magnitude of these shortcomings is now well documented: a Stanford study found legal RAG tools hallucinate roughly one in six times. Isha Marate's recent article, 'RAG Is Far From Magic and Prone to Hallucinations,' does a nice job covering this.
Consider a scenario where an attorney is preparing a motion for summary judgment and needs to identify all relevant deposition testimony related to a specific fact in the case. If the AI-generated search response inadvertently omits key pieces of testimony or, worse, introduces fabricated details that were never part of the original depositions, the consequences could be severe. Critical facts could be missed, weakening the argument, or misleading information could be included, potentially jeopardizing the case. In high-stakes litigation, such oversights are not merely inconvenient—they can be disastrous.
The good news is that advancements are being made, such as reranking retrieved chunks and introducing additional retrieval mechanisms, but these are incremental improvements that reduce rather than eliminate the risk. The core issue remains: RAG's dependencies and its generative nature make it inherently prone to overlooking crucial information, which is simply unacceptable for our use case.
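For the curious, here is roughly what reranking looks like and why it only mitigates the problem: a second-pass scorer can reorder the retriever's candidates, but it can never promote a chunk the first pass failed to retrieve. The token-overlap scorer below is a toy stand-in for the cross-encoder model a real system would use.

```python
# Reranking sketch: reorder the first-pass candidates with a better scorer.
# A chunk the retriever missed is simply not in `candidates`, so reranking
# cannot recover it -- recall is capped by the first pass.

def token_overlap(query: str, text: str) -> float:
    """Toy second-pass scorer; real systems use a cross-encoder model."""
    q, t = set(query.lower().split()), set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0

def rerank(query: str, candidates: list[str]) -> list[str]:
    return sorted(candidates, key=lambda c: token_overlap(query, c), reverse=True)

print(rerank("motion to compel ruling",
             ["Notice of deposition of Dr. Smith.",
              "Order denying the motion to compel."]))
```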
With hybrid semantic search, we’re still tapping into AI to understand your input, but we go a step further. Instead of relying on predetermined top matches, we ensure that every relevant piece of information has the chance to surface. When you use 'Smart Search,' you'll see a ranked list of relevant snippets from your case documents, complete with page and line citations and quick links to the original sources. It’s designed to feel as intuitive and seamless as using Google search.
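To give a feel for the difference, here is the rough shape of a result as described above. The field names are illustrative, not our actual API: the key property is that the snippet is verbatim source text with a citation, never AI-generated prose.

```python
# Rough shape of a hybrid-search hit (field names are illustrative).
from dataclasses import dataclass

@dataclass
class SearchHit:
    snippet: str    # verbatim text from the document, never paraphrased
    document: str   # source file, linked back to the original
    page: int       # page citation
    line: int       # line citation
    score: float    # relevance, used only to order the list

hit = SearchHit(
    snippet="Q. Did you review the radiology report? A. I did, on June 4.",
    document="Smith_Deposition_Vol2.pdf",
    page=112,
    line=14,
    score=0.91,
)
print(f"{hit.document} p.{hit.page}:{hit.line} -> {hit.snippet}")
```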
Supporting Workflows
In legal technology, user experience is crucial. Many RAG tools use a chatbot interface, where AI-generated responses drive the interaction. While this approach has benefits, including the addition of citations for credibility, it changes the user's role from information seeker to information validator.
Moreover, the chatbot interface itself represents a dramatic shift in user experience, especially in the legal field. Law, as a profession, tends to be more conservative in adopting new technologies. Many attorneys are accustomed to traditional search interfaces and document management systems. Introducing a chatbot - a conversational AI - into their workflow is not just a small step, but a leap into a radically different way of interacting with information.
The power of these models is clear, but the need to double-check each response, when the goal is simply to locate information, often hinders rather than helps workflow. The unfamiliarity of a chat interface compounds this issue, potentially slowing down processes that have been refined over years of practice.
"The problem with AI isn't just that it can be wrong, but that it can be wrong in ways that are incredibly persuasive."
— Jaron Lanier
In our conversations, we also noticed hesitancy around something we're calling "AI deference," where legal professionals were concerned they might overly trust AI outputs (a concern that is not unfounded, and one we'll save for a future post). These outputs can seem plausible and well-reasoned, but may miss crucial details or even introduce errors (hallucinations). This isn't just about accuracy; it's about maintaining user autonomy and preventing a cycle of trust and verification that complicates the work process.
Legal professionals are used to high standards of evidence and precision. The idea of an AI system potentially misinterpreting or overlooking key information understandably makes them uneasy. It's not about resisting change, but about meeting users where they are and ensuring their work remains accurate and efficient.
We challenged ourselves to create an experience that used AI at its core, but felt more familiar to use. We focused on enhancing user control, leveraging LLM capabilities without disrupting established workflows. Hybrid Semantic Search provides direct access to relevant information, ranked by relevance, not reinterpreted by AI. This approach empowers our users to build their cases confidently, using clear, unaltered data. It respects both the legal process and the user's expertise, reinforcing their role as the final decision-maker in their work.
Importantly, we designed our interface to feel familiar and intuitive to legal professionals, bridging the gap between advanced AI capabilities and the tried-and-true methods that lawyers trust. This balance allows for the adoption of new technology without the jarring transition to an entirely new way of working.
Wrapping Up
We wanted to share our journey and thought process with you, hoping it sheds some light on why we made the choices we did. Smart Search was built with user needs in mind—focusing on accuracy, completeness, and keeping things familiar. While RAG is an exciting and evolving technology, it just didn’t meet the specific demands of the precise, detail-oriented work you do.
Our goal is to make legal workflows smoother, not more complicated. That's always the approach we try to take, whether it's with Smart Search or our other features like interactive document summaries. If you're interested in seeing how these tools can make a difference, we'd love for you to check out our platform or reach out.