Build a Self-Correcting RAG (CRAG) Agent: Python & LangGraph Tutorial

Static RAG is brittle. Learn how to build a Corrective RAG (CRAG) pipeline that detects hallucinations and fixes them using SearchCans real-time search.

Standard RAG (Retrieval-Augmented Generation) has a fatal flaw: it blindly trusts the retrieval step.

If your vector database returns outdated documents—or worse, irrelevant ones—your LLM will hallucinate a confident answer based on bad data.

Enter Corrective RAG (CRAG) and Self-RAG.

These advanced architectures introduce a “Self-Correction” loop. The system evaluates the quality of retrieved documents. If they are deemed “Ambiguous” or “Incorrect,” the agent automatically triggers a web search to find the truth.

In this guide, we will build the “Web Search Node” for a CRAG pipeline using SearchCans.

The CRAG Architecture: Traffic Lights for Data

Conceptually, CRAG acts like a traffic light for your data retrieval:

  1. Green (Correct): The vector DB documents are relevant. Generate answer.
  2. Red (Incorrect): The documents are wrong. Discard them and search the web.
  3. Amber (Ambiguous): The documents are vague. Combine them with a web search for clarity.
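
The routing decision itself is tiny. Here is a minimal sketch of the traffic-light logic (the function and node names are illustrative, not part of any library):

```python
def route_after_grading(grade: str) -> str:
    """Map a grader verdict to the next pipeline action."""
    routes = {
        "correct": "generate",         # Green: answer from vector DB docs
        "incorrect": "web_search",     # Red: discard docs, search the web
        "ambiguous": "hybrid_search",  # Amber: combine docs with web results
    }
    # Unknown verdicts fall back to the cautious path
    return routes.get(grade, "hybrid_search")

print(route_after_grading("incorrect"))  # web_search
```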

The Role of SearchCans

In the “Red” and “Amber” states, your agent needs to leave the internal database and check the open internet.

For a production agent that might loop and retry 10 times per query, you need an API that is:

Cost-Effective

SearchCans is $0.56/1k (vs $15+ for others).

Deeply Grounded

We don’t just return snippets. Our Reader API fetches the full markdown of the source page.

Implementation: The “Web Search” Node

Let’s implement the search component for a LangGraph workflow.

import requests

class SearchCansRetriever:
    """
    A robust Web Search Node for CRAG/Self-RAG pipelines.
    """
    def __init__(self, api_key):
        self.api_key = api_key
        self.search_endpoint = "https://www.searchcans.com/api/search"
        self.reader_endpoint = "https://www.searchcans.com/api/url"
        self.headers = {"Authorization": f"Bearer {self.api_key}"}

    def corrective_search(self, query: str):
        print(f"CRAG Triggered: Searching web for '{query}'...")
        
        # Step 1: Search Google for fresh links
        search_params = {
            "q": query,
            "engine": "google",
            "num": 3  # Get top 3 candidates
        }
        
        try:
            resp = requests.get(
                self.search_endpoint,
                headers=self.headers,
                params=search_params,
                timeout=30,  # Never let the agent loop hang on a dead request
            )
            resp.raise_for_status()
            results = resp.json().get("organic_results", [])
            
            if not results:
                return "No relevant web results found."
            
            # Step 2: "Deep Read" the top result to ground the answer
            top_link = results[0]["link"]
            return self._read_content(top_link)
            
        except requests.RequestException as e:
            return f"Search failed: {e}"

    def _read_content(self, url):
        print(f"Reading source: {url}...")
        # Use Reader API to get Clean Markdown
        read_params = {
            "url": url,
            "b": "true",  # Use headless browser for dynamic sites
            "w": 2000     # Wait for content to hydrate
        }
        
        resp = requests.get(
            self.reader_endpoint, headers=self.headers, params=read_params, timeout=60
        )
        resp.raise_for_status()
        data = resp.json()
        
        # Prefer Markdown for LLM context window efficiency
        content = data.get("markdown", "") or data.get("text", "")
        return f"WEB CONTEXT FROM {url}:\n{content[:5000]}"

Integrating with LangGraph

In a standard LangGraph setup, you would add this as a node. When the “Grader” node determines that internal documents are insufficient, it routes the state to this web_search node.

# Pseudo-code for LangGraph integration
def web_search_node(state):
    question = state["question"]
    search_tool = SearchCansRetriever(api_key="YOUR_KEY")
    
    # Perform corrective search
    web_context = search_tool.corrective_search(question)
    
    # Update state with new, grounded knowledge
    return {"documents": [web_context], "question": question}

# Add to graph
workflow.add_node("web_search", web_search_node)
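
The grader's verdict is what steers the graph. A minimal router function for LangGraph's conditional edges might look like this (the state keys and node names are assumptions matching the pseudo-code above, not a fixed LangGraph contract):

```python
def decide_next_node(state: dict) -> str:
    """Router for conditional edges: pick the next node from the grade."""
    grade = state.get("grade", "ambiguous")
    if grade == "correct":
        return "generate"
    # 'incorrect' and 'ambiguous' both need fresh web data first
    return "web_search"

# Wired into the graph roughly like this, assuming a StateGraph named
# `workflow` with a 'grade_documents' node already added:
# workflow.add_conditional_edges(
#     "grade_documents",
#     decide_next_node,
#     {"generate": "generate", "web_search": "web_search"},
# )

print(decide_next_node({"grade": "correct"}))  # generate
```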

Building a Document Grader

The grader evaluates document relevance using an LLM:

from openai import OpenAI

client = OpenAI(api_key="YOUR_OPENAI_KEY")

def grade_documents(question: str, documents: list) -> str:
    """
    Grades retrieved documents for relevance.
    Returns: 'correct', 'incorrect', or 'ambiguous'
    """
    doc_text = "\n\n".join(documents)
    
    prompt = f"""
    Question: {question}
    
    Retrieved Documents:
    {doc_text}
    
    Are these documents relevant and sufficient to answer the question?
    Respond with ONLY: 'correct', 'incorrect', or 'ambiguous'
    """
    
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}]
    )
    
    return response.choices[0].message.content.strip().lower()
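
In practice the model sometimes replies with extra words ("These documents are ambiguous."), so it is worth normalizing the raw reply before routing on it. A small helper (the name is ours, not an OpenAI API):

```python
def normalize_grade(raw_reply: str) -> str:
    """Coerce a free-form LLM reply into one of the three CRAG verdicts."""
    reply = raw_reply.strip().lower()
    # Check 'incorrect' before 'correct', since the latter is a substring
    for label in ("incorrect", "ambiguous", "correct"):
        if label in reply:
            return label
    # Anything unrecognizable is treated as ambiguous: safer to supplement
    # with a web search than to trust unknown output.
    return "ambiguous"

print(normalize_grade("These documents are Incorrect."))  # incorrect
```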

Complete CRAG Flow

Here’s how the complete system works:

def crag_pipeline(question: str):
    # Step 1: Retrieve from vector DB
    vector_docs = vector_db.retrieve(question)
    
    # Step 2: Grade the documents
    grade = grade_documents(question, vector_docs)
    
    # Step 3: Route based on grade
    if grade == "correct":
        context = "\n\n".join(vector_docs)
    elif grade == "incorrect":
        # Discard vector results, search web
        searcher = SearchCansRetriever(api_key="YOUR_KEY")
        context = searcher.corrective_search(question)
    else:  # ambiguous
        # Combine both sources
        searcher = SearchCansRetriever(api_key="YOUR_KEY")
        web_context = searcher.corrective_search(question)
        context = f"Internal:\n{vector_docs[0]}\n\nWeb:\n{web_context}"
    
    # Step 4: Generate final answer
    return generate_answer(question, context)

Why This Matters for “Self-Correction”

Self-RAG agents are designed to be autonomous. They critique their own outputs and iterate.

If your agent decides its answer is hallucinated, it must have a reliable way to get fresh data.

Without SearchCans

The agent is stuck with its internal training data (which caused the hallucination).

With SearchCans

The agent has a real-time “lifeline” to verify facts against the live internet.

Conclusion

Building a “Self-Correcting” agent is the hallmark of a senior AI engineer. It moves your system from a fun toy to a reliable enterprise tool.

By integrating SearchCans, you provide the affordable, high-speed infrastructure needed to support the multiple search-and-verify loops that CRAG architectures require.



Sarah Wang

AI Integration Specialist

Seattle, WA

Software engineer with focus on LLM integration and AI applications. 6+ years experience building AI-powered products and developer tools.

AI/ML · LLM Integration · RAG Systems
