Standard RAG (Retrieval-Augmented Generation) has a fatal flaw: it blindly trusts the retrieval step.
If your vector database returns outdated documents—or worse, irrelevant ones—your LLM will hallucinate a confident answer based on bad data.
Enter Corrective RAG (CRAG) and Self-RAG.
These advanced architectures introduce a “Self-Correction” loop. The system evaluates the quality of retrieved documents. If they are deemed “Ambiguous” or “Incorrect,” the agent automatically triggers a web search to find the truth.
In this guide, we will build the “Web Search Node” for a CRAG pipeline using SearchCans.
The CRAG Architecture: Traffic Lights for Data
Conceptually, CRAG acts like a traffic light for your data retrieval (sketched in code just after this list):
- Green (Correct): The vector DB documents are relevant. Generate answer.
- Red (Incorrect): The documents are wrong. Discard them and search the web.
- Amber (Ambiguous): The documents are vague. Combine them with a web search for clarity.
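In code, the three states map naturally to a small routing function. Here is a minimal sketch; the `RetrievalGrade` type and `route` helper are illustrative names of our own, not part of any library:

    from typing import Literal

    # The three CRAG "traffic light" states (illustrative names)
    RetrievalGrade = Literal["correct", "incorrect", "ambiguous"]

    def route(grade: RetrievalGrade) -> str:
        # Green: answer from the vector DB alone
        if grade == "correct":
            return "generate"
        # Red: discard internal docs and go to the web
        if grade == "incorrect":
            return "web_search"
        # Amber: augment internal docs with a web search
        return "web_search_and_merge"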
The Role of SearchCans
In the “Red” and “Amber” states, your agent needs to leave the internal database and check the open internet.
For a production agent that might loop and retry 10 times per query, you need an API that is:
- Cost-effective: SearchCans is $0.56/1k (vs. $15+ for others).
- Deeply grounded: we don't just return snippets; our Reader API fetches the full markdown of the source page.
Implementation: The “Web Search” Node
Let’s implement the search component for a LangGraph workflow.
import requests

class SearchCansRetriever:
    """
    A robust Web Search Node for CRAG/Self-RAG pipelines.
    """

    def __init__(self, api_key):
        self.api_key = api_key
        self.search_endpoint = "https://www.searchcans.com/api/search"
        self.reader_endpoint = "https://www.searchcans.com/api/url"
        self.headers = {"Authorization": f"Bearer {self.api_key}"}

    def corrective_search(self, query: str):
        print(f"CRAG triggered: searching web for '{query}'...")

        # Step 1: Search Google for fresh links
        search_params = {
            "q": query,
            "engine": "google",
            "num": 3  # Get top 3 candidates
        }

        try:
            resp = requests.get(
                self.search_endpoint,
                headers=self.headers,
                params=search_params,
                timeout=30
            )
            resp.raise_for_status()
            results = resp.json().get("organic_results", [])

            if not results:
                return "No relevant web results found."

            # Step 2: "Deep read" the top result to ground the answer
            top_link = results[0]["link"]
            return self._read_content(top_link)
        except Exception as e:
            return f"Search failed: {str(e)}"

    def _read_content(self, url):
        print(f"Reading source: {url}...")

        # Use the Reader API to get clean markdown
        read_params = {
            "url": url,
            "b": "true",  # Use headless browser for dynamic sites
            "w": 2000     # Wait for content to hydrate
        }

        resp = requests.get(
            self.reader_endpoint,
            headers=self.headers,
            params=read_params,
            timeout=60
        )
        data = resp.json()

        # Prefer markdown for LLM context-window efficiency
        content = data.get("markdown", "") or data.get("text", "")
        return f"WEB CONTEXT FROM {url}:\n{content[:5000]}"
Integrating with LangGraph
In a standard LangGraph setup, you would add this as a node. When the “Grader” node determines that internal documents are insufficient, it routes the state to this web_search node.
# Pseudo-code for LangGraph integration
def web_search_node(state):
    question = state["question"]
    search_tool = SearchCansRetriever(api_key="YOUR_KEY")

    # Perform the corrective search
    web_context = search_tool.corrective_search(question)

    # Update state with new, grounded knowledge
    return {"documents": [web_context], "question": question}

# Add to graph
workflow.add_node("web_search", web_search_node)
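The routing itself is wired with LangGraph's conditional edges. A hedged sketch, assuming a grader node named grade_documents that writes a "grade" key onto the state (both names are illustrative, not a prescribed API surface):

    # Sketch: route from the grader to web search or generation
    def decide_to_generate(state):
        # "grade" is assumed to be set on the state by the grader node
        if state["grade"] == "correct":
            return "generate"
        return "web_search"

    workflow.add_conditional_edges(
        "grade_documents",    # source node (illustrative name)
        decide_to_generate,   # routing function
        {"generate": "generate", "web_search": "web_search"},
    )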
Building a Document Grader
The grader evaluates document relevance using an LLM:
from openai import OpenAI

client = OpenAI(api_key="YOUR_OPENAI_KEY")

def grade_documents(question: str, documents: list) -> str:
    """
    Grades retrieved documents for relevance.

    Returns: 'correct', 'incorrect', or 'ambiguous'
    """
    doc_text = "\n\n".join(documents)

    prompt = f"""
Question: {question}

Retrieved Documents:
{doc_text}

Are these documents relevant and sufficient to answer the question?
Respond with ONLY: 'correct', 'incorrect', or 'ambiguous'
"""

    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}]
    )
    return response.choices[0].message.content.strip().lower()
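In practice, the model occasionally returns something outside the three labels, so it is worth normalizing the output before routing on it. A minimal guard (falling back to "ambiguous" is our own convention here, not part of the CRAG paper):

    def safe_grade(question: str, documents: list) -> str:
        # Fall back to 'ambiguous' (web + internal) if the LLM drifts off-label
        grade = grade_documents(question, documents)
        return grade if grade in {"correct", "incorrect", "ambiguous"} else "ambiguous"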
Complete CRAG Flow
Here’s how the complete system works:
def crag_pipeline(question: str):
    # Step 1: Retrieve from the vector DB
    vector_docs = vector_db.retrieve(question)

    # Step 2: Grade the documents
    grade = grade_documents(question, vector_docs)

    # Step 3: Route based on the grade
    if grade == "correct":
        context = "\n\n".join(vector_docs)
    elif grade == "incorrect":
        # Discard vector results, search the web
        searcher = SearchCansRetriever(api_key="YOUR_KEY")
        context = searcher.corrective_search(question)
    else:  # ambiguous
        # Combine both sources
        searcher = SearchCansRetriever(api_key="YOUR_KEY")
        web_context = searcher.corrective_search(question)
        context = f"Internal:\n{vector_docs[0]}\n\nWeb:\n{web_context}"

    # Step 4: Generate the final answer
    return generate_answer(question, context)
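The pipeline above assumes vector_db.retrieve and generate_answer already exist in your codebase. For completeness, here is one way generate_answer could look, reusing the OpenAI client from the grader; a minimal sketch, not the only option:

    def generate_answer(question: str, context: str) -> str:
        # Ground the answer strictly in the graded/corrected context
        prompt = f"""
Answer the question using ONLY the context below.
If the context is insufficient, say so.

Context:
{context}

Question: {question}
"""
        response = client.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "user", "content": prompt}]
        )
        return response.choices[0].message.content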
Why This Matters for “Self-Correction”
Self-RAG agents are designed to be autonomous. They critique their own outputs and iterate.
If your agent decides its answer is hallucinated, it must have a reliable way to get fresh data.
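One way to close that loop is a bounded retry: critique the draft answer, and if it looks ungrounded, fetch fresh web context and regenerate. A hedged sketch; the is_grounded critique step is an illustrative placeholder (in practice, another LLM call that checks the answer against the context):

    def self_correcting_answer(question: str, max_retries: int = 2) -> str:
        context = "\n\n".join(vector_db.retrieve(question))
        answer = generate_answer(question, context)

        for _ in range(max_retries):
            if is_grounded(answer, context):  # illustrative critique step
                return answer
            # Answer looks hallucinated: pull fresh web context and retry
            searcher = SearchCansRetriever(api_key="YOUR_KEY")
            context = searcher.corrective_search(question)
            answer = generate_answer(question, context)

        return answer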
- Without SearchCans: the agent is stuck with its internal training data (which caused the hallucination).
- With SearchCans: the agent has a real-time "lifeline" to verify facts against the live internet.
Conclusion
Building a “Self-Correcting” agent is the hallmark of a senior AI engineer. It moves your system from a fun toy to a reliable enterprise tool.
By integrating SearchCans, you provide the affordable, high-speed infrastructure needed to support the multiple search-and-verify loops that CRAG architectures require.
Resources
Related Topics:
- AI Agent Internet Access Architecture - Architecture guide for connected agents
- URL to Markdown API Benchmark - Why clean data matters for RAG
- Adaptive RAG Router - Dynamic knowledge routing
- Deep Research Agent - Advanced LangGraph patterns
- Hybrid RAG Tutorial - Combining vector and web search
Get Started:
- Free Trial - Get 100 free credits
- API Documentation - Technical reference
- Pricing - Transparent costs
- Playground - Test in browser
SearchCans provides real-time data for AI agents. Start building now →