From RAG to AI Agents | Search APIs Powering AI
When I was at Google working on LLM research, we had one major problem: language models only know what they were trained on. Ask GPT-4 about something that happened yesterday, and it’s clueless.

The solution everyone’s building now? Connect LLMs to search APIs for real-time information.

This post explains how it works and shows you the exact code to build it yourself.


The Knowledge Cut-off Problem

Every LLM has a training cut-off date. GPT-4’s knowledge ends in April 2023 (for the base model). Ask it:

“What’s the current weather in San Francisco?”

It can’t answer. The information doesn’t exist in its training data.

Traditional solutions were clunky:

  • Fine-tune the model (expensive, slow)
  • Maintain a massive up-to-date knowledge base (engineering nightmare)
  • Accept the limitations (bad user experience)

The Better Solution: Retrieval Augmented Generation (RAG)

RAG is a fancy term for a simple idea:

  1. User asks a question
  2. Search for relevant information
  3. Feed that information to the LLM as context
  4. LLM generates answer using both its training AND the fresh data

Search APIs make step #2 trivial.

The Basic Architecture

Here’s the simplest implementation:

import os
import requests
import openai  # pre-1.0 OpenAI SDK; the ChatCompletion API below requires openai<1.0

SEARCHCANS_KEY = os.environ["SEARCHCANS_API_KEY"]
openai.api_key = os.environ["OPENAI_API_KEY"]

def answer_with_search(question):
    """Answer questions using LLM + search API"""
    
    # Step 1: Search for relevant information
    search_response = requests.get(
        'https://www.searchcans.com/api/search',
        headers={'Authorization': f'Bearer {SEARCHCANS_KEY}'},
        params={'q': question, 'engine': 'google', 'num': 5}
    )
    
    search_results = search_response.json()
    
    # Step 2: Extract relevant snippets
    context = "\n\n".join([
        f"{result['title']}\n{result['snippet']}"
        for result in search_results.get('organic_results', [])[:3]
    ])
    
    # Step 3: Feed to LLM with context
    prompt = f"""
    Answer the following question using the provided context.
    
    Context from recent search:
    {context}
    
    Question: {question}
    
    Answer:"""
    
    response = openai.ChatCompletion.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}]
    )
    
    return response.choices[0].message.content

# Test it
answer = answer_with_search("What happened at OpenAI in November 2024?")
print(answer)

This works surprisingly well. The LLM now has access to current information from the search results.

Production-Grade Implementation

The basic version has issues:

  • No error handling
  • Wastes tokens on irrelevant context
  • No source attribution
  • Can’t handle complex queries

Here’s how we build it properly:

import requests
import openai
from typing import List, Dict

class AISearchAgent:
    def __init__(self, search_api_key, openai_api_key):
        self.search_key = search_api_key
        self.openai_key = openai_api_key
        openai.api_key = openai_api_key
    
    def search(self, query: str, num_results: int = 5) -> List[Dict]:
        """Execute search and return structured results"""
        try:
            response = requests.get(
                'https://www.searchcans.com/api/search',
                headers={'Authorization': f'Bearer {self.search_key}'},
                params={'q': query, 'engine': 'google', 'num': num_results},
                timeout=10
            )
            response.raise_for_status()
            return response.json().get('organic_results', [])
        except Exception as e:
            print(f"Search failed: {e}")
            return []
    
    def is_relevant(self, result: Dict, original_query: str) -> bool:
        """Use LLM to check if search result is relevant"""
        prompt = f"""
        Query: {original_query}
        Result Title: {result['title']}
        Result Snippet: {result['snippet']}
        
        Is this result relevant to answering the query?
        Answer with just YES or NO.
        """
        
        response = openai.ChatCompletion.create(
            model="gpt-3.5-turbo",  # Use cheaper model for filtering
            messages=[{"role": "user", "content": prompt}],
            max_tokens=10
        )
        
        return "YES" in response.choices[0].message.content.upper()
    
    def answer_question(self, question: str) -> Dict:
        """Answer question with sources"""
        
        # Search
        results = self.search(question, num_results=10)
        
        if not results:
            return {
                'answer': "I couldn't find relevant information to answer this.",
                'sources': []
            }
        
        # Filter relevant results
        relevant = [r for r in results if self.is_relevant(r, question)][:3]
        
        # Build context
        context = "\n\n".join([
            f"Source {i+1}: {r['title']}\n{r['snippet']}\nURL: {r['link']}"
            for i, r in enumerate(relevant)
        ])
        
        # Generate answer
        prompt = f"""
        Using the following sources, answer the question.
        Cite sources using [1], [2], [3] notation.
        If the sources don't contain the answer, say so.
        
        Sources:
        {context}
        
        Question: {question}
        
        Answer:"""
        
        response = openai.ChatCompletion.create(
            model="gpt-4",
            messages=[{"role": "user", "content": prompt}],
            temperature=0.3  # Lower temperature for factual responses
        )
        
        return {
            'answer': response.choices[0].message.content,
            'sources': [{'title': r['title'], 'url': r['link']} 
                       for r in relevant]
        }

# Usage
agent = AISearchAgent(
    search_api_key="your_searchcans_key",
    openai_api_key="your_openai_key"
)

result = agent.answer_question("Who won the 2024 World Series?")
print(result['answer'])
print("\nSources:")
for source in result['sources']:
    print(f"- {source['title']}: {source['url']}")

This version:

  • Filters irrelevant results using a cheap LLM call
  • Provides source attribution
  • Handles errors gracefully
  • Uses appropriate models for different tasks

LangChain Integration

If you’re using LangChain (and you probably should be), integration is even simpler:

from langchain.tools import Tool
from langchain.agents import initialize_agent, AgentType
from langchain.llms import OpenAI
import requests

def search_tool_func(query: str) -> str:
    """Search function for LangChain"""
    response = requests.get(
        'https://www.searchcans.com/api/search',
        headers={'Authorization': f'Bearer {API_KEY}'},
        params={'q': query, 'engine': 'google', 'num': 10}
    )
    
    results = response.json().get('organic_results', [])[:3]
    
    return "\n\n".join([
        f"{r['title']}: {r['snippet']}"
        for r in results
    ])

# Create the tool
search_tool = Tool(
    name="Web Search",
    func=search_tool_func,
    description="Useful for finding current information on the internet"
)

# Initialize agent
llm = OpenAI(temperature=0)
agent = initialize_agent(
    [search_tool],
    llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True
)

# Use it
result = agent.run("What are the latest developments in AI regulation?")
print(result)

LangChain handles the decision-making: when to search, how to interpret results, and when to respond.

Real-World Use Cases

1. Customer Support Bots

A SaaS company built a support bot that:

  • Searches their documentation for answers
  • Falls back to web search for general questions
  • Escalates to humans when unsure

Results: 60% of tickets handled automatically, 4-minute average resolution time.
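The tiered fallback described above can be sketched as a small routing function. The tier names and the simple "any hits?" rule below are illustrative assumptions, not the company's actual implementation (which would also score answer confidence before escalating):

```python
def route(doc_hits: list, web_hits: list) -> str:
    """Pick a handling tier given results from each search layer."""
    if doc_hits:                    # documentation answered it
        return "answer_from_docs"
    if web_hits:                    # general question, web search helped
        return "answer_from_web"
    return "escalate_to_human"      # nothing found: hand off to a person
```

In practice `doc_hits` would come from your documentation index and `web_hits` from a SERP API call like the ones shown earlier.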

2. Research Assistants

Academic researchers use AI + search to:

  • Find relevant papers quickly
  • Summarize current research on topics
  • Identify experts and institutions

One research team cut literature review time from 2 weeks to 3 days using this approach.

3. News Monitoring

A hedge fund built an AI agent that:

  • Monitors news for portfolio companies
  • Summarizes key developments
  • Alerts on significant events

The system processes 10,000+ articles daily and sends 5-10 high-priority alerts.
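One way the "alert on significant events" step could work is a cheap significance filter that runs before any LLM summarization. The keywords, weights, and threshold below are assumptions for illustration; a production system would more likely use an LLM classifier:

```python
# Hypothetical keyword weights for scoring incoming headlines.
SIGNIFICANT_TERMS = {
    "acquisition": 3, "bankruptcy": 5, "lawsuit": 3,
    "earnings": 2, "resignation": 2, "recall": 3,
}

def significance_score(headline: str) -> int:
    """Sum the weights of significant terms appearing in the headline."""
    words = headline.lower().split()
    return sum(w for term, w in SIGNIFICANT_TERMS.items() if term in words)

def high_priority(headlines: list[str], threshold: int = 3) -> list[str]:
    """Keep only headlines scoring at or above the alert threshold."""
    return [h for h in headlines if significance_score(h) >= threshold]
```

Filtering 10,000 articles down this way before summarizing keeps LLM costs proportional to the handful of alerts, not the full feed.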

Cost Analysis: What It Actually Costs

Let’s calculate the economics for 10,000 user queries/month:

Search API Costs:

  • 10,000 queries × 1 search each = 10,000 API calls
  • At $0.50/1K = $5/month

OpenAI Costs:

  • Relevance filtering: 10,000 × 3 results × $0.001 = $30/month
  • Answer generation: 10,000 × $0.03 = $300/month
  • Total OpenAI: $330/month

Total: $335/month for 10,000 enhanced queries

Compare to maintaining your own real-time knowledge base:

  • Web crawler infrastructure: $500/month
  • Database: $200/month
  • Maintenance: 20 hours × $100 = $2,000/month
  • Total: $2,700/month

The API approach costs 12% of DIY and requires zero maintenance.
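The arithmetic above is easy to re-run for your own volumes. A small helper, with defaults mirroring the unit prices used in this post:

```python
def monthly_cost(queries: int,
                 search_price_per_1k: float = 0.50,
                 filter_calls_per_query: int = 3,
                 filter_price_per_call: float = 0.001,
                 answer_price_per_call: float = 0.03) -> float:
    """Estimate monthly spend for a search-augmented LLM pipeline."""
    search = queries / 1000 * search_price_per_1k          # SERP API calls
    filtering = queries * filter_calls_per_query * filter_price_per_call
    answers = queries * answer_price_per_call              # GPT-4 generation
    return round(search + filtering + answers, 2)

print(monthly_cost(10_000))  # 335.0
```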

Advanced Patterns

Multi-Step Research

For complex questions, use multiple searches:

def deep_research(question: str) -> str:
    """Multi-step research for complex questions"""
    
    # Step 1: Identify sub-questions
    sub_questions = decompose_question(question)
    
    # Step 2: Research each sub-question
    sub_answers = []
    for sq in sub_questions:
        results = search(sq)
        answer = synthesize_answer(sq, results)
        sub_answers.append(answer)
    
    # Step 3: Combine into final answer
    final_answer = synthesize_final(question, sub_answers)
    
    return final_answer

This handles questions like “Compare the economic policies of the last three US presidents” that require multiple research steps.
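`decompose_question` is left abstract above. One way to implement it is to ask a cheap model for a numbered list of sub-questions and parse the reply; the prompt wording and the parser below are illustrative assumptions, not a fixed recipe:

```python
# Hypothetical prompt for the decomposition step.
DECOMPOSE_PROMPT = (
    "Break the following question into 2-4 simpler sub-questions, "
    "one per line, numbered:\n\nQuestion: {question}"
)

def parse_subquestions(llm_reply: str) -> list[str]:
    """Parse a numbered reply like '1. ...\n2. ...' into sub-questions."""
    subs = []
    for line in llm_reply.splitlines():
        line = line.strip()
        if line and line[0].isdigit():
            # Drop the leading "1." / "2)" marker.
            subs.append(line.lstrip("0123456789.) ").strip())
    return subs
```

The parsing step matters: models occasionally add preamble text, so only digit-prefixed lines are treated as sub-questions.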

Source Quality Scoring

Not all search results are equally trustworthy:

from urllib.parse import urlparse

def score_source_quality(url: str) -> float:
    """Score source reliability"""
    
    trusted_domains = {
        'nytimes.com': 0.95,
        'wikipedia.org': 0.85,
        'reuters.com': 0.95,
        'nature.com': 0.98,
        # ... more domains
    }
    
    # Parse the hostname so "https://www.nytimes.com/..." matches "nytimes.com"
    domain = urlparse(url).netloc.removeprefix('www.')
    return trusted_domains.get(domain, 0.5)  # Default: medium trust

# Use in context building
weighted_results = sorted(
    results,
    key=lambda r: score_source_quality(r['link']),
    reverse=True
)

This prioritizes authoritative sources in the context sent to the LLM.

The Prompt Engineering That Matters

The quality of your prompts determines output quality. Here’s what works:

Bad Prompt:

Answer: {question}
Context: {context}

Good Prompt:

You are a helpful assistant that answers questions accurately.
Use ONLY the provided context to answer.
If the context doesn't contain the answer, say "I don't have enough information."
Cite sources using [1], [2] notation.

Context:
{context}

Question: {question}

Answer:

The good prompt:

  • Sets clear role expectations
  • Constrains answers to provided context
  • Prevents hallucination
  • Requests source attribution

Common Pitfalls

Pitfall 1: Sending too much context

GPT-4's base model has an 8K-token context window. If you send 10 full articles as context, you'll blow past the limit and truncate important information.

Solution: Send only snippets and titles. Let the LLM decide if it needs more detail.
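A rough sketch of enforcing a context budget when assembling snippets. The 4-characters-per-token heuristic is an approximation (use a real tokenizer like tiktoken for exact counts):

```python
def trim_context(snippets: list[str], max_tokens: int = 3000) -> str:
    """Add snippets in order until the rough token budget is exhausted."""
    kept, used = [], 0
    for s in snippets:
        est = len(s) // 4 + 1       # ~4 characters per token, rounded up
        if used + est > max_tokens:
            break                   # adding this snippet would overflow
        kept.append(s)
        used += est
    return "\n\n".join(kept)
```

Because snippets are added in order, sorting them by relevance (or source quality, as shown earlier) before trimming ensures the best material survives the cut.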

Pitfall 2: Not handling search failures

Search APIs can fail or return no results. Your code needs graceful degradation.

Solution: Always check for empty results and have a fallback response.

Pitfall 3: Trusting everything

LLMs will confidently present information even if it’s wrong.

Solution: Always show sources so users can verify claims.

The Future: Autonomous Agents

The next evolution is AI agents that:

  • Decide when to search
  • Choose what to search for
  • Determine if they need more information
  • Execute multi-step plans

We’re building agents using the ReAct pattern:

Thought: I need to find recent news about AI regulation
Action: Search for "AI regulation 2024"
Observation: [search results]
Thought: I found relevant information, let me synthesize it
Action: Respond to user

These agents are getting remarkably good at research tasks that used to require human analysts.
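The loop behind that trace can be sketched in a few lines. This is a minimal hand-rolled version, not LangChain's implementation; `llm` and `search` are stand-ins for a chat-model call that emits lines in the trace format above and a SERP API call:

```python
def react_loop(question: str, llm, search, max_steps: int = 5) -> str:
    """Minimal ReAct loop: think, act, observe, repeat."""
    transcript = f"Question: {question}"
    for _ in range(max_steps):
        step = llm(transcript)                     # model proposes next action
        transcript += "\n" + step
        if step.startswith("Action: respond:"):    # model decided to answer
            return step.removeprefix("Action: respond:").strip()
        if step.startswith("Action: search:"):     # model decided to search
            query = step.removeprefix("Action: search:").strip()
            transcript += f"\nObservation: {search(query)}"
    return "Gave up after max_steps."
```

The `max_steps` cap matters in production: without it, an agent that never commits to an answer will loop (and bill you) indefinitely.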

Getting Started Today

If you want to build AI applications with search:

Day 1: Build the basic RAG system (50 lines of code)

Day 2: Add relevance filtering and error handling

Day 3: Implement source attribution

Day 4: Test with real user queries and iterate

Don’t overthink it. The basic pattern works well. You can optimize later.


About the Author: Dr. Emily Zhang completed her PhD in NLP at Stanford and worked on LLM research at Google Brain. She now consults with companies building AI-powered products.


Want to add search capabilities to your AI application? Start with 100 free credits to build your first prototype.
