When I was at Google working on LLM research, we had one major problem: language models only know what they were trained on. Ask GPT-4 about something that happened yesterday, and it’s clueless.
The solution everyone’s building now? Connect LLMs to search APIs for real-time information.
This post explains how it works and shows you the exact code to build it yourself.
The Knowledge Cut-off Problem
Every LLM has a training cut-off date. The base GPT-4 model's knowledge ends in September 2021 (later GPT-4 Turbo snapshots extend into 2023). Ask it:
“What’s the current weather in San Francisco?”
It can’t answer. The information doesn’t exist in its training data.
Traditional solutions were clunky:
- Fine-tune the model (expensive, slow)
- Maintain a massive up-to-date knowledge base (engineering nightmare)
- Accept the limitations (bad user experience)
The Better Solution: Retrieval Augmented Generation (RAG)
RAG is a fancy term for a simple idea:
- User asks a question
- Search for relevant information
- Feed that information to the LLM as context
- LLM generates answer using both its training AND the fresh data
Search APIs make step #2 trivial.
The Basic Architecture
Here’s the simplest implementation:
import requests
import openai

SEARCHCANS_KEY = "your_searchcans_key"  # your SearchCans API key

def answer_with_search(question):
    """Answer questions using LLM + search API"""
    # Step 1: Search for relevant information
    search_response = requests.get(
        'https://www.searchcans.com/api/search',
        headers={'Authorization': f'Bearer {SEARCHCANS_KEY}'},
        params={'q': question, 'engine': 'google', 'num': 5}
    )
    search_results = search_response.json()

    # Step 2: Extract relevant snippets
    context = "\n\n".join([
        f"{result['title']}\n{result['snippet']}"
        for result in search_results.get('organic_results', [])[:3]
    ])

    # Step 3: Feed to LLM with context
    prompt = f"""
Answer the following question using the provided context.

Context from recent search:
{context}

Question: {question}

Answer:"""

    response = openai.ChatCompletion.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}]
    )
    return response.choices[0].message.content

# Test it
answer = answer_with_search("What happened at OpenAI in November 2024?")
print(answer)
This works surprisingly well. The LLM now has access to current information from the search results.
Production-Grade Implementation
The basic version has issues:
- No error handling
- Wastes tokens on irrelevant context
- No source attribution
- Can’t handle complex queries
Here’s how we build it properly:
import requests
import openai
from typing import List, Dict

class AISearchAgent:
    def __init__(self, search_api_key, openai_api_key):
        self.search_key = search_api_key
        self.openai_key = openai_api_key
        openai.api_key = openai_api_key

    def search(self, query: str, num_results: int = 5) -> List[Dict]:
        """Execute search and return structured results"""
        try:
            response = requests.get(
                'https://www.searchcans.com/api/search',
                headers={'Authorization': f'Bearer {self.search_key}'},
                params={'q': query, 'engine': 'google', 'num': num_results},
                timeout=10
            )
            response.raise_for_status()
            return response.json().get('organic_results', [])
        except Exception as e:
            print(f"Search failed: {e}")
            return []

    def is_relevant(self, result: Dict, original_query: str) -> bool:
        """Use LLM to check if search result is relevant"""
        prompt = f"""
Query: {original_query}
Result Title: {result['title']}
Result Snippet: {result['snippet']}

Is this result relevant to answering the query?
Answer with just YES or NO.
"""
        response = openai.ChatCompletion.create(
            model="gpt-3.5-turbo",  # Use cheaper model for filtering
            messages=[{"role": "user", "content": prompt}],
            max_tokens=10
        )
        return "YES" in response.choices[0].message.content.upper()

    def answer_question(self, question: str) -> Dict:
        """Answer question with sources"""
        # Search
        results = self.search(question, num_results=10)
        if not results:
            return {
                'answer': "I couldn't find relevant information to answer this.",
                'sources': []
            }

        # Filter relevant results
        relevant = [r for r in results if self.is_relevant(r, question)][:3]

        # Build context
        context = "\n\n".join([
            f"Source {i+1}: {r['title']}\n{r['snippet']}\nURL: {r['link']}"
            for i, r in enumerate(relevant)
        ])

        # Generate answer
        prompt = f"""
Using the following sources, answer the question.
Cite sources using [1], [2], [3] notation.
If the sources don't contain the answer, say so.

Sources:
{context}

Question: {question}

Answer:"""

        response = openai.ChatCompletion.create(
            model="gpt-4",
            messages=[{"role": "user", "content": prompt}],
            temperature=0.3  # Lower temperature for factual responses
        )

        return {
            'answer': response.choices[0].message.content,
            'sources': [{'title': r['title'], 'url': r['link']}
                        for r in relevant]
        }

# Usage
agent = AISearchAgent(
    search_api_key="your_searchcans_key",
    openai_api_key="your_openai_key"
)

result = agent.answer_question("Who won the 2024 World Series?")
print(result['answer'])
print("\nSources:")
for source in result['sources']:
    print(f"- {source['title']}: {source['url']}")
This version:
- Filters irrelevant results using a cheap LLM call
- Provides source attribution
- Handles errors gracefully
- Uses appropriate models for different tasks
LangChain Integration
If you’re using LangChain (and you probably should be), integration is even simpler:
from langchain.tools import Tool
from langchain.agents import initialize_agent, AgentType
from langchain.llms import OpenAI
import requests

API_KEY = "your_searchcans_key"  # SearchCans API key

def search_tool_func(query: str) -> str:
    """Search function for LangChain"""
    response = requests.get(
        'https://www.searchcans.com/api/search',
        headers={'Authorization': f'Bearer {API_KEY}'},
        params={'q': query, 'engine': 'google', 'num': 10}
    )
    results = response.json().get('organic_results', [])[:3]
    return "\n\n".join([
        f"{r['title']}: {r['snippet']}"
        for r in results
    ])

# Create the tool
search_tool = Tool(
    name="Web Search",
    func=search_tool_func,
    description="Useful for finding current information on the internet"
)

# Initialize agent
llm = OpenAI(temperature=0)
agent = initialize_agent(
    [search_tool],
    llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True
)

# Use it
result = agent.run("What are the latest developments in AI regulation?")
print(result)
LangChain handles the decision-making: when to search, how to interpret results, and when to respond.
Real-World Use Cases
1. Customer Support Bots
A SaaS company built a support bot that:
- Searches their documentation for answers
- Falls back to web search for general questions
- Escalates to humans when unsure
Results: 60% of tickets handled automatically, 4-minute average resolution time.
2. Research Assistants
Academic researchers use AI + search to:
- Find relevant papers quickly
- Summarize current research on topics
- Identify experts and institutions
One research team cut literature review time from 2 weeks to 3 days using this approach.
3. News Monitoring
A hedge fund built an AI agent that:
- Monitors news for portfolio companies
- Summarizes key developments
- Alerts on significant events
The system processes 10,000+ articles daily and sends 5-10 high-priority alerts.
Cost Analysis: What It Actually Costs
Let’s calculate the economics for 10,000 user queries/month:
Search API Costs:
- 10,000 queries × 1 search each = 10,000 API calls
- At $0.50/1K = $5/month
OpenAI Costs:
- Relevance filtering: 10,000 × 3 results × $0.001 = $30/month
- Answer generation: 10,000 × $0.03 = $300/month
- Total OpenAI: $330/month
Total: $335/month for 10,000 enhanced queries
Compare to maintaining your own real-time knowledge base:
- Web crawler infrastructure: $500/month
- Database: $200/month
- Maintenance: 20 hours × $100 = $2,000/month
- Total: $2,700/month
The API approach costs roughly 12% of the DIY alternative and requires essentially zero maintenance.
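The arithmetic above is easy to sanity-check in a few lines (the prices are this article's assumptions, not current list prices):

```python
def monthly_cost(queries, search_price_per_1k, filter_calls_per_query,
                 filter_price, answer_price):
    """Recompute the article's cost model for a given query volume."""
    search = queries / 1000 * search_price_per_1k        # search API calls
    filtering = queries * filter_calls_per_query * filter_price  # cheap-model filtering
    answers = queries * answer_price                     # GPT-4 answer generation
    return search + filtering + answers

# The article's numbers: 10k queries/month, $0.50 per 1K searches,
# 3 filter calls at $0.001 each, $0.03 per GPT-4 answer.
total = monthly_cost(10_000, 0.50, 3, 0.001, 0.03)
print(total)  # 335.0
```

Plugging in your own volumes and model prices shows quickly where the break-even against a self-hosted knowledge base sits.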
Advanced Patterns
Multi-Step Research
For complex questions, use multiple searches:
def deep_research(question: str) -> str:
    """Multi-step research for complex questions.
    Assumes decompose_question, search, synthesize_answer,
    and synthesize_final are defined elsewhere."""
    # Step 1: Identify sub-questions
    sub_questions = decompose_question(question)

    # Step 2: Research each sub-question
    sub_answers = []
    for sq in sub_questions:
        results = search(sq)
        answer = synthesize_answer(sq, results)
        sub_answers.append(answer)

    # Step 3: Combine into final answer
    final_answer = synthesize_final(question, sub_answers)
    return final_answer
This handles questions like “Compare the economic policies of the last three US presidents” that require multiple research steps.
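The `decompose_question` helper is left undefined above; in practice it is usually another LLM call, but a crude rule-based sketch (hypothetical, purely for illustration) shows the shape of the step:

```python
import re

def decompose_question(question: str) -> list:
    """Naive decomposition: split 'Compare X of A, B, and C' into one
    sub-question per entity. A production version would prompt an LLM
    to do this instead of relying on a regex heuristic."""
    match = re.match(r"Compare (.+) of (.+)", question)
    if not match:
        return [question]  # nothing to decompose
    aspect, entities_part = match.groups()
    entities = re.split(r",\s*|\s+and\s+", entities_part)
    return [f"What are {aspect} of {e.strip()}?" for e in entities if e.strip()]

print(decompose_question(
    "Compare the economic policies of Biden, Trump and Obama"))
```

Each sub-question then goes through the normal search-and-synthesize pipeline before the final combination step.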
Source Quality Scoring
Not all search results are equally trustworthy:
from urllib.parse import urlparse

def score_source_quality(url: str) -> float:
    """Score source reliability by domain"""
    trusted_domains = {
        'nytimes.com': 0.95,
        'wikipedia.org': 0.85,
        'reuters.com': 0.95,
        'nature.com': 0.98,
        # ... more domains
    }
    # Normalize "https://www.nytimes.com/..." -> "nytimes.com" before lookup
    domain = urlparse(url).netloc.removeprefix('www.')
    return trusted_domains.get(domain, 0.5)  # Default: medium trust

# Use in context building
weighted_results = sorted(
    results,
    key=lambda r: score_source_quality(r['link']),
    reverse=True
)
This prioritizes authoritative sources in the context sent to the LLM.
The Prompt Engineering That Matters
The quality of your prompts determines output quality. Here’s what works:
Bad Prompt:
Answer: {question}
Context: {context}
Good Prompt:
You are a helpful assistant that answers questions accurately.
Use ONLY the provided context to answer.
If the context doesn't contain the answer, say "I don't have enough information."
Cite sources using [1], [2] notation.
Context:
{context}
Question: {question}
Answer:
The good prompt:
- Sets clear role expectations
- Constrains answers to provided context
- Prevents hallucination
- Requests source attribution
Common Pitfalls
Pitfall 1: Sending too much context
The base GPT-4 model has an 8K-token context window. If you send 10 full articles as context, you'll hit the limit and truncate important information.
Solution: Send only snippets and titles. Let the LLM decide if it needs more detail.
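One simple guard is to cap the assembled context at a character budget before building the prompt (roughly 4 characters per English token, so 6,000 characters is about 1,500 tokens). A minimal sketch, assuming results arrive already ranked by relevance:

```python
def build_context(snippets, max_chars=6000):
    """Concatenate (title, snippet) pairs in rank order until a rough
    character budget is exhausted, then stop."""
    parts, used = [], 0
    for title, snippet in snippets:
        piece = f"{title}\n{snippet}"
        if used + len(piece) > max_chars:
            break  # budget exhausted; drop remaining lower-ranked results
        parts.append(piece)
        used += len(piece) + 2  # account for the joining blank line
    return "\n\n".join(parts)

ctx = build_context([("A", "x" * 4000), ("B", "y" * 4000), ("C", "z" * 10)])
print(len(ctx))  # 4002: only the top-ranked snippet fits the budget
```

For precise budgeting you would count real tokens with the model's tokenizer, but a character cap already prevents the silent-truncation failure mode.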
Pitfall 2: Not handling search failures
Search APIs can fail or return no results. Your code needs graceful degradation.
Solution: Always check for empty results and have a fallback response.
Pitfall 3: Trusting everything
LLMs will confidently present information even if it’s wrong.
Solution: Always show sources so users can verify claims.
The Future: Autonomous Agents
The next evolution is AI agents that:
- Decide when to search
- Choose what to search for
- Determine if they need more information
- Execute multi-step plans
We’re building agents using the ReAct pattern:
Thought: I need to find recent news about AI regulation
Action: Search for "AI regulation 2024"
Observation: [search results]
Thought: I found relevant information, let me synthesize it
Action: Respond to user
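A trace like the one above comes from a control loop that alternates model steps with tool calls. Here is a hedged sketch of that loop with the model and search stubbed out so the flow is visible offline; `llm` and `search` are placeholders for real calls, not an actual API:

```python
def react_loop(question, llm, search, max_steps=5):
    """Minimal ReAct loop: the LLM emits either 'Search: <query>' or
    'Answer: <text>'; observations are appended to the transcript."""
    transcript = f"Question: {question}"
    for _ in range(max_steps):
        step = llm(transcript)            # model decides the next action
        transcript += "\n" + step
        if step.startswith("Answer:"):
            return step[len("Answer:"):].strip()
        if step.startswith("Search:"):
            query = step[len("Search:"):].strip()
            transcript += f"\nObservation: {search(query)}"
    return "Gave up after max_steps."

# Scripted fake model and search so the loop can be exercised without keys.
script = iter(["Search: AI regulation 2024",
               "Answer: Several jurisdictions passed AI laws in 2024."])
result = react_loop("What are the latest developments in AI regulation?",
                    llm=lambda t: next(script),
                    search=lambda q: "[search results]")
print(result)
```

Frameworks like LangChain implement exactly this pattern, with prompt templates that teach the model the Thought/Action/Observation format.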
These agents are getting remarkably good at research tasks that used to require human analysts.
Getting Started Today
If you want to build AI applications with search:
Day 1: Build the basic RAG system (50 lines of code)
Day 2: Add relevance filtering and error handling
Day 3: Implement source attribution
Day 4: Test with real user queries and iterate
Don’t overthink it. The basic pattern works well. You can optimize later.
About the Author: Dr. Emily Zhang completed her PhD in NLP at Stanford and worked on LLM research at Google Brain. She now consults with companies building AI-powered products.
Related Resources
AI Development:
- Build AI Agent with SERP API - Step-by-step tutorial
- AI Agent Integration Guide - Advanced patterns
- SERP API Documentation - Technical reference
Implementation:
- Integration Best Practices - Production tips
- What is SERP API? - Beginner’s guide
- URL Content Extraction - Extract full content
Get Started:
- Free registration - 100 credits
- View pricing - From $0.33/1K
- Try playground - Test instantly
Want to add search capabilities to your AI application? Start with 100 free credits to build your first prototype.