
LangChain Google Search Integration: Complete AI Agent Tutorial

Learn how to build an AI Research Agent using LangChain and SearchCans. A step-by-step Python guide to bypassing Google CSE limits and enabling real-time web search.

6 min read

Large Language Models (LLMs) like GPT-4 are powerful, but they share a fundamental limitation: their knowledge is frozen at training time. Out of the box, they cannot answer questions about current events or verify facts that emerged after their training cutoff.

To build a truly intelligent AI Research Agent, you need to give it “eyes”—access to the real-time internet.

LangChain provides a framework to connect LLMs with external tools. In this guide, we will look at how to replace the restrictive default Google Search tool with a scalable, high-performance custom tool using the SearchCans API.

The Default Way: GoogleSearchAPIWrapper (And Its Limits)

LangChain has a built-in utility called GoogleSearchAPIWrapper. While useful for prototyping, it comes with significant friction for production apps.

To use it, you must configure two distinct credentials from the Google Cloud Console:

  1. GOOGLE_API_KEY
  2. GOOGLE_CSE_ID (Custom Search Engine ID)

The Bottleneck:

The standard Google Custom Search API provides only 100 free search queries per day. An autonomous Agent may run 10-20 searches to answer a single complex user query, so a handful of requests can exhaust the daily quota. Scaling beyond it is expensive at $5 per 1,000 queries.

For developers building production AI agents, this is simply not viable.
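The quota arithmetic is easy to verify. A quick back-of-the-envelope calculation (using the midpoint of the 10-20 searches-per-query figure above):

```python
# How many complex user queries fit in the free Google CSE tier?
FREE_DAILY_QUOTA = 100
SEARCHES_PER_QUERY = 15  # midpoint of the 10-20 searches an agent may issue

user_queries_per_day = FREE_DAILY_QUOTA // SEARCHES_PER_QUERY
print(user_queries_per_day)  # 6 complex queries per day, then you're cut off
```

Six answered questions per day is clearly not enough for even a single active user, let alone a production app.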

The Scalable Way: Building a Custom Tool with SearchCans

To build a robust agent, we need an API that offers unlimited concurrency and low costs ($0.56/1k). We will create a custom LangChain tool that connects to SearchCans.

Prerequisites

  • Python 3.10+
  • langchain and openai installed
  • A SearchCans API Key (Get it free at searchcans.com)

Step 1: Define the Search Function

First, we create a simple Python function that fetches data from SearchCans. Unlike the default wrapper, this gives us structured JSON data including snippets, titles, and links.

import requests
from langchain.tools import Tool

SEARCHCANS_API_KEY = "YOUR_KEY_HERE"

def searchcans_search(query: str) -> str:
    """Searches Google using SearchCans API and returns relevant snippets."""
    url = "https://www.searchcans.com/api/search"
    payload = {
        "s": query,
        "t": "google",
        "d": 5,        # Retrieve top 5 results
        "p": 1
    }
    headers = {
        "Authorization": f"Bearer {SEARCHCANS_API_KEY}",
        "Content-Type": "application/json"
    }
    
    try:
        response = requests.post(url, json=payload, headers=headers, timeout=10)
        response.raise_for_status()  # Surface HTTP errors (401, 429, 5xx) early
        data = response.json()
        
        if data.get("code") == 0:
            results = data.get("data", [])
            # Format the output for the LLM
            formatted_results = []
            for item in results:
                formatted_results.append(f"Title: {item['title']}\nLink: {item['url']}\nSnippet: {item.get('description', 'No snippet')}")
            return "\n---\n".join(formatted_results)
        else:
            return f"Search Error: {data.get('msg')}"
    except Exception as e:
        return f"Connection Failed: {e}"
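Because the function only reformats the `data` list it receives, you can sanity-check the formatting logic offline before wiring it into an agent. A minimal sketch (the sample payload below is invented for illustration):

```python
# Stand-alone check of the result-formatting logic, no network required.
sample = [
    {"title": "Example", "url": "https://example.com", "description": "A snippet"},
]

formatted = "\n---\n".join(
    f"Title: {item['title']}\nLink: {item['url']}\nSnippet: {item.get('description', 'No snippet')}"
    for item in sample
)
print(formatted)
```

If a result is missing a `description` key, the `get` fallback emits "No snippet" instead of raising a `KeyError`.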

Step 2: Wrap it as a LangChain Tool

LangChain agents use Tool objects to understand what actions they can take. We define our tool with a clear description so the LLM knows when to use it.

# Define the Custom Tool
search_tool = Tool(
    name="Current_Web_Search",
    func=searchcans_search,
    description="Useful for when you need to answer questions about current events, recent news, or facts that might have changed after your training data."
)
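If you want to understand what a Tool actually carries before installing LangChain, the essentials are easy to mimic: a name, a callable, and a description the LLM reads when deciding which action to take. A hedged sketch with a plain dataclass (an illustrative stand-in, not the real LangChain class):

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class MiniTool:
    """Illustrative stand-in for langchain.tools.Tool (not the real class)."""
    name: str
    func: Callable[[str], str]
    description: str

    def run(self, query: str) -> str:
        return self.func(query)

# A toy tool whose func stands in for a real search call
echo_tool = MiniTool(
    name="Echo",
    func=lambda q: f"searched: {q}",
    description="Echoes the query, standing in for a real search call.",
)
print(echo_tool.run("langchain"))  # searched: langchain
```

The `description` field is the only thing the LLM sees when choosing a tool, which is why writing it clearly matters so much.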

This approach aligns with LangChain’s flexible design, which supports various search integrations like Bing, DuckDuckGo, and Google Serper.

Step 3: Initialize the Agent

Now, let’s spin up an Agent using OpenAI’s GPT-4o. The Agent will automatically decide when to call our Current_Web_Search tool.

# Note: on newer LangChain versions (0.1+), import from the langchain-openai
# package instead: from langchain_openai import ChatOpenAI
from langchain.chat_models import ChatOpenAI
from langchain.agents import initialize_agent, AgentType

llm = ChatOpenAI(temperature=0, model="gpt-4o")

# Initialize the agent with our custom tool
agent = initialize_agent(
    tools=[search_tool],
    llm=llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,  # The standard "ReAct" agent
    verbose=True
)

# Run a query that requires internet access
query = "What is the latest pricing for GPT-4o API compared to Claude 3.5 Sonnet?"
agent.run(query)
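Under the hood, ZERO_SHOT_REACT_DESCRIPTION runs a Thought → Action → Observation loop: the LLM picks a tool, observes its output, and repeats until it can answer. A simplified, LLM-free sketch of that control flow (the `decide` policy here is invented for illustration; the real agent delegates that decision to the model):

```python
def react_loop(question, tools, decide, max_iterations=5):
    """Minimal ReAct-style loop: `decide` plays the LLM's role, picking a
    tool (or a final answer) from the question plus past observations."""
    observations = []
    for _ in range(max_iterations):
        action, arg = decide(question, observations)
        if action == "final_answer":
            return arg
        observations.append(tools[action](arg))
    return "Stopped: max iterations reached"

# Toy "LLM" policy: search once, then answer with what it saw.
def decide(question, observations):
    if not observations:
        return ("search", question)
    return ("final_answer", f"Based on: {observations[-1]}")

tools = {"search": lambda q: f"snippets about {q!r}"}
print(react_loop("GPT-4o pricing", tools, decide))
```

The `max_iterations` cap mirrors what LangChain itself uses to stop an agent that keeps searching without converging on an answer.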

Why This Architecture Wins

By swapping the default Google wrapper for SearchCans, your Agent gains significant advantages:

  1. Cost Efficiency: You pay $0.56 per 1,000 searches instead of $5.00+. This nearly 9x saving lets you build “chatty” agents that search extensively without breaking the bank. See our detailed pricing comparison for more details.
  2. No Rate Limits: Unlike the 100/day limit of the free tier, SearchCans supports unlimited concurrency, perfect for multi-user production apps.
  3. Structured Data: You get clean JSON, which is easier for the LLM to parse than raw HTML scraping.
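The cost difference compounds quickly at agent scale. A quick calculation using the rates quoted above (the monthly volume is a hypothetical example):

```python
GOOGLE_CSE_PER_1K = 5.00   # USD per 1,000 paid Google CSE queries
SEARCHCANS_PER_1K = 0.56   # USD per 1,000 SearchCans queries

monthly_searches = 1_000_000  # e.g. a busy multi-user agent
google_cost = monthly_searches / 1000 * GOOGLE_CSE_PER_1K
searchcans_cost = monthly_searches / 1000 * SEARCHCANS_PER_1K

print(f"Google CSE: ${google_cost:,.2f}/month")
print(f"SearchCans: ${searchcans_cost:,.2f}/month")
print(f"Savings:    {google_cost / searchcans_cost:.1f}x")
```

At a million searches per month, the gap is the difference between a rounding error and a real line item in your infrastructure budget.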

Advanced: Multi-Tool Agent

You can combine multiple tools for more sophisticated agents:

from langchain.tools import Tool

# Create additional tools
def calculator(query: str) -> str:
    # Demo only: eval() executes arbitrary code. In production, use a safe
    # expression parser (e.g. the ast module or the numexpr library) instead.
    try:
        return str(eval(query))
    except Exception:
        return "Invalid calculation"

calc_tool = Tool(
    name="Calculator",
    func=calculator,
    description="Useful for mathematical calculations"
)

# Initialize agent with multiple tools
multi_agent = initialize_agent(
    tools=[search_tool, calc_tool],
    llm=llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True
)

# The agent can now both search and calculate
multi_agent.run("What is the market cap of Tesla, and what is 10% of that value?")

Integration with RAG Pipelines

For building RAG applications, you can combine search with content extraction:

def search_and_extract(query: str) -> str:
    """Search, then extract full content from the top results."""
    # First, search
    search_results = searchcans_search(query)

    # Then extract full page content from the top URLs using the Reader API
    # (implementation details in our RAG guide). Until then, return the
    # search snippets as-is.
    return search_results

For more on optimizing your RAG pipeline, see our guide on web to markdown conversion.
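Once you have extracted page content, the usual next step in a RAG pipeline is splitting it into chunks before embedding. A minimal, library-free sketch of fixed-size chunking with overlap:

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping fixed-size chunks for embedding."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap  # slide the window, keeping some overlap
    return chunks

doc = "x" * 1200
pieces = chunk_text(doc, chunk_size=500, overlap=50)
print(len(pieces), [len(p) for p in pieces])  # 3 [500, 500, 300]
```

The overlap keeps sentences that straddle a chunk boundary retrievable from either side; production pipelines usually split on sentence or paragraph boundaries rather than raw character counts.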

Monitoring and Debugging

Enable verbose mode to see the agent’s reasoning:

agent = initialize_agent(
    tools=[search_tool],
    llm=llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,  # Shows the agent's thought process
    max_iterations=5  # Prevent infinite loops
)
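Beyond verbose mode, you can instrument the tool function itself to log how long each search takes, which helps spot slow or flaky calls in production. A sketch using a plain decorator (the `dummy_search` function here is a stand-in for the real search tool):

```python
import logging
import time
from functools import wraps

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("agent.tools")

def timed(fn):
    """Log how long each tool call takes before returning its result."""
    @wraps(fn)
    def wrapper(query: str) -> str:
        start = time.perf_counter()
        result = fn(query)
        elapsed = time.perf_counter() - start
        logger.info("%s(%r) took %.2fs", fn.__name__, query, elapsed)
        return result
    return wrapper

@timed
def dummy_search(query: str) -> str:  # stands in for searchcans_search
    return f"results for {query}"

print(dummy_search("test"))  # results for test
```

Wrapping `searchcans_search` with the same decorator before passing it to `Tool(func=...)` gives you per-call latency in your logs with no changes to the agent itself.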

Conclusion

Building an AI Agent is easy; building a useful one requires reliable access to the outside world. By integrating SearchCans with LangChain, you provide your agent with a high-performance, low-cost window to the web.

Ready to upgrade your LangChain Agent? For more implementation patterns, check out our advanced prompt engineering guide or explore our complete documentation.

👉 Get your SearchCans API Key here

David Chen

Senior Backend Engineer

San Francisco, CA

8+ years in API development and search infrastructure. Previously worked on data pipeline systems at tech companies. Specializes in high-performance API design.

API Development · Search Technology · System Architecture

