Python is the king of data science, but when it comes to high-concurrency web scraping, Golang is the undisputed champion.
If you are building an enterprise-grade SEO monitor or a real-time AI news aggregator, Python’s Global Interpreter Lock (GIL) becomes a bottleneck. You need the raw power of Go’s Goroutines to handle thousands of requests per second.
In this guide, we’ll explore the state of scraping in Go (Colly, Chromedp) and why outsourcing the “heavy lifting” to an API like SearchCans is the secret to building scalable scrapers.
The State of Go Scraping: Fast but Painful
Golang offers excellent libraries like Colly and Goquery for parsing HTML. They are blazingly fast compared to BeautifulSoup.
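For static pages, that speed is easy to see. Here is a minimal Colly crawler that prints every link on a page (example.com is a stand-in for your target):

```go
package main

import (
	"fmt"

	"github.com/gocolly/colly/v2"
)

func main() {
	c := colly.NewCollector()

	// Fire a callback for every link on the page
	c.OnHTML("a[href]", func(e *colly.HTMLElement) {
		fmt.Println(e.Text, "->", e.Attr("href"))
	})

	// example.com is a placeholder target
	c.Visit("https://example.com")
}
```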
However, the problem arises when you need to scrape dynamic sites like Google Search:
- No Native Headless Browser: Unlike Python’s Playwright, Go’s ecosystem (e.g., Chromedp) is less mature and harder to debug.
- IP Blocking at Scale: Go is almost too fast for its own good. Unleash 1,000 Goroutines on Google without a massive proxy pool and you will burn your IP reputation in seconds.
The Architecture: Golang Concurrency + SearchCans API
The best architecture for 2026 is hybrid:
- SearchCans handles the dirty work: headless Chrome, captchas, and proxies.
- Golang handles the concurrency and data processing.
This setup allows you to utilize SearchCans’ Unlimited Concurrency feature. You can fire off requests as fast as Go can handle them.
Code Example: Building a Concurrent Rank Checker
Let’s build a script that checks rankings for 100 keywords in parallel.
```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"sync"
)

const apiKey = "YOUR_SEARCHCANS_KEY"
const apiEndpoint = "https://www.searchcans.com/api/search"

type Payload struct {
	S string `json:"s"` // Search query
	T string `json:"t"` // Engine
	D int    `json:"d"` // Number of results
}

type Response struct {
	Code int `json:"code"`
	Data []struct {
		Title string `json:"title"`
		URL   string `json:"url"`
		Rank  int    `json:"rank"`
	} `json:"data"`
}

func fetchRank(keyword string, wg *sync.WaitGroup) {
	defer wg.Done()

	payload := Payload{S: keyword, T: "google", D: 100}
	jsonPayload, err := json.Marshal(payload)
	if err != nil {
		fmt.Printf("❌ Marshal failed: %s\n", keyword)
		return
	}

	req, err := http.NewRequest("POST", apiEndpoint, bytes.NewBuffer(jsonPayload))
	if err != nil {
		fmt.Printf("❌ Request failed: %s\n", keyword)
		return
	}
	req.Header.Set("Authorization", "Bearer "+apiKey)
	req.Header.Set("Content-Type", "application/json")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		fmt.Printf("❌ Failed: %s\n", keyword)
		return
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		fmt.Printf("❌ Read failed: %s\n", keyword)
		return
	}

	var result Response
	if err := json.Unmarshal(body, &result); err != nil {
		fmt.Printf("❌ Parse failed: %s\n", keyword)
		return
	}

	fmt.Printf("✅ Scraped: %s (%d results)\n", keyword, len(result.Data))
}

func main() {
	keywords := []string{
		"best serp api",
		"golang scraper",
		"searchcans pricing",
		"rag pipeline",
		// Add hundreds more...
	}

	var wg sync.WaitGroup
	fmt.Println("🚀 Starting concurrent scrape...")

	for _, k := range keywords {
		wg.Add(1)
		// Launch a Goroutine for every keyword
		go fetchRank(k, &wg)
	}

	wg.Wait()
	fmt.Println("🎉 All done!")
}
```
Advanced: Rate Limiting and Error Handling
While SearchCans offers unlimited concurrency, you may want to add some control:
```go
import (
	"context"
	"fmt"
	"sync"
	"time"

	"golang.org/x/time/rate"
)

func fetchRankWithRateLimit(keyword string, limiter *rate.Limiter, wg *sync.WaitGroup) {
	defer wg.Done()

	// Wait for the rate limiter before sending anything
	ctx := context.Background()
	if err := limiter.Wait(ctx); err != nil {
		fmt.Printf("Rate limit error: %v\n", err)
		return
	}

	// Make the request with a timeout
	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
	defer cancel()

	// ... rest of the request code
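	// Note: attach ctx to the request so the 30-second deadline actually applies,
	// e.g. req, err := http.NewRequestWithContext(ctx, "POST", apiEndpoint, body)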
}

func main() {
	// Allow 100 requests per second, with bursts of up to 100
	limiter := rate.NewLimiter(100, 100)

	var wg sync.WaitGroup
	for _, keyword := range keywords { // keywords slice from the first example
		wg.Add(1)
		go fetchRankWithRateLimit(keyword, limiter, &wg)
	}
	wg.Wait()
}
```
Building a Production Scraper
1. Worker Pool Pattern
For better resource management:
```go
type Job struct {
	Keyword string
}

// SearchResult mirrors one entry of the API response shown earlier
type SearchResult struct {
	Title string `json:"title"`
	URL   string `json:"url"`
	Rank  int    `json:"rank"`
}

type Result struct {
	Keyword string
	Data    []SearchResult
	Error   error
}

func worker(id int, jobs <-chan Job, results chan<- Result) {
	for job := range jobs {
		fmt.Printf("Worker %d processing %s\n", id, job.Keyword)
		// Make the API call (fetchData wraps the request logic from fetchRank)
		data, err := fetchData(job.Keyword)
		results <- Result{
			Keyword: job.Keyword,
			Data:    data,
			Error:   err,
		}
	}
}

func main() {
	numWorkers := 50
	jobs := make(chan Job, 100)
	results := make(chan Result, 100)

	// Start workers
	for w := 1; w <= numWorkers; w++ {
		go worker(w, jobs, results)
	}

	// Send jobs
	go func() {
		for _, keyword := range keywords {
			jobs <- Job{Keyword: keyword}
		}
		close(jobs)
	}()

	// Collect results
	for range keywords {
		result := <-results
		if result.Error != nil {
			fmt.Printf("Error for %s: %v\n", result.Keyword, result.Error)
		} else {
			processData(result.Data)
		}
	}
}
```
2. Retry Logic
```go
func fetchWithRetry(keyword string, maxRetries int) (*Response, error) {
	var lastErr error
	for i := 0; i < maxRetries; i++ {
		resp, err := fetch(keyword)
		if err == nil {
			return resp, nil
		}
		lastErr = err
		// Linear backoff: wait 1s, then 2s, then 3s, ...
		backoff := time.Duration(i+1) * time.Second
		time.Sleep(backoff)
	}
	return nil, fmt.Errorf("failed after %d retries: %v", maxRetries, lastErr)
}
```
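The backoff above is linear (1s, 2s, 3s). A common alternative is exponential backoff with jitter; here is a sketch using the same hypothetical fetch helper:

```go
import (
	"fmt"
	"math/rand"
	"time"
)

func fetchWithExpBackoff(keyword string, maxRetries int) (*Response, error) {
	var lastErr error
	for i := 0; i < maxRetries; i++ {
		resp, err := fetch(keyword)
		if err == nil {
			return resp, nil
		}
		lastErr = err
		// Exponential backoff (1s, 2s, 4s, ...) plus up to 500ms of random jitter
		backoff := time.Duration(1<<i)*time.Second + time.Duration(rand.Intn(500))*time.Millisecond
		time.Sleep(backoff)
	}
	return nil, fmt.Errorf("failed after %d retries: %v", maxRetries, lastErr)
}
```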
3. Result Aggregation
```go
type Stats struct {
	TotalRequests   int
	SuccessfulReqs  int
	FailedReqs      int
	AverageDuration time.Duration
	mu              sync.Mutex
}

func (s *Stats) RecordSuccess(duration time.Duration) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.TotalRequests++
	s.SuccessfulReqs++
	// Incremental running average: avg += (x - avg) / n
	s.AverageDuration += (duration - s.AverageDuration) / time.Duration(s.SuccessfulReqs)
}

func (s *Stats) RecordFailure() {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.TotalRequests++
	s.FailedReqs++
}

func (s *Stats) GetReport() string {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.TotalRequests == 0 {
		return "No requests recorded"
	}
	return fmt.Sprintf("Total: %d, Success: %d, Failed: %d, Success Rate: %.2f%%",
		s.TotalRequests,
		s.SuccessfulReqs,
		s.FailedReqs,
		float64(s.SuccessfulReqs)/float64(s.TotalRequests)*100,
	)
}
```
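To wire this in, wrap each request in a small tracker. This sketch assumes the fetchWithRetry helper from step 2:

```go
var stats Stats

func fetchRankTracked(keyword string) {
	start := time.Now()
	if _, err := fetchWithRetry(keyword, 3); err != nil {
		stats.RecordFailure()
		return
	}
	stats.RecordSuccess(time.Since(start))
}

// After the run, print stats.GetReport() for a summary line.
```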
Performance Comparison
| Feature | Colly / Chromedp | SearchCans API |
|---|---|---|
| Concurrency | Limited by CPU/Memory | Unlimited (Cloud Scale) |
| JavaScript | Hard to render | Pre-rendered |
| Anti-Bot | Must build manually | Included |
| Cost | Proxy maintenance costs | $0.56 / 1k requests |
| Setup Time | Days | Minutes |
For a detailed cost analysis, see our pricing comparison.
Real-World Benchmarks
Testing with 1,000 keywords:
- Python (Sequential): 45 minutes
- Python (Threading): 12 minutes
- Golang + SearchCans: 23 seconds
Integration with Data Storage
PostgreSQL
```go
import (
	"database/sql"

	_ "github.com/lib/pq"
)

func storeResults(db *sql.DB, keyword string, results []SearchResult) error {
	tx, err := db.Begin()
	if err != nil {
		return err
	}
	// Roll back if anything fails before Commit (a no-op after a successful Commit)
	defer tx.Rollback()

	for _, result := range results {
		_, err := tx.Exec(`
			INSERT INTO search_results (keyword, title, url, rank, date)
			VALUES ($1, $2, $3, $4, NOW())
		`, keyword, result.Title, result.URL, result.Rank)
		if err != nil {
			return err
		}
	}
	return tx.Commit()
}
```
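storeResults assumes a search_results table already exists. The exact schema isn't specified anywhere, but something like this, inferred from the INSERT columns, would work:

```go
// Run once at startup. The column types are an assumption
// inferred from the INSERT statement above.
func ensureSchema(db *sql.DB) error {
	_, err := db.Exec(`
		CREATE TABLE IF NOT EXISTS search_results (
			id      SERIAL PRIMARY KEY,
			keyword TEXT NOT NULL,
			title   TEXT,
			url     TEXT,
			rank    INT,
			date    TIMESTAMPTZ NOT NULL DEFAULT NOW()
		)`)
	return err
}
```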
Redis Cache
import "github.com/go-redis/redis/v8"
func cacheResults(rdb *redis.Client, keyword string, results []SearchResult) error {
ctx := context.Background()
data, err := json.Marshal(results)
if err != nil {
return err
}
// Cache for 1 hour
return rdb.Set(ctx, "serp:"+keyword, data, time.Hour).Err()
}
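The matching read side checks the cache before calling the API. getCachedResults is a hypothetical helper; in go-redis, a missing key surfaces as the redis.Nil error:

```go
func getCachedResults(rdb *redis.Client, keyword string) ([]SearchResult, bool) {
	ctx := context.Background()
	data, err := rdb.Get(ctx, "serp:"+keyword).Bytes()
	if err != nil {
		// redis.Nil means a cache miss; other errors also fall through to a refetch
		return nil, false
	}
	var results []SearchResult
	if err := json.Unmarshal(data, &results); err != nil {
		return nil, false
	}
	return results, true
}
```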
Building a REST API
Expose your scraper as a service:
```go
import (
	"github.com/gin-gonic/gin"
)

func main() {
	r := gin.Default()

	r.POST("/api/search", func(c *gin.Context) {
		var req struct {
			Keywords []string `json:"keywords"`
		}
		if err := c.BindJSON(&req); err != nil {
			c.JSON(400, gin.H{"error": err.Error()})
			return
		}

		results := make(map[string]interface{})
		var wg sync.WaitGroup
		mu := sync.Mutex{}

		for _, keyword := range req.Keywords {
			wg.Add(1)
			go func(kw string) {
				defer wg.Done()
				data, err := fetchData(kw)
				mu.Lock()
				if err != nil {
					results[kw] = gin.H{"error": err.Error()}
				} else {
					results[kw] = data
				}
				mu.Unlock()
			}(keyword)
		}

		wg.Wait()
		c.JSON(200, results)
	})

	r.Run(":8080")
}
```
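Calling the service from another Go program is straightforward; this sketch assumes the server above is running on localhost:8080:

```go
import (
	"bytes"
	"encoding/json"
	"net/http"
)

func queryScraper(keywords []string) (map[string]interface{}, error) {
	body, err := json.Marshal(map[string][]string{"keywords": keywords})
	if err != nil {
		return nil, err
	}

	resp, err := http.Post("http://localhost:8080/api/search", "application/json", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()

	// The server returns a keyword -> results map
	var results map[string]interface{}
	if err := json.NewDecoder(resp.Body).Decode(&results); err != nil {
		return nil, err
	}
	return results, nil
}
```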
Monitoring and Observability
```go
import (
	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
)

var (
	requestsTotal = promauto.NewCounter(prometheus.CounterOpts{
		Name: "scraper_requests_total",
		Help: "Total number of scraping requests",
	})
	requestDuration = promauto.NewHistogram(prometheus.HistogramOpts{
		Name: "scraper_request_duration_seconds",
		Help: "Request duration in seconds",
	})
)

func scrapeWithMetrics(keyword string) {
	timer := prometheus.NewTimer(requestDuration)
	defer timer.ObserveDuration()

	requestsTotal.Inc()

	// Perform scraping
	fetchData(keyword)
}
```
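The snippet above records metrics but never exposes them. The standard way is to serve the default registry over HTTP with promhttp (the port is an arbitrary choice):

```go
import (
	"net/http"

	"github.com/prometheus/client_golang/prometheus/promhttp"
)

func main() {
	// Expose all registered metrics at /metrics for Prometheus to scrape
	http.Handle("/metrics", promhttp.Handler())
	http.ListenAndServe(":2112", nil)
}
```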
For more on building scalable systems, see our guide on AI agent scaling.
Conclusion
Don’t waste Go’s potential on managing headless browsers. Use Go for what it does best: concurrency and throughput. Let SearchCans provide the data pipeline.
Scale your Golang scraper today. For other language guides, check out our Python and Node.js tutorials, or explore our full documentation.