https://github.com/rahulsdevloper/gosearch-search-engine-scraper
Get search results from google, bing, duckduckgo, etc easily using GoSearch
golang golang-library golang-module golang-package golang-scraper golang-search golang-searcher scraper scraper-engine search search-engine search-engine-scraper
Last synced: about 1 month ago
- Host: GitHub
- URL: https://github.com/rahulsdevloper/gosearch-search-engine-scraper
- Owner: RahulSDevloper
- Created: 2025-03-22T17:07:01.000Z (about 1 month ago)
- Default Branch: main
- Last Pushed: 2025-03-23T06:01:33.000Z (about 1 month ago)
- Last Synced: 2025-03-23T06:27:40.375Z (about 1 month ago)
- Topics: golang, golang-library, golang-module, golang-package, golang-scraper, golang-search, golang-searcher, scraper, scraper-engine, search, search-engine, search-engine-scraper
- Language: Go
- Homepage:
- Size: 8.79 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# 🔍 Search Engine Scraper - GoSearch

```
_____ _ _____ _
/ ____| | | | ___| (_)
| (___ ___ __ _ _ __ ___| |__ | |__ _ __ __ _ _ _ __ ___
\___ \ / _ \/ _` | '__/ __| '_ \| __| '_ \ / _` || | '_ \ / _ \
____) | __/ (_| | | | (__| | | | |__| | | | (_| || | | | | __/
|_____/ \___|\__,_|_| \___|_| |_\____/_| |_|\__, ||_|_| |_|\___|
_____ __/ |
/ ____| |___/
| (___ ___ _ __ __ _ _ __ ___ _ __
\___ \ / __| '__/ _` | '_ \ / _ \ '__|
____) | (__| | | (_| | |_) | __/ |
|_____/ \___|_| \__,_| .__/ \___|_|
| |
|_|
```
High-performance, anti-detection search engine scraper - Built with advanced Go concurrency patterns
✨ Features • 🚀 Install • 🔧 Usage • 🌟 Examples • 🧠 Advanced • 🐞 Debug

---
## ✨ Key Features
- Google, Bing & DuckDuckGo
- Bypass CAPTCHAs & Blocks
- Chrome-Based Scraping
- Domain, Keyword & More
- Keyword Extraction & Ad Detection
- Avoid Rate Limiting

## 🚀 Installation
Pre-built binary (Linux amd64)
```bash
# Download the latest release
curl -sSL https://github.com/RahulSDevloper/Search-Engine-Scraper---Golang/releases/download/v1.0.0/gosearch-linux-amd64 -o gosearch
chmod +x gosearch
./gosearch --query "golang programming"
```
Build from source
```bash
git clone https://github.com/RahulSDevloper/Search-Engine-Scraper---Golang.git
cd Search-Engine-Scraper---Golang
go build -ldflags="-s -w" -o gosearch
./gosearch --query "golang programming"
```
Docker
```bash
docker pull rahulsdevloper/gosearch:latest
docker run rahulsdevloper/gosearch --query "golang programming"
```

## 🔧 Usage
```
Usage: gosearch [OPTIONS] [QUERY]

Options:
  --query string            Search query
  --engine string           Search engine (google, bing, duckduckgo, all) (default "google")
  --max int                 Maximum results to fetch (default 10)
  --ads                     Include advertisements in results
  --timeout duration        Search timeout (default 30s)
  --proxy string            Proxy URL (e.g., http://user:pass@host:port)
  --headless                Use headless browser (recommended for avoiding detection)
  --lang string             Language code (default "en")
  --region string           Region code (default "us")
  --format string           Output format (json, csv, table) (default "json")
  --output string           Output file (default: stdout)
  --page int                Result page number (default 1)
  --min-words int           Minimum word count in description
  --max-words int           Maximum word count in description
  --domain string           Filter results by domain (include)
  --exclude-domain string   Filter results by domain (exclude)
  --keyword string          Filter results by keyword
  --type string             Filter by result type (organic, special, etc.)
  --site string             Limit results to specific site
  --filetype string         Limit results to specific file type
  --verbose                 Enable verbose logging
  --debug                   Enable debug mode (saves HTML responses)
  --log string              Log file path
  --stats string            Statistics output file
  --help                    Show help
```

## 🌟 Examples
Basic Search with Google 🔍
```bash
./gosearch --query "golang programming"
```
Search with Advanced Filters 🧰
```bash
./gosearch --query "machine learning" --engine bing --domain edu --format table
```
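The filter flags (`--domain`, `--keyword`, `--min-words`, `--max-words`) can be pictured as predicates applied to each scraped result. The following is a minimal sketch of that idea, not the tool's actual implementation: the `Result` struct (in particular its `Description` field), the `Filter` type, and the rule that `--domain edu` matches any host ending in `.edu` are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// Result is a hypothetical stand-in for a scraped search result.
type Result struct {
	Title       string
	URL         string
	Description string
}

// Filter mirrors a few CLI filter flags; zero values mean "no constraint".
type Filter struct {
	Domain   string // assumed semantics: "edu" keeps hosts equal to or ending in ".edu"
	Keyword  string // case-insensitive match against title and description
	MinWords int    // minimum word count in the description
	MaxWords int    // maximum word count in the description
}

// Match reports whether r passes every configured constraint.
func (f Filter) Match(r Result) bool {
	if f.Domain != "" {
		u, err := url.Parse(r.URL)
		if err != nil {
			return false
		}
		host := strings.ToLower(u.Hostname())
		d := strings.ToLower(f.Domain)
		if host != d && !strings.HasSuffix(host, "."+d) {
			return false
		}
	}
	if f.Keyword != "" {
		text := strings.ToLower(r.Title + " " + r.Description)
		if !strings.Contains(text, strings.ToLower(f.Keyword)) {
			return false
		}
	}
	words := len(strings.Fields(r.Description))
	if f.MinWords > 0 && words < f.MinWords {
		return false
	}
	if f.MaxWords > 0 && words > f.MaxWords {
		return false
	}
	return true
}

// Apply returns the results that pass the filter, preserving order.
func Apply(f Filter, in []Result) []Result {
	var out []Result
	for _, r := range in {
		if f.Match(r) {
			out = append(out, r)
		}
	}
	return out
}

func main() {
	results := []Result{
		{Title: "Intro to ML", URL: "https://cs.stanford.edu/ml", Description: "A short course on machine learning basics"},
		{Title: "ML blog", URL: "https://example.com/ml", Description: "Machine learning news"},
	}
	for _, r := range Apply(Filter{Domain: "edu", MinWords: 3}, results) {
		fmt.Println(r.Title, r.URL)
	}
}
```

Keeping each constraint optional (zero value means "off") lets the same `Filter` value represent any combination of flags the user passes.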
Multi-Engine Search with Headless Browser 🌐
```bash
./gosearch --query "climate science" --engine all --headless --output results.json
```
Filetype Specific Search 📄
```bash
./gosearch --query "research papers" --filetype pdf --site edu --max 20
```

## 🧠 Advanced Techniques
### Using as a Library
```go
package main

import (
	"context"
	"fmt"
	"time"

	"github.com/RahulSDevloper/Search-Engine-Scraper---Golang/pkg/engines"
	"github.com/RahulSDevloper/Search-Engine-Scraper---Golang/pkg/models"
)

func main() {
	// Create a new Google search engine
	engine := engines.NewGoogleSearchEngine()

	// Configure search request with optimization strategy
	request := models.SearchRequest{
		Query:       "golang concurrency patterns",
		MaxResults:  10,
		Timeout:     30 * time.Second,
		UseHeadless: true,
		Debug:       true,
	}

	// Execute search with context for cancellation
	ctx, cancel := context.WithTimeout(context.Background(), 45*time.Second)
	defer cancel()

	results, err := engine.Search(ctx, request)
	if err != nil {
		fmt.Printf("Error: %v\n", err)
		return
	}

	// Process and analyze results
	for i, result := range results {
		fmt.Printf("%d. %s\n%s\n\n", i+1, result.Title, result.URL)
	}
}
```

### Custom Rate Limiting
```yaml
# ~/.config/gosearch/config.yaml
rate_limits:
  google: 10 # requests per minute
  bing: 15
  duckduckgo: 20

proxy_rotation:
  enabled: true
  proxies:
    - http://proxy1:8080
    - http://proxy2:8080
  rotation_strategy: round-robin # or random
```

## 🐞 Debugging
### No Results Found?
If you're not getting any results, try these solutions:
1. **Use Headless Mode** to avoid detection

   ```bash
   ./gosearch --query "your search" --headless
   ```

2. **Use a Proxy** to route through a clean IP address

   ```bash
   ./gosearch --query "your search" --proxy http://your-proxy-server:port
   ```

3. **Enable Debug Mode** to examine the HTML response

   ```bash
   ./gosearch --query "your search" --debug
   ```

### Debugging Process Flow
```mermaid
graph TD
A[Run Search] --> B{Results Found?}
B -->|Yes| C[Process Results]
B -->|No| D[Enable Debug Mode]
D --> E[Check HTML Responses]
E --> F{Captcha Present?}
F -->|Yes| G[Use Headless + Proxy]
F -->|No| H[Check Selectors]
H --> I[Update Selectors]
I --> A
G --> A
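```

## 📊 Performance Benchmarks

| Engine | Results/Second | Memory Usage | Detection Avoidance |
|--------|----------------|--------------|---------------------|
| Google | 6.5 | Low | High |
| Bing | 8.2 | Low | Medium |
| DuckDuckGo | 7.3 | Low | Very High |
| All (Concurrent) | 4.8 | Medium | Medium |

## 📚 Design Philosophy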
The Search Engine Scraper follows these core principles:
1. **Resilience First**: Designed to handle the constantly changing DOM structures of search engines
2. **Performance Focused**: Optimized for speed while maintaining low resource usage
3. **Privacy Conscious**: Minimal footprint to avoid detection
4. **Developer Friendly**: Clean API for integration into other Go applications

## 📝 License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

---
⭐ Star this project if you find it useful! ⭐