Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/zahidhasann88/go-web-scrapper
A web scraping API using Golang with Gin and ChromeDP for dynamic site scraping.
chromedp crawling gin golang scrapper scrapping
- Host: GitHub
- URL: https://github.com/zahidhasann88/go-web-scrapper
- Owner: zahidhasann88
- Created: 2024-07-12T12:07:40.000Z (6 months ago)
- Default Branch: main
- Last Pushed: 2024-07-12T13:49:35.000Z (6 months ago)
- Last Synced: 2024-11-06T01:50:10.414Z (about 2 months ago)
- Topics: chromedp, crawling, gin, golang, scrapper, scrapping
- Language: Go
- Homepage:
- Size: 14.6 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Go Web Scraper
This project demonstrates a web scraping API using Golang with Gin and ChromeDP for dynamic site scraping.
## Setup
### Prerequisites
- Go (version 1.16+ recommended)
- Git
- Chrome or Chromium browser (for ChromeDP scraper)
### Installation
1. Clone the repository:
```bash
git clone https://github.com/zahidhasann88/go-web-scrapper.git
cd go-web-scrapper
```
2. Install dependencies:
```bash
go mod tidy
```
3. Run the application:
```bash
go run main.go
```
## Usage
### API Endpoint
- **Endpoint:** `POST /scrape`
- **Description:** Scrapes a website using ChromeDP or Colly based on the `useChromedp` flag.
### Example Request
- **POST** - http://localhost:8080/scrape
```json
{
  "url": "https://executivemachines.com",
  "format": "json",
  "filename": "scraped_data.json",
  "useChromedp": true
}
```
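
The request above can also be sent from Go. The following is a minimal client sketch, assuming the server is running locally on port 8080 as in the example. The `ScrapeRequest` type and the `buildPayload` and `postScrape` helpers are hypothetical names introduced here for illustration, not part of the project. The README does not document the response format, so the sketch only prints the status code and the raw body.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

// ScrapeRequest mirrors the fields of the example request body.
// The type name is an assumption; only the JSON keys come from the README.
type ScrapeRequest struct {
	URL         string `json:"url"`
	Format      string `json:"format"`
	Filename    string `json:"filename"`
	UseChromedp bool   `json:"useChromedp"`
}

// buildPayload serializes the request into the JSON body sent to the API.
func buildPayload(req ScrapeRequest) ([]byte, error) {
	return json.Marshal(req)
}

// postScrape sends the request to POST /scrape and returns the HTTP status
// code and the raw response body without interpreting it.
func postScrape(baseURL string, req ScrapeRequest) (int, string, error) {
	payload, err := buildPayload(req)
	if err != nil {
		return 0, "", err
	}
	resp, err := http.Post(baseURL+"/scrape", "application/json", bytes.NewReader(payload))
	if err != nil {
		return 0, "", err
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return resp.StatusCode, "", err
	}
	return resp.StatusCode, string(body), nil
}

func main() {
	status, body, err := postScrape("http://localhost:8080", ScrapeRequest{
		URL:         "https://executivemachines.com",
		Format:      "json",
		Filename:    "scraped_data.json",
		UseChromedp: true,
	})
	if err != nil {
		// Typically means the server is not running yet.
		fmt.Println("request failed:", err)
		return
	}
	fmt.Println(status, body)
}
```

Setting `UseChromedp` to `false` in the same call would exercise the Colly path instead.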
## Technologies Used
This project utilizes the following technologies:
- **Gin** - Web framework for building APIs in Golang.
- **ChromeDP** - Go library that drives Chrome or Chromium over the Chrome DevTools Protocol, used for browser automation and scraping dynamic pages.
- **Colly** - Golang-based web scraping framework for extracting data from websites.