https://github.com/gotz1480/goscavenger

A simple webscraper written in Go

webscraper webscraping



Host: GitHub
URL: https://github.com/gotz1480/goscavenger
Owner: gotz1480
License: gpl-3.0
Created: 2023-12-19T04:56:05.000Z (almost 2 years ago)
Default Branch: main
Last Pushed: 2023-12-19T06:50:33.000Z (almost 2 years ago)
Last Synced: 2025-04-04T13:13:20.267Z (7 months ago)
Topics: webscraper, webscraping
Language: Go
Homepage:
Size: 19.5 KB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE


README

# GoScavenger

This repository contains a simple web scraper written in Go. The scraper connects to a specified website over HTTPS, retrieves the HTML content, and extracts data by HTML class name, element ID, or tag.

## Features

- Connects to websites using HTTPS.

- Reads and processes HTTP response headers.

- Handles both fixed `Content-Length` and `Transfer-Encoding: chunked` responses.

- Extracts content from HTML by class name, ID, or tag.

## Files in the Repository

- `main.go`: Contains the main function that drives the web scraping process.

- `scraper.go`: Includes the `FindStringInTag`, `FindContentByID` and `FindContentByClass` functions, which are used for parsing HTML and extracting content.

## Getting Started

To use this scraper, you need Go installed on your machine. [Download and install Go](https://golang.org/dl/) if you haven't already.

### Installation

Clone the repository to your local machine:

```bash
git clone https://github.com/araujo88/GoScavenger.git
cd GoScavenger
```

### Usage

1. Open `main.go`.

2. Modify the `server` variable to specify the website you want to scrape.

3. Optionally, adjust the request headers according to your requirements.

4. Run the scraper:

```bash
go run .
```

The output will be printed to the console.

## Contributing

Contributions to improve this simple web scraper are welcome. Feel free to fork the repository and submit pull requests.

## License

This project is licensed under the GNU General Public License v3.0 (GPL-3.0). See the [LICENSE](LICENSE) file for details.
