Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/qharny/dart_web_scraper
A versatile command-line web scraper built with Dart. This tool allows you to scrape web pages and save the extracted data in various formats.
- Host: GitHub
- URL: https://github.com/qharny/dart_web_scraper
- Owner: Qharny
- Created: 2024-08-04T16:25:39.000Z (5 months ago)
- Default Branch: main
- Last Pushed: 2024-08-04T17:54:22.000Z (5 months ago)
- Last Synced: 2024-08-05T19:01:35.016Z (5 months ago)
- Topics: dart, scraper, url, web-scraping
- Language: Dart
- Homepage:
- Size: 7.82 MB
- Stars: 3
- Watchers: 2
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
README
# Dart Web Scraper
A versatile command-line web scraper built with Dart. This tool allows you to scrape web pages and save the extracted data in various formats.
## Features
- Scrape paragraphs and links from any web page
- Save scraped data in TXT, CSV, or JSON format
- Command-line interface with interactive prompts
- Flexible output options (specify format and filename)
- Error handling for network issues and invalid inputs
## Prerequisites
To run this project, you need to have Dart SDK installed on your system. If you haven't installed Dart yet, follow the [official Dart installation guide](https://dart.dev/get-dart).
## Installation
1. Clone this repository:
```
git clone https://github.com/Qharny/Dart_Web_Scraper.git
cd Dart_Web_Scraper
```
2. Install dependencies:
```
dart pub get
```
## Usage
You can run the web scraper using the following command:
```
dart run bin/main.dart [options]
```
## Test
You can run the tests using the following command:
```
dart run test
```
### Options:
- `--url` or `-u`: Specify the URL to scrape
- `--format` or `-f`: Specify the output format (txt, csv, or json)
- `--output` or `-o`: Specify the output filename

If you don't provide these options, the script will prompt you to enter them interactively.
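The option handling described above could be sketched roughly as follows. This is a minimal illustration built on the `args` package listed in the Acknowledgements, not the code in `bin/main.dart`; the helper names `buildParser`, `resolveOption`, and `promptFor` are hypothetical.

```
import 'dart:io';

import 'package:args/args.dart';

/// Builds a parser for the three options listed above.
ArgParser buildParser() {
  return ArgParser()
    ..addOption('url', abbr: 'u', help: 'URL to scrape')
    ..addOption('format',
        abbr: 'f',
        allowed: ['txt', 'csv', 'json'],
        help: 'Output format')
    ..addOption('output', abbr: 'o', help: 'Output filename');
}

/// Returns the parsed value for [name], or asks for it via [ask] when the
/// option was not given on the command line.
String resolveOption(
  ArgResults results,
  String name,
  String Function(String label) ask,
) {
  final value = results[name] as String?;
  return (value == null || value.isEmpty) ? ask(name) : value;
}

/// Hypothetical prompt helper: prints a label and reads one line from stdin.
String promptFor(String label) {
  stdout.write('Enter $label: ');
  return stdin.readLineSync()?.trim() ?? '';
}

void main(List<String> arguments) {
  final results = buildParser().parse(arguments);
  final url = resolveOption(results, 'url', promptFor);
  final format = resolveOption(results, 'format', promptFor);
  final output = resolveOption(results, 'output', promptFor);
  print('Would scrape $url and write $format output to $output');
}
```

Passing the prompt function as a parameter keeps the fallback logic easy to test without reading from a real terminal.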
### Examples:
1. Scrape a website and save as TXT (with interactive prompts):
```
dart run bin/main.dart
```
2. Scrape a specific URL and save as CSV:
```
dart run bin/main.dart --url https://example.com --format csv
```
3. Scrape a website, save as JSON with a custom filename:
```
dart run bin/main.dart --url https://example.com --format json --output my_data.json
```
## Project Structure
- `bin/main.dart`: The main entry point of the application
- `lib/scraper.dart`: Contains the core scraping logic
- `lib/models/scraped_data.dart`: Defines the data model for scraped content
## Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
## License
## Acknowledgements
- [http](https://pub.dev/packages/http) package for making HTTP requests
- [html](https://pub.dev/packages/html) package for parsing HTML
- [args](https://pub.dev/packages/args) package for parsing command-line arguments
- [csv](https://pub.dev/packages/csv) package for CSV file handling
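
As a rough illustration of how two of these pieces might fit together, the sketch below parses an HTML string with the `html` package, collects paragraph text and link targets, and prints them as JSON using `dart:convert`. It is an assumption about one possible approach, not the logic in `lib/scraper.dart`. The function name `extractContent` and the output keys are hypothetical, and fetching the page over HTTP is left out so the example runs offline.

```
import 'dart:convert';

import 'package:html/parser.dart' as html_parser;

/// Extracts paragraph text and link targets from an HTML string.
Map<String, List<String>> extractContent(String html) {
  final document = html_parser.parse(html);

  // Non-empty text of every <p> element.
  final paragraphs = document
      .querySelectorAll('p')
      .map((p) => p.text.trim())
      .where((text) => text.isNotEmpty)
      .toList();

  // href of every <a> element that has one.
  final links = document
      .querySelectorAll('a')
      .map((a) => a.attributes['href'])
      .whereType<String>()
      .toList();

  return {'paragraphs': paragraphs, 'links': links};
}

void main() {
  const sample = '<p>Hello</p><a href="https://example.com">link</a>';
  print(jsonEncode(extractContent(sample)));
}
```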