https://github.com/gregpriday/laravel-scraper

Host: GitHub
URL: https://github.com/gregpriday/laravel-scraper
Owner: gregpriday
License: mit
Created: 2024-06-22T05:34:17.000Z (almost 2 years ago)
Default Branch: main
Last Pushed: 2024-06-25T16:40:37.000Z (almost 2 years ago)
Last Synced: 2025-02-13T16:36:22.876Z (over 1 year ago)
Language: PHP
Size: 43.9 KB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Funding: .github/FUNDING.yml
- License: LICENSE.md

# Laravel Scraper

Laravel Scraper is a web scraping package for Laravel applications. It provides a unified interface to multiple scraping services, so you can switch between scraping APIs or combine them as your needs change.

## Installation

Install the package via Composer:

```bash
composer require gregpriday/laravel-scraper
```

## Configuration

After installation, publish the configuration file:

```bash
php artisan vendor:publish --provider="GregPriday\Scraper\ScraperServiceProvider"
```

This creates a `config/scraper.php` file where you can configure your scraping services.

## Configuring Scraping Services

Laravel Scraper supports multiple scraping services. Here's how to configure some popular ones:

### ScrapingBee

1. Sign up for an account at [ScrapingBee](https://www.scrapingbee.com/)

2. Get your API key from the dashboard

3. Add the following to your `.env` file:

```
SCRAPINGBEE_API_KEY=your_api_key_here
```

### Zyte (formerly Scrapinghub)

1. Create an account at [Zyte](https://www.zyte.com/)

2. Obtain your API key

3. Add the following to your `.env` file:

```
ZYTE_API_KEY=your_api_key_here
```

## Basic Usage

Here's how to make a basic scraping request:

```php
use GregPriday\Scraper\Facades\Scraper;

$response = Scraper::get('https://example.com');

$content = $response->getBody();
$statusCode = $response->getStatusCode();
$headers = $response->getHeaders();
```

## Using with Spatie Crawler

Laravel Scraper can be integrated with [Spatie's Crawler](https://github.com/spatie/crawler). Here's a quick example:

```php
use Spatie\Crawler\Crawler;
use GregPriday\Scraper\Facades\Scraper;

Crawler::create()
    // Pass an observer instance; YourCrawlObserver is a placeholder for your own class
    ->setCrawlObserver(new YourCrawlObserver())
    ->setClient(Scraper::getClient())
    ->startCrawling('https://example.com');
```

This configures the crawler to send all requests through Laravel Scraper, so each crawl benefits from its multi-service capabilities and automatic retries.

## Advanced Configuration

You can add or modify scraping services in the `config/scraper.php` file. Each service can have its own configuration and priority:

```php
return [
    'scrapers' => [
        'scrapingbee' => [
            'driver' => 'scrapingbee',
            'api_key' => env('SCRAPINGBEE_API_KEY'),
            'priority' => 10,
        ],
        'zyte' => [
            'driver' => 'zyte',
            'api_key' => env('ZYTE_API_KEY'),
            'priority' => 20,
        ],
        // Add more scrapers here
    ],
];
```

The `priority` value determines the order in which scrapers are attempted; lower numbers are tried first.
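
To make the ordering rule concrete, here is a minimal sketch in plain PHP that sorts a `scrapers` array like the one above by ascending `priority`. It only illustrates the rule stated here: the helper name `orderScrapersByPriority` is hypothetical and not part of the package, whose internal ordering logic may differ.

```php
<?php

/**
 * Hypothetical helper (not part of the package): returns scraper names
 * ordered by ascending priority, so lower numbers come first.
 *
 * @param array<string, array<string, mixed>> $scrapers
 * @return string[]
 */
function orderScrapersByPriority(array $scrapers): array
{
    // uasort keeps the scraper names as keys while sorting by priority.
    uasort($scrapers, fn (array $a, array $b): int => $a['priority'] <=> $b['priority']);

    return array_keys($scrapers);
}

$scrapers = [
    'zyte' => ['driver' => 'zyte', 'priority' => 20],
    'scrapingbee' => ['driver' => 'scrapingbee', 'priority' => 10],
];

print_r(orderScrapersByPriority($scrapers));
```

With these priorities, `scrapingbee` (10) is listed before `zyte` (20), regardless of the order in which the entries appear in the array.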

## Creating Custom Scrapers

You can create custom scrapers by implementing the `ScraperInterface` or extending the `AbstractScraper` class.
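
This README does not show the methods that `ScraperInterface` declares, so the sketch below defines a stand-in interface purely for illustration. `ExampleScraperInterface`, its `fetch` method, and the `ArrayScraper` class are all assumptions, not the package's API; check the package source for the real contract before implementing it. The sketch assumes PHP 8.0 or later for constructor property promotion.

```php
<?php

// Stand-in for illustration only; the package's real ScraperInterface
// may declare different methods and signatures.
interface ExampleScraperInterface
{
    public function fetch(string $url): string;
}

// Hypothetical custom scraper that serves pages from an in-memory array,
// which can be useful as a test double.
final class ArrayScraper implements ExampleScraperInterface
{
    /** @param array<string, string> $pages Map of URL to HTML body */
    public function __construct(private array $pages)
    {
    }

    public function fetch(string $url): string
    {
        if (!array_key_exists($url, $this->pages)) {
            // Signal failure so a caller can fall back to another scraper.
            throw new RuntimeException("No page stored for {$url}");
        }

        return $this->pages[$url];
    }
}

$scraper = new ArrayScraper(['https://example.com' => '<h1>Example</h1>']);
echo $scraper->fetch('https://example.com'), PHP_EOL;
```

The general shape, one class per service behind a shared contract, is what lets the package swap services without changing calling code.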

## Error Handling

Laravel Scraper will automatically try the next scraper in the stack if one fails. You can catch exceptions at the application level:

```php
use GregPriday\Scraper\Exceptions\ScraperException;
use GregPriday\Scraper\Facades\Scraper;

try {
    $response = Scraper::get('https://example.com');
} catch (ScraperException $e) {
    // Handle the exception
}
```
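
The fallback behaviour described above can be pictured with a small, self-contained sketch. This is not the package's implementation: `fetchWithFallback` and the closures standing in for scraping services are hypothetical, and a plain `RuntimeException` stands in for `ScraperException`. It only shows the general pattern of trying each service in order and throwing once all of them have failed.

```php
<?php

/**
 * Hypothetical illustration of a fallback chain (not the package's code).
 *
 * @param array<string, callable> $scrapers Callables taking a URL, already ordered by priority.
 */
function fetchWithFallback(array $scrapers, string $url): string
{
    $errors = [];

    foreach ($scrapers as $name => $scraper) {
        try {
            // The first scraper that returns without throwing wins.
            return $scraper($url);
        } catch (Throwable $e) {
            // Record why this scraper failed, then try the next one.
            $errors[] = "{$name}: {$e->getMessage()}";
        }
    }

    // Every scraper failed; surface all the reasons in one exception.
    throw new RuntimeException('All scrapers failed: ' . implode('; ', $errors));
}

$scrapers = [
    'first' => function (string $url): string {
        throw new RuntimeException('rate limited');
    },
    'second' => fn (string $url): string => "<html>content of {$url}</html>",
];

echo fetchWithFallback($scrapers, 'https://example.com'), PHP_EOL;
```

Here the `first` stand-in always fails, so the result comes from `second`. The calling code only sees an exception when every entry in the list has thrown, which mirrors why a single `catch` at the application level is enough.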

## License

The Laravel Scraper package is open-sourced software licensed under the [MIT license](https://opensource.org/licenses/MIT).
