https://github.com/scrapegraphai/scrapegraphai-php


# Scrapegraphai PHP API library

> [!NOTE]
> The Scrapegraphai PHP API Library is currently in **beta** and we're excited for you to experiment with it!
>
> This library has not yet been exhaustively tested in production environments and may be missing some features you'd expect in a stable release. As we continue development, there may be breaking changes that require updates to your code.
>
> **We'd love your feedback!** Please share any suggestions, bug reports, feature requests, or general thoughts by [filing an issue](https://www.github.com/stainless-sdks/scrapegraphai-php/issues/new).

The Scrapegraphai PHP library provides convenient access to the Scrapegraphai REST API from any PHP 8.1.0+ application.

It is generated with [Stainless](https://www.stainless.com/).

## Documentation

The REST API documentation can be found on [scrapegraphai.com](https://scrapegraphai.com).

## Installation

To use this package, install via Composer by adding the following to your application's `composer.json`:

```json
{
  "repositories": [
    {
      "type": "vcs",
      "url": "git@github.com:stainless-sdks/scrapegraphai-php.git"
    }
  ],
  "require": {
    "org-placeholder/scrapegraphai": "dev-main"
  }
}
```

## Usage

```php
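<?php
// Assumes $client is a configured Scrapegraphai client and $params holds the request parameters.
$completedSmartscraper = $client->smartscraper->create($params);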

var_dump($completedSmartscraper->request_id);
```

### Handling errors

When the library is unable to connect to the API, or if the API returns a non-success status code (i.e., 4xx or 5xx response), a subclass of `Scrapegraphai\Errors\APIError` will be thrown:

```php
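<?php

// Assumes the error classes share the Scrapegraphai\Errors namespace with APIError,
// and that $client and $params are already set up.
use Scrapegraphai\Errors\APIConnectionError;
use Scrapegraphai\Errors\APIStatusError;
use Scrapegraphai\Errors\RateLimitError;

try {
    $completedSmartscraper = $client->smartscraper->create($params);
} catch (APIConnectionError $e) {
    echo "The server could not be reached", PHP_EOL;
    var_dump($e->getPrevious());
} catch (RateLimitError $_) {
    echo "A 429 status code was received; we should back off a bit.", PHP_EOL;
} catch (APIStatusError $e) {
    echo "Another non-200-range status code was received", PHP_EOL;
    var_dump($e->status);
}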
```

Error codes are as follows:

| Cause | Error Type |
| ---------------- | -------------------------- |
| HTTP 400 | `BadRequestError` |
| HTTP 401 | `AuthenticationError` |
| HTTP 403 | `PermissionDeniedError` |
| HTTP 404 | `NotFoundError` |
| HTTP 409 | `ConflictError` |
| HTTP 422 | `UnprocessableEntityError` |
| HTTP 429 | `RateLimitError` |
| HTTP >= 500 | `InternalServerError` |
| Other HTTP error | `APIStatusError` |
| Timeout | `APITimeoutError` |
| Network error | `APIConnectionError` |
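
The status-code rows of the table can be read as a simple lookup. The sketch below is illustrative only: `errorTypeForStatus` is a hypothetical helper, not part of the library, and it returns the error type names as plain strings rather than instantiating the library's classes.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not part of the library): returns the error type
 * name that the table above associates with an HTTP error status code.
 */
function errorTypeForStatus(int $status): string
{
    // Every status of 500 or above maps to the same error type.
    if ($status >= 500) {
        return 'InternalServerError';
    }

    return match ($status) {
        400 => 'BadRequestError',
        401 => 'AuthenticationError',
        403 => 'PermissionDeniedError',
        404 => 'NotFoundError',
        409 => 'ConflictError',
        422 => 'UnprocessableEntityError',
        429 => 'RateLimitError',
        // Any other HTTP error falls back to the generic status error.
        default => 'APIStatusError',
    };
}

echo errorTypeForStatus(429), PHP_EOL;
echo errorTypeForStatus(503), PHP_EOL;
echo errorTypeForStatus(418), PHP_EOL;
```

Timeouts and network errors are not covered here because they occur before any HTTP status code is received.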

### Retries

Certain errors will be automatically retried 2 times by default, with a short exponential backoff.

Connection errors (for example, due to a network connectivity problem), 408 Request Timeout, 409 Conflict, 429 Rate Limit, >=500 Internal errors, and timeouts will all be retried by default.

You can use the `maxRetries` request option to configure or disable this:

```php
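<?php
// Assumes $client, $params, and an imported RequestOptions class.
$completedSmartscraper = $client->smartscraper
    ->create($params, new RequestOptions(maxRetries: 5));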
```
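
The retry behaviour described in this section can be pictured with a small standalone loop. This is a sketch under stated assumptions, not the library's implementation: `isRetryableStatus`, `backoffDelay`, and `sendWithRetries` are hypothetical names, and the base delay and cap are made-up values. Connection errors and timeouts, which the library also retries, are left out to keep the example short.

```php
<?php

declare(strict_types=1);

/** Hypothetical helper: is this HTTP status in the retryable set listed above? */
function isRetryableStatus(int $status): bool
{
    return in_array($status, [408, 409, 429], true) || $status >= 500;
}

/**
 * Hypothetical helper: delay in seconds before retry number $attempt (1-based).
 * The delay doubles each time and is capped at $maxDelay (both values assumed).
 */
function backoffDelay(int $attempt, float $baseDelay = 0.5, float $maxDelay = 8.0): float
{
    return min($maxDelay, $baseDelay * (2 ** ($attempt - 1)));
}

/**
 * Calls $send (which returns an HTTP status code) and retries retryable
 * statuses up to $maxRetries times. Returns the last status code received.
 */
function sendWithRetries(callable $send, int $maxRetries = 2, bool $sleep = true): int
{
    $attempt = 0;
    while (true) {
        $status = $send();
        if (!isRetryableStatus($status) || $attempt >= $maxRetries) {
            return $status;
        }
        $attempt++;
        if ($sleep) {
            usleep((int) (backoffDelay($attempt) * 1_000_000));
        }
    }
}

// Simulated server: two retryable failures followed by a success.
$statuses = [503, 429, 200];
$calls = 0;
$final = sendWithRetries(function () use (&$statuses, &$calls): int {
    $calls++;
    return array_shift($statuses);
}, maxRetries: 2, sleep: false);

echo "final status: {$final} after {$calls} calls", PHP_EOL;
```

With two retries allowed, the simulated request succeeds on its third call. With `maxRetries: 0` the first response would be returned as-is.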

## Advanced concepts

### Making custom or undocumented requests

#### Undocumented properties

You can send undocumented parameters to any endpoint, and read undocumented response properties, like so:

Note: an `extra` parameter overrides a documented parameter of the same name.

```php
<?php

// Assumes $client, $params, and an imported RequestOptions class.
$completedSmartscraper = $client->smartscraper->create(
    $params,
    new RequestOptions(
        extraQueryParams: ["my_query_parameter" => "value"],
        extraBodyParams: ["my_body_parameter" => "value"],
        extraHeaders: ["my-header" => "value"],
    ),
);

var_dump($completedSmartscraper["my_undocumented_property"]);
```
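
The override rule for same-named parameters can be illustrated with plain arrays. This sketch does not show how the library merges parameters internally; `mergeParams` and the keys used here are hypothetical and only demonstrate "the extra value wins".

```php
<?php

declare(strict_types=1);

/**
 * Illustrative only: merges documented parameters with extra ones so that
 * an extra parameter wins when both use the same key.
 *
 * @param array<string, mixed> $documented
 * @param array<string, mixed> $extra
 * @return array<string, mixed>
 */
function mergeParams(array $documented, array $extra): array
{
    // With string keys, array_merge keeps the value from the later array.
    return array_merge($documented, $extra);
}

$body = mergeParams(
    ['documented_parameter' => 'kept', 'my_body_parameter' => 'documented'],
    ['my_body_parameter' => 'value'],
);

var_dump($body);
```

Here `my_body_parameter` ends up as `"value"`, while `documented_parameter` is left untouched.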

#### Undocumented request params

To send an extra parameter explicitly, pass `extraQueryParams`, `extraBodyParams`, or `extraHeaders` to `RequestOptions` when making a request, as in the example above.

#### Undocumented endpoints

To make requests to undocumented endpoints while retaining the benefits of auth, retries, and so on, you can make requests using `$client->request`, like so:

```php
<?php

// Assumes $client is a configured Scrapegraphai client.
$response = $client->request(
    method: "post",
    path: '/undocumented/endpoint',
    query: ['dog' => 'woof'],
    headers: ['useful-header' => 'interesting-value'],
    body: ['hello' => 'world'],
);
```

## Examples

The `examples/` directory contains comprehensive examples demonstrating various use cases:

### Basic Examples
- **[SmartScraper](examples/basic/basic-smartscraper.php)** - Extract data from web pages using natural language prompts
- **[Markdownify](examples/basic/basic-markdownify.php)** - Convert web pages to clean Markdown format
- **[SearchScraper](examples/basic/basic-searchscraper.php)** - Search and scrape data from multiple websites
- **[Crawl](examples/basic/basic-crawl.php)** - Systematically crawl entire websites
- **[Generate Schema](examples/basic/basic-schema.php)** - Generate JSON schemas from natural language
- **[Credits](examples/basic/basic-credits.php)** - Check your API credit balance
- **[Validate](examples/basic/basic-validate.php)** - Validate your API key

### Advanced Examples
- **[Advanced SmartScraper](examples/advanced/advanced-smartscraper.php)** - Complex schemas, JavaScript rendering, pagination
- **[Error Handling](examples/advanced/advanced-error-handling.php)** - Comprehensive error handling strategies

### Real-World Use Cases
- **[E-commerce Scraper](examples/use-cases/ecommerce-scraper.php)** - Product monitoring, price comparison, review analysis
- **[News Aggregator](examples/use-cases/news-aggregator.php)** - Multi-source news monitoring, sentiment analysis
- **[Job Listings](examples/use-cases/job-listings.php)** - Job search aggregation, salary benchmarking, skills analysis

### Quick Start Example

```php
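<?php
// Assumes $client is a configured Scrapegraphai client and $params holds the request parameters.
$result = $client->smartscraper->create($params);
echo json_encode($result->data, JSON_PRETTY_PRINT);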
```

For more examples and detailed documentation, see the [examples directory](examples/).

## Versioning

This package follows [SemVer](https://semver.org/spec/v2.0.0.html) conventions. As the library is in initial development and has a major version of `0`, APIs may change at any time.

This package considers improvements to the (non-runtime) PHPDoc type definitions to be non-breaking changes.

## Requirements

PHP 8.1.0 or higher.
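
If you want to fail early on an unsupported runtime, a check along these lines can go at the top of a script. It is a minimal sketch; `meetsMinimumPhp` is a hypothetical helper and not something the library provides.

```php
<?php

declare(strict_types=1);

/** Hypothetical helper: does $current satisfy the minimum PHP version? */
function meetsMinimumPhp(string $current, string $minimum = '8.1.0'): bool
{
    return version_compare($current, $minimum, '>=');
}

if (!meetsMinimumPhp(PHP_VERSION)) {
    fwrite(STDERR, 'This library needs PHP 8.1.0 or higher; found ' . PHP_VERSION . PHP_EOL);
    exit(1);
}
```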

## Contributing

See [the contributing documentation](https://github.com/stainless-sdks/scrapegraphai-php/tree/main/CONTRIBUTING.md).