https://github.com/bpolaszek/bentools-etl
PHP ETL (Extract / Transform / Load) library with SOLID principles + almost no dependency.
- Host: GitHub
- URL: https://github.com/bpolaszek/bentools-etl
- Owner: bpolaszek
- License: mit
- Created: 2016-02-11T17:06:37.000Z (about 10 years ago)
- Default Branch: master
- Last Pushed: 2023-12-28T17:16:03.000Z (over 2 years ago)
- Last Synced: 2024-04-27T11:12:05.045Z (almost 2 years ago)
- Topics: callable, etl, export, extract, extractor, import, input, invoke, load, loader, loop, output, pattern, php, transform, transformer
- Language: PHP
- Homepage:
- Size: 333 KB
- Stars: 120
- Watchers: 6
- Forks: 4
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
Okay, so you've heard about the [Extract / Transform / Load](https://en.wikipedia.org/wiki/Extract,_transform,_load) pattern,
and you're looking for a PHP library to do the job. Alright, let's go!
`bentools/etl` is a versatile PHP library for implementing the Extract, Transform, Load (ETL) pattern, designed to streamline data processing tasks.
Table of Contents
-----------------
- [Concepts](#concepts)
- [Installation](#installation)
- [Getting started](#usage)
- [The EtlState object](doc/getting-started.md#the-etlstate-object)
- [Skipping items](doc/getting-started.md#skipping-items)
- [Stopping the workflow](doc/getting-started.md#stopping-the-workflow)
- [Using events](doc/getting-started.md#using-events)
- [Flush frequency and early flushes](doc/getting-started.md#flush-frequency-and-early-flushes)
- [Advanced Usage](doc/advanced_usage.md)
- [Creating your own Extractor / Transformers / Loaders](doc/advanced_usage.md#creating-your-own-extractor--transformers--loaders)
- [Difference between yield and return in transformers](doc/advanced_usage.md#difference-between-yield-and-return-in-transformers)
- [Next tick](doc/advanced_usage.md#next-tick)
- [Chaining extractors / transformers / loaders](doc/advanced_usage.md#chaining-extractors--transformers--loaders)
- [Reading from STDIN / Writing to STDOUT](doc/advanced_usage.md#reading-from-stdin--writing-to-stdout)
- [Instantiators](doc/advanced_usage.md#instantiators)
- [Using ReactPHP](doc/advanced_usage.md#using-reactphp-experimental)
- [Recipes](doc/recipes.md)
- [Contributing](#contribute)
- [License](#license)
Concepts
--------
Let's cover the basic concepts:
- **Extract**: you have a source of data (a database, a CSV file, whatever). An **extractor** reads that data and provides an iterator of items.
- **Transform**: apply a transformation to each item. A **transformer** may generate 0, 1 or several items to **load** (for example, 1 item may generate multiple SQL queries).
- **Load**: load the transformed items into the destination. For example, if **extracted items** have been **transformed** into SQL queries, your **loader** will run those queries against your database.
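To make the three roles concrete, here is a minimal sketch in plain PHP that does not use this library at all. The function names (`extractUsers`, `transformToQueries`, `runPipeline`), the `user_role` table and the sample data are all hypothetical; the point is only to show an extractor yielding items, a transformer producing 0, 1 or several items per input, and a loader collecting the results.

```php
<?php

// Extract: expose the source as an iterator of items.
// A hard-coded generator stands in for a database or CSV file (assumption).
function extractUsers(): iterable
{
    yield ['id' => 1, 'name' => 'Bob', 'roles' => ['admin', 'editor']];
    yield ['id' => 2, 'name' => 'Amy', 'roles' => []];
}

// Transform: one input item may produce 0, 1 or several items to load.
// Here, each role becomes one SQL query, so Amy produces none.
function transformToQueries(array $user): iterable
{
    foreach ($user['roles'] as $role) {
        yield sprintf(
            "INSERT INTO user_role (user_id, role) VALUES (%d, '%s')",
            $user['id'],
            $role
        );
    }
}

// Load: send every transformed item to the destination.
// An in-memory array stands in for a real database connection (assumption).
function runPipeline(): array
{
    $destination = [];
    foreach (extractUsers() as $user) {
        foreach (transformToQueries($user) as $query) {
            $destination[] = $query;
        }
    }

    return $destination;
}

$queries = runPipeline();
```

In this sketch the first user yields two queries and the second yields none, which is the "0, 1 or several" behaviour described above. The library wraps the same loop in reusable, composable components.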
Installation
------------
```bash
composer require bentools/etl
```
> [!WARNING]
> Current version (4.0) is a complete redesign and introduces significant BC (backward compatibility) breaks.
> Avoid upgrading from `^2.0` or `^3.0` unless you're fully aware of the changes.
Usage
-----
Now let's have a look at how simple it is:
```php
use BenTools\ETL\EtlExecutor;
// Given
$singers = ['Bob Marley', 'Amy Winehouse'];
// Transform each singer's name to uppercase and process the array
$etl = (new EtlExecutor())
    ->transformWith(fn (string $name) => strtoupper($name));
// When
$report = $etl->process($singers);
// Then
var_dump($report->output); // ["BOB MARLEY", "AMY WINEHOUSE"]
```
OK, that wasn't really hard. Here we basically don't have to _extract_ anything (we can already iterate on `$singers`),
and we're not _loading_ anywhere, except into PHP's memory.
You may ask, "why don't you just `array_map('strtoupper', $singers)`?" and you're totally right.
But sometimes, extracting, transforming and/or loading gets a little more complex.
You may want to extract from a file or from crawled web content, perform one-to-many transformations, maybe skip some items,
or reuse some extraction, transformation or loading logic.
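As an illustration of reusable transformation logic, here is a small sketch of an invokable class in plain PHP. `FullNameTransformer` is a hypothetical name and not part of this library. The examples in this README pass closures to `transformWith()`; whether it accepts any callable, such as the object below, is an assumption, so the sketch only exercises it with `array_map()`.

```php
<?php

// Hypothetical reusable transformer: an object with __invoke() is a regular PHP callable.
final class FullNameTransformer
{
    public function __construct(private string $separator = ' ')
    {
    }

    // Joins all values of the item (e.g. first and last name) into one string.
    public function __invoke(array $item): string
    {
        return implode($this->separator, array_values($item));
    }
}

$toFullName = new FullNameTransformer();

// The same instance can be reused wherever a callable is accepted, e.g. array_map().
$names = array_map($toFullName, [
    ['firstName' => 'Barack', 'lastName' => 'Obama'],
    ['firstName' => 'Joe', 'lastName' => 'Biden'],
]);
```

Keeping the logic in a class rather than an inline closure lets you configure it (here, the separator) and unit-test it on its own.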
Here's another example of what you can do:
```php
use BenTools\ETL\EventDispatcher\Event\TransformEvent;
use BenTools\ETL\Loader\JSONLoader;
use function BenTools\ETL\extractFrom;
$executor = extractFrom(function () {
    yield ['firstName' => 'Barack', 'lastName' => 'Obama'];
    yield ['firstName' => 'Donald', 'lastName' => 'Trump'];
    yield ['firstName' => 'Joe', 'lastName' => 'Biden'];
})
    ->transformWith(fn (array $item) => implode(' ', array_values($item)))
    ->loadInto(new JSONLoader())
    ->onTransform(function (TransformEvent $event) {
        if ('Donald Trump' === $event->transformResult->value) {
            $event->state->skip();
        }
    });
$report = $executor->process();
var_dump($report->output); // string '["Barack Obama", "Joe Biden"]'
```
Or:
```php
$report = $executor->process(destination: 'file:///tmp/presidents.json');
var_dump($report->output); // string 'file:///tmp/presidents.json' - content has been written here
```
You get the point. Now it's up to you to write your own workflows!
Continue reading the [Getting Started Guide](doc/getting-started.md).
Contribute
----------
Contributions are welcome! Don't hesitate to suggest recipes.
This library is 100% covered with [Pest](https://pestphp.com) tests.
Please make sure to run the tests with the command below and maintain code coverage before submitting PRs.
```bash
composer ci:check
```
License
-------
MIT.