https://github.com/webfactory/slimdump

A tool for creating configurable dumps of large MySQL databases.

![webfactory Logo](https://www.webfactory.de/bundles/webfactorytwiglayout/img/logo.png) slimdump
========

[![Build Status](https://github.com/webfactory/slimdump/workflows/Run%20Tests/badge.svg)](https://github.com/webfactory/slimdump/actions)
[![Coverage Status](https://coveralls.io/repos/webfactory/slimdump/badge.svg?branch=master&service=github)](https://coveralls.io/github/webfactory/slimdump?branch=master)
![](https://github.com/webfactory/slimdump/workflows/AllDependenciesDeclared/badge.svg)

`slimdump` is a little tool to help you create configurable dumps of large MySQL databases. It works off one or several configuration files. For every table you specify, it can dump only the schema (`CREATE TABLE ...` statement), the full table data, the data without BLOBs, and more.

## Why?

We created `slimdump` because we often need to dump parts of MySQL databases in a convenient and reproducible way. Also, when you need to analyze problems with data from your production databases, you might want to pull only relevant parts of data and hide personal data (user names, for example).

`mysqldump` is a great tool, probably much more proven when it comes to edge cases, and it has a lot of switches. But it offers no easy way to write a simple configuration file that describes a particular type of dump (e.g. a subset of your tables) and share it with your co-workers, let alone to dump tables while omitting BLOB columns.

## Installation

If PHP is your everyday programming language, you probably have [Composer](https://getcomposer.org) installed. You can then install `slimdump` as a [global package](https://getcomposer.org/doc/03-cli.md#global) by running `composer global require webfactory/slimdump`. To use it like any other Unix command, make sure `$COMPOSER_HOME/vendor/bin` is in your `$PATH`.
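
In a shell session, the global installation might look like the sketch below. The function names are made up for illustration. Instead of hard-coding `$COMPOSER_HOME/vendor/bin`, the sketch asks Composer for its global bin directory with `composer global config bin-dir --absolute`.

```shell
# Hypothetical helpers wrapping the Composer commands described above.

# Install slimdump as a global Composer package.
install_slimdump() {
    composer global require webfactory/slimdump
}

# Prepend Composer's global bin directory to PATH for the current shell.
add_composer_bin_to_path() {
    local bin_dir
    bin_dir="$(composer global config bin-dir --absolute 2>/dev/null)" || return 1
    PATH="${bin_dir}:${PATH}"
    export PATH
}
```

To make the `PATH` change permanent, the same `export` would have to go into your shell's startup file.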

Of course, you can also add `slimdump` as a local (per-project) Composer dependency.

We're also working on providing a `.phar` package of `slimdump` for those not using PHP regularly. With that solution, all you need is to have the PHP interpreter installed and to download a single archive file to use `slimdump`. You can help us and open a pull request for that :-)!

## Usage
`slimdump` needs the DSN for the database to dump and one or more config files:

`slimdump {DSN} {config-file} [...more config files...]`

`slimdump` writes to STDOUT. If you want your dump written to a file, just redirect the output:

`slimdump {DSN} {config-file} > dump.sql`

If you want to use an environment variable for the DSN, replace the first parameter with `-`:

`MYSQL_DSN={DSN} slimdump - {config file(s)}`

The DSN has to be in the following format:

`mysql://[user[:password]@]host[:port]/dbname[?charset=utf8mb4]`

For further explanation, have a look at the [Doctrine documentation](https://www.doctrine-project.org/projects/doctrine-dbal/en/current/reference/configuration.html#connecting-using-a-url).
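
As a rough sketch, the DSN can be assembled from its parts with a small shell helper. The function name `build_dsn` and the sample values are hypothetical; only the URL layout comes from the format above. The sketch does not URL-encode anything, so user names or passwords containing characters such as `@`, `:` or `/` would need extra care.

```shell
# Hypothetical helper: assemble a DSN in the format shown above.
# Usage: build_dsn USER PASSWORD HOST PORT DBNAME [CHARSET]
# Pass an empty string for any optional part you want to leave out.
build_dsn() {
    local user="${1:-}" password="${2:-}" host="${3:-}"
    local port="${4:-}" dbname="${5:-}" charset="${6:-}"
    local dsn="mysql://"

    if [ -n "$user" ]; then
        dsn="${dsn}${user}"
        if [ -n "$password" ]; then dsn="${dsn}:${password}"; fi
        dsn="${dsn}@"
    fi
    dsn="${dsn}${host}"
    if [ -n "$port" ]; then dsn="${dsn}:${port}"; fi
    dsn="${dsn}/${dbname}"
    if [ -n "$charset" ]; then dsn="${dsn}?charset=${charset}"; fi

    printf '%s\n' "$dsn"
}

# Possible use together with the environment variable form:
#   MYSQL_DSN="$(build_dsn app secret localhost 3306 shop utf8mb4)" slimdump - config.xml > dump.sql
```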

### Optional parameters and command line switches

#### no-progress

This turns off printing some progress information on `stderr`. Useful in scripting contexts.

Example:
`slimdump --no-progress {DSN} {config-file}`
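
In a script or cron job, `--no-progress` combines naturally with the STDOUT redirection described above. The following is a minimal sketch. The wrapper name `dump_compressed` and the use of `gzip` are illustrative choices, not part of `slimdump`.

```shell
# Hypothetical wrapper: write a gzip-compressed dump without progress output.
# Usage: dump_compressed DSN CONFIG_FILE OUTPUT_FILE
dump_compressed() {
    local dsn="$1" config="$2" output="$3"
    # Note: the pipeline returns gzip's exit status. In bash, enable
    # `set -o pipefail` if a slimdump failure should make this fail too.
    slimdump --no-progress "$dsn" "$config" | gzip > "$output"
}
```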

#### buffer-size

You can also specify the buffer size, which can be useful in shared environments where your `max_allowed_packet` is low.
Do this with the optional CLI option `buffer-size`. You can add a suffix (KB, MB or GB) to the value for better readability.

Example:
`slimdump --buffer-size=16MB {DSN} {config-file}`

#### single-line-insert-statements

If you have tables with a large number of rows to dump and you are not planning to keep your dumps under version
control, consider writing each `INSERT INTO` statement on a single line instead of one line per row. You can
do this with the CLI option `single-line-insert-statements`. This can speed up the import significantly.

Example:
`slimdump --single-line-insert-statements {DSN} {config-file}`

#### output-csv

This option turns on the CSV (comma-separated values) output mode. It must be given the path to a directory where the `.csv` files will be created. The files are named after the tables, e.g. `my_table.csv`.

CSV files contain only data. They are not created for views, triggers, or tables dumped with the `schema` dump mode. Also, no files will be created for empty tables.

Since this output format writes a separate file for each table, redirecting `stdout` (as can be done for the default SQL output) is not possible.

**Experimental Feature** CSV support is a new, [experimental feature](https://github.com/webfactory/slimdump/pull/92). The output formatting may change at any time.
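
A CSV dump could be wrapped like this. Treat it as a sketch under assumptions: the wrapper name `dump_csv` is hypothetical, the `--output-csv=DIR` spelling is assumed by analogy with `--buffer-size=16MB`, and creating the directory up front is a precaution rather than a documented requirement.

```shell
# Hypothetical wrapper: dump tables as CSV files into a directory.
# Usage: dump_csv DSN CONFIG_FILE OUTPUT_DIR
dump_csv() {
    local dsn="$1" config="$2" out_dir="$3"
    # Make sure the target directory exists before slimdump writes to it.
    mkdir -p "$out_dir" || return 1
    # Assumption: the option takes its value as --output-csv=DIR.
    slimdump --output-csv="$out_dir" "$dsn" "$config"
}
```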

## Configuration
Configuration is stored in XML files somewhere in your filesystem. As a benefit, you can add the configuration to your repository, which gives your co-workers a quick start for dumping the database.

Example:
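```xml
<?xml version="1.0" ?>
<slimdump>
  <!-- Create a full dump (schema + data) of "some_table" -->
  <table name="some_table" dump="full" />

  <!-- Dump the "media" table, omit BLOB fields. -->
  <table name="media" dump="noblob" />

  <!-- Dump the "user" table, hide names and email addresses. -->
  <table name="user" dump="full">
      <column name="username" dump="masked" />
      <column name="email" dump="masked" />
      <column name="password" dump="replace" replacement="test" />
  </table>
</slimdump>
```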

### Conditions

You may want to select only some rows. In that case you can define a condition on a table.

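```xml
<!-- Dump all users whose usernames begin with foo -->
<table name="user" dump="full" condition="`username` LIKE 'foo%'" />
```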

In this example, only users with a username starting with 'foo' are exported.

A simple way to export roughly a percentage of the users is this:

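```xml
<!-- Dump all users whose IDs are divisible by ten -->
<table name="user" dump="full" condition="id % 10 = 0" />
```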

This will export only the users with an id divisible by ten without a remainder, i.e. about 1/10th of the user rows (given
the ids are evenly distributed).
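
To get a feeling for the sampling rate, the arithmetic can be checked with a few lines of shell. The helper `count_divisible` is purely illustrative. It assumes gapless ids from 1 to N, which real tables rarely have.

```shell
# Hypothetical helper: count the ids in 1..N that are divisible by DIVISOR.
# Usage: count_divisible N DIVISOR
count_divisible() {
    local n="$1" divisor="$2" id=1 count=0
    while [ "$id" -le "$n" ]; do
        if [ $((id % divisor)) -eq 0 ]; then
            count=$((count + 1))
        fi
        id=$((id + 1))
    done
    printf '%s\n' "$count"
}

# count_divisible 1000 10 prints 100, i.e. one tenth of the rows.
```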

If you want to keep referential integrity, you might have to configure a more complex condition like this:

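```xml
<!-- Dump all users which are referenced in other tables -->
<table name="user" dump="full" condition="id IN (SELECT author_id FROM blog_posts UNION SELECT author_id FROM comments)" />
```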

In this case, we export only users that are referenced in other tables, e.g. that are authors of blog posts or comments.

### Dump modes

The following modes are supported for the `dump` attribute:

* `none` - Table is not dumped at all. Makes sense if you use broad wildcards (see below) and then want to exclude a specific table.
* `schema` - Only the table schema will be dumped
* `noblob` - Will dump a `NULL` value for BLOB fields
* `full` - Whole table will be dumped
* `masked` - Replaces all chars with "x". Mostly makes sense when applied on the column level, for example for email addresses or user names.
* `replace` - When applied on a `<column>` element, it replaces the values in this column with either a static value or a dummy value generated by [Faker](https://github.com/fzaninotto/Faker/). Useful e.g. to replace passwords with a static one or to replace personal data like first and last names with realistic-sounding dummy data.

### Wildcards
Of course, you can use wildcards for table names (`*` for multiple characters, `?` for a single character).

Example:
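```xml
<slimdump>
  <table name="*" dump="full" />
  <table name="blog_*" dump="noblob" />
  <table name="blog_author" dump="schema" />
</slimdump>
```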
This is a valid configuration. If more than one instruction matches a specific table name, the most specific one will be used. E.g. if you have definitions for `blog_*` and `blog_author`, the latter will be used for your author table, independent of their sequence order in the config.
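
To see which selectors would compete for a given table, the wildcard matching can be approximated with shell `case` patterns. This sketch assumes that `*` and `?` behave like shell globs. The helper `matching_selectors` is hypothetical and only lists the candidates. It does not reproduce the rule `slimdump` uses to pick the most specific one.

```shell
# Hypothetical helper: print every selector that matches a table name.
# Usage: matching_selectors TABLE SELECTOR...
matching_selectors() {
    local table="$1" selector
    shift
    for selector in "$@"; do
        # An unquoted variable in a case pattern is treated as a glob.
        case "$table" in
            $selector) printf '%s\n' "$selector" ;;
        esac
    done
}

# Quote the selectors so the shell does not expand them against files:
#   matching_selectors blog_author '*' 'blog_*' 'blog_author' 'shop_*'
```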

### Replacements

You probably don't want to use any personal data from your database. Therefore, slimdump allows you to replace data on
column level - a great instrument not only for General Data Protection Regulation (GDPR) compliance.

The simplest replacement is a static one:

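```xml
<table name="users" dump="full">
    <column name="password" dump="replace" replacement="test" />
</table>
```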

This replaces the password values of all users with "test" (in clear text - but surely you have [some sort of hashing in place](https://secure.php.net/manual/en/faq.passwords.php), don't you?).

To get realistic-sounding dummy data, slimdump also allows [basic Faker formatters](https://github.com/fzaninotto/Faker/#formatters).
You can use every Faker formatter that needs no arguments, as well as modifiers such as `unique` (just separate the modifier
from the formatter with an object operator (`->`), as you would do in PHP). This is especially useful if your table has a unique constraint
on a column containing personal information, like the email address.

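```xml
<table name="users" dump="full">
    <column name="username" dump="replace" replacement="FAKER_word" />
    <column name="password" dump="replace" replacement="test" />
    <column name="firstname" dump="replace" replacement="FAKER_firstName" />
    <column name="lastname" dump="replace" replacement="FAKER_lastName" />
    <column name="email" dump="replace" replacement="FAKER_unique->safeEmail" />
</table>
```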

## Other databases
Currently, only MySQL is supported. Feel free to port it to the database you need.

## Development

### Building the Phar

* Make sure [Phive](https://phar.io/) is installed
* Run `phive install` to install tools, including [Box](https://github.com/humbug/box)
* Run `composer install --no-dev` to make sure the `vendor/` folder is up to date
* Run `tools/box compile` to build `slimdump.phar`.
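
The steps above can be chained in a small script. This is a sketch: the function name `build_phar` is hypothetical, and it assumes you run it from the root of a `slimdump` checkout with Phive already installed.

```shell
# Hypothetical wrapper around the build steps listed above.
# Each step only runs if the previous one succeeded.
build_phar() {
    phive install &&
    composer install --no-dev &&
    tools/box compile
}
```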

### Tests

You can run the PHPUnit tests by calling `vendor/bin/phpunit`.

## Credits, Copyright and License

This tool was written by webfactory GmbH, Bonn, Germany. We're a software development agency with a focus on PHP (mostly [Symfony](http://github.com/symfony/symfony)). We're big fans of automation, DevOps, CI and CD, and of open source in general.

If you're a developer looking for new challenges, we'd like to hear from you! Otherwise, if this tool is useful for you, add a ⭐️.

Copyright 2014-2022 webfactory GmbH, Bonn. Code released under [the MIT license](LICENSE).