# OSArchiver: OpenStack databases archiver

OSArchiver is a Python package that archives and removes soft-deleted data from OpenStack databases.
The package ships with a main script called `osarchiver` that reads a configuration file and runs the archivers.

# Philosophy

* OSArchiver has no knowledge of OpenStack business objects.
* OSArchiver relies purely on the common way OpenStack marks data as deleted: setting the 'deleted_at' column to a datetime.
  This means a row is archivable/removable if its 'deleted_at' column is not NULL.
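
The rule above can be illustrated with a minimal sketch. It uses Python's built-in `sqlite3` module as a stand-in for MySQL/MariaDB, and the `instances` table, its rows and the `archivable_ids` helper are hypothetical; this is not OSArchiver's own code.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for an OpenStack database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE instances (id INTEGER PRIMARY KEY, deleted_at TEXT)")
conn.executemany(
    "INSERT INTO instances (id, deleted_at) VALUES (?, ?)",
    [
        (1, None),                   # live row: must be left alone
        (2, "2020-01-15 10:00:00"),  # soft-deleted row: archivable
        (3, "2021-06-30 08:30:00"),  # soft-deleted row: archivable
    ],
)


def archivable_ids(connection, table, deleted_column="deleted_at"):
    """Return the ids of rows whose soft-delete column is set."""
    # Identifiers cannot be bound as parameters, so this sketch assumes
    # `table` and `deleted_column` come from trusted configuration.
    query = "SELECT id FROM {} WHERE {} IS NOT NULL ORDER BY id".format(
        table, deleted_column
    )
    return [row[0] for row in connection.execute(query)]


print(archivable_ids(conn, "instances"))
```

Only the rows with a non-NULL `deleted_at` are selected; the live row is never a candidate.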

# Limitations

* Only MySQL/MariaDB is supported as database backend.
* Python >= 3.5 is required.

# Design

OSArchiver reads an INI configuration file in which you can define:

* archivers: sections that hold one source and an optional list of destinations
* sources: sections that define where the data should be read from (basically the OpenStack DB)
* destinations: sections that define where the data should be archived
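
As a rough illustration of this layout, the sketch below groups the sections of an INI file by kind using Python's standard `configparser`. The configuration values and the `sections_by_kind` helper are hypothetical, and whether OSArchiver itself parses its file this way is an assumption.

```python
import configparser

# Hypothetical configuration; section and option values are illustrative.
CONFIG = """
[archiver:nova]
src: os_prod
dst: file

[src:os_prod]
backend: db

[dst:file]
backend: file
"""

parser = configparser.ConfigParser()
parser.read_string(CONFIG)


def sections_by_kind(cfg):
    """Group section names by their prefix (archiver, src or dst)."""
    kinds = {"archiver": [], "src": [], "dst": []}
    for section in cfg.sections():
        kind, _, name = section.partition(":")
        if kind in kinds and name:
            kinds[kind].append(name)
    return kinds


print(sections_by_kind(parser))
```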

# How it works

* An archiver ties one source to its destinations.
* ARCHIVE DATA: rows are read from the source (the OpenStack DB) and written to the destinations: an archiving DB and/or SQL and CSV files.
* If remote_store is configured, the archive is also sent to remote storage (Swift, ...).
* DELETE DATA: if archiving raised no error and delete_data=1, the data is deleted from the OpenStack DB.
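
The flow above can be sketched in a few lines of Python. This is only an illustration of the ordering and of the "no error and delete_data=1" guard; `run_archiver`, its parameters and the destination callables are hypothetical names, not OSArchiver's API.

```python
def run_archiver(rows, destinations, archive_data=True, delete_data=True):
    """Archive rows to every destination, then delete only if archiving succeeded.

    `destinations` is a list of callables that receive the rows. The function
    returns the list of steps that ran.
    """
    steps = []
    if archive_data:
        try:
            for write in destinations:
                write(rows)
        except Exception:
            # An archiving error skips the delete step to avoid losing data.
            steps.append("archive_failed")
            return steps
        steps.append("archive")
    if delete_data:
        steps.append("delete")
    return steps


archived = []


def failing_destination(rows):
    raise IOError("disk full")


print(run_archiver([1, 2], [archived.extend]))
print(run_archiver([1, 2], [failing_destination]))
print(run_archiver([1, 2], [], archive_data=False))
```

The second call stops before the delete step because the destination raised an error; the third call skips archiving entirely and only deletes.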

# Installation

```
git clone https://github.com/ovh/osarchiver.git
cd osarchiver
pip install -r requirements.txt
pip install .
```

# osarchiver script

```
# osarchiver --help
usage: osarchiver [-h] --config CONFIG [--log-file LOG_FILE]
                  [--log-level {info,warn,error,debug}] [--debug] [--dry-run]

optional arguments:
  -h, --help            show this help message and exit
  --config CONFIG       Configuration file to read
  --log-file LOG_FILE   Append log to the specified file
  --log-level {info,warn,error,debug}
                        Set log level
  --debug               Enable debug mode
  --dry-run             Display what would be done without really deleting or
                        writing data
```
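
For readers who want to wrap or test the command line, the interface shown in the help output can be mirrored with `argparse`. This is a reconstruction from the help text only, not the script's actual source, and the configuration path is a hypothetical example.

```python
import argparse


def build_parser():
    """Build a parser matching the options shown in the help output above."""
    parser = argparse.ArgumentParser(prog="osarchiver")
    parser.add_argument("--config", required=True,
                        help="Configuration file to read")
    parser.add_argument("--log-file", help="Append log to the specified file")
    parser.add_argument("--log-level",
                        choices=["info", "warn", "error", "debug"],
                        help="Set log level")
    parser.add_argument("--debug", action="store_true",
                        help="Enable debug mode")
    parser.add_argument("--dry-run", action="store_true",
                        help="Display what would be done without really "
                             "deleting or writing data")
    return parser


# Hypothetical invocation: check a configuration without touching any data.
args = build_parser().parse_args(["--config", "/etc/osarchiver.conf", "--dry-run"])
print(args.config, args.dry_run, args.debug)
```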

# Configuration
The configuration is an INI file containing several sections. You configure your
different archivers in this configuration file. An example is available at the
root of the repository.

## DEFAULT section:
* Description: default section that defines default/fallback values for options
* Format **[DEFAULT]**
* configuration parameters: all the parameters of the archiver, source, destination
  and backend sections can be added to this section; they are used as fallback
  values when an option is not set in a section.
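
The fallback behaviour matches what Python's `configparser` does with a `[DEFAULT]` section. The sketch below assumes that parser is used and relies on hypothetical section names and values.

```python
import configparser

# Hypothetical values, for illustration only.
CONFIG = """
[DEFAULT]
retention: 12 MONTH
delete_data: 0

[src:os_prod]
backend: db
delete_data: 1

[src:os_staging]
backend: db
"""

parser = configparser.ConfigParser()
parser.read_string(CONFIG)

# os_prod overrides delete_data; os_staging falls back to [DEFAULT].
for name in ("src:os_prod", "src:os_staging"):
    section = parser[name]
    print(name, section["retention"], section.getboolean("delete_data"))
```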

## Archiver section:

* Description: defines where to read data from, and where to archive and/or delete it.
* Format **[archiver:*name*]**
* configuration parameters:
  * **src**: name of the source section
  * **dst**: comma-separated list of destination section names
  * **enable**: 1 or 0; if set to 0 the archiver is ignored and not run

Example:
```properties
[archiver:My_Archiver]
src: os_prod
dst: file, db

[src:os_prod]
...

[dst:file]
...

[dst:db]
....
```
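
To make the link between the sections explicit, here is a minimal sketch that resolves an archiver's `src` and `dst` options to the `[src:...]` and `[dst:...]` sections they name. The `resolve_archiver` helper is hypothetical, and treating a missing `enable` option as enabled is an assumption of this sketch.

```python
import configparser

CONFIG = """
[archiver:My_Archiver]
src: os_prod
dst: file, db
enable: 1

[src:os_prod]
backend: db

[dst:file]
backend: file

[dst:db]
backend: db
"""

parser = configparser.ConfigParser()
parser.read_string(CONFIG)


def resolve_archiver(cfg, name):
    """Return (source section, destination sections), or None if disabled."""
    archiver = cfg["archiver:" + name]
    # Assumption: an archiver without an `enable` option is enabled.
    if not archiver.getboolean("enable", fallback=True):
        return None
    source = "src:" + archiver["src"].strip()
    # `dst` is optional, so fall back to an empty list of destinations.
    names = [d.strip() for d in archiver.get("dst", fallback="").split(",")]
    destinations = ["dst:" + d for d in names if d]
    missing = [s for s in [source] + destinations if not cfg.has_section(s)]
    if missing:
        raise ValueError("undefined sections: " + ", ".join(missing))
    return source, destinations


print(resolve_archiver(parser, "My_Archiver"))
```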

## Source section:

* Description: defines where the OpenStack databases are. For now it supports
  one backend (db), but it can easily be extended.
* Format **[src:*name*]**
* configuration parameters:
  * **backend**: the name of the backend to use; only `db` is supported
  * **retention**: how long data is kept in the database, in SQL interval format (e.g. 12 MONTH)
  * **archive_data**: 0 or 1; if set to 1, a destination is expected and the data
    is archived, otherwise the archiving step is skipped and only the delete step runs
  * **delete_data**: 0 or 1; if set to 1, the delete step is run. If the
    archive step fails, the delete step is not run, to prevent data loss.
  * *backend specific options*

## Destination section:

* Description: defines where the data should be written. For now it supports
  two backends (`db` for database and `file` [csv, sql]) and may be extended.
* Format **[dst:*name*]**
* configuration parameters:
  * **backend**: the name of the backend to use, `db` or `file`
  * *backend specific options*

## Backends options:

### db
* Description: the database (MySQL/MariaDB) backend
* options:
  * **host**: DB host to connect to
  * **port**: port the MariaDB server is listening on
  * **user**: MariaDB login to connect with
  * **password**: password of the user
  * **delete_limit**: apply a LIMIT to the DELETE statement
  * **select_limit**: apply a LIMIT to the SELECT statement
  * **bulk_insert**: data is inserted into the DB every bulk_insert rows
  * **deleted_column**: name of the column that holds the soft-delete date. It is
    also used to filter the tables to archive: a table must have
    this column to be archived
  * **where**: the literal SQL WHERE clause applied to the SELECT statement.
    Ex: `where=${deleted_column} <= SUBDATE(NOW(), INTERVAL ${retention})`
  * **foreign_key_check**: true or false; if set to false, foreign key
    checks are disabled (default true)
  * **retention**: how long data is kept in the database (SQL format: 12
    MONTH, 1 DAY, etc.)
  * **excluded_databases**: comma, carriage return or semicolon separated
    regexps of the DBs to exclude when specifying '*' as database. The following DBs
    are always ignored: 'mysql', 'performance_schema', 'information_schema'
  * **excluded_tables**: comma, carriage return or semicolon separated regexps
    of the tables to exclude when specifying '*' as table. Ex: `shadow_.*,.*_archived`
  * **db_suffix**: an optional suffix to apply to the archiving DB. The
    default suffix '_archive' is applied if you archive on the same host as the
    source without setting a db_suffix or table_suffix (this avoids reading and
    writing on the same db.table)
  * **table_suffix**: apply a suffix to the archiving table if specified
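
Two of these options are easier to grasp with an example. The sketch below shows how the `${...}` placeholders of `where` expand when the file is read with `configparser.ExtendedInterpolation`, and how a list of `excluded_tables` regexps could be applied. It assumes that interpolation class and full-string regexp matching; the `is_excluded` helper and the table names are hypothetical.

```python
import configparser
import re

CONFIG = """
[src:os_prod]
backend: db
deleted_column: deleted_at
retention: 12 MONTH
where: ${deleted_column} <= SUBDATE(NOW(), INTERVAL ${retention})
excluded_tables: shadow_.*,.*_archived
"""

# ExtendedInterpolation resolves ${option} against the same section.
parser = configparser.ConfigParser(
    interpolation=configparser.ExtendedInterpolation()
)
parser.read_string(CONFIG)
source = parser["src:os_prod"]
print(source["where"])


def is_excluded(table, patterns):
    """Return True if `table` fully matches one of the exclusion regexps."""
    # Patterns may be separated by commas, semicolons or line breaks.
    regexps = [p.strip() for p in re.split(r"[,;\r\n]+", patterns) if p.strip()]
    return any(re.fullmatch(r, table) for r in regexps)


for table in ("shadow_instances", "instances_archived", "instances"):
    print(table, is_excluded(table, source["excluded_tables"]))
```

With these values, `where` expands to `deleted_at <= SUBDATE(NOW(), INTERVAL 12 MONTH)`, and only the plain `instances` table is kept.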

### file
* Description: the file archiving destination type. It writes SQL data to a
  file using one or several formats (supported: SQL, CSV)
* options:
  * **directory**: the directory path where the data is archived. You may use the
    {date} keyword to automatically append the date to the directory path
    (e.g. /backup/archive_{date})
  * **formats**: a comma, semicolon or carriage return separated list that
    defines the formats in which the data is archived (csv, sql)
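
As an illustration of the `{date}` keyword and the CSV format, the sketch below expands a directory template and writes a few rows with Python's `csv` module. The date format, the `<table>.csv` file name and the `archive_to_csv` helper are assumptions made for the example; OSArchiver's actual naming may differ.

```python
import csv
import datetime
import os
import tempfile


def archive_to_csv(directory_template, table, columns, rows, today=None):
    """Write rows to <directory>/<table>.csv and return the file path."""
    today = today or datetime.date.today()
    # The date format is an assumption; only the {date} keyword is documented.
    directory = directory_template.replace("{date}", today.strftime("%Y-%m-%d"))
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, table + ".csv")
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(columns)   # header line
        writer.writerows(rows)     # one line per archived row
    return path


base = tempfile.mkdtemp()
path = archive_to_csv(
    os.path.join(base, "archive_{date}"),
    "instances",
    ["id", "deleted_at"],
    [(2, "2020-01-15 10:00:00")],
    today=datetime.date(2024, 1, 31),
)
print(path)
```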

# Contributing

You've developed a cool new feature? Fixed an annoying bug? We'd be happy
to hear from you!

Have a look at [CONTRIBUTING.md](https://github.com/ovh/osarchiver/blob/master/CONTRIBUTING.md)

# Related links

* Contribute: https://github.com/ovh/osarchiver/blob/master/CONTRIBUTING.md
* Report bugs: https://github.com/ovh/osarchiver/issues

# License

See https://github.com/ovh/osarchiver/blob/master/LICENSE