https://github.com/kevindeyne/vardogr

Vardøgr is a CLI that can push production-like data to test environments securely and at scale

Host: GitHub
URL: https://github.com/kevindeyne/vardogr
Owner: kevindeyne
License: mit
Created: 2018-05-24T04:11:06.000Z (about 8 years ago)
Default Branch: master
Last Pushed: 2023-02-23T17:13:59.000Z (over 3 years ago)
Last Synced: 2023-03-04T05:29:04.260Z (over 3 years ago)
Topics: cli, data-generation, data-generator, database, mariadb, mysql, postgresql, scrambled-data
Language: Java
Homepage: https://kevindeyne.github.io/vardogr
Size: 1.12 MB
Stars: 1
Watchers: 1
Forks: 0
Open Issues: 16
Metadata Files:
- Readme: README.md
- License: LICENSE


# Vardøgr
![pre-release](https://github.com/kevindeyne/vardogr/workflows/pre-release/badge.svg)

Realistic test data in development and QA environments can pinpoint bugs and performance issues early. However, taking direct copies of production data compromises data security and takes time. It also does not scale.

Vardøgr is a tool that can push production-like data to test databases securely. It does this by first generating a distribution model: a description of the data and its relative distribution.

It can then run this model and generate data from it, either directly matching the origin size or scaling up.
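To make the idea of a distribution model concrete, here is a minimal sketch in Java. It is not Vardøgr's actual implementation: the class `DistributionModelSketch`, its methods, and the sample country codes are all hypothetical, and it assumes a single, non-sensitive categorical column. It records how often each value occurs relative to the total, then samples new values from those frequencies at an arbitrary size.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical illustration only; these names do not exist in Vardøgr.
class DistributionModelSketch {

    // Builds a model that maps each distinct value to its relative frequency (0..1).
    static Map<String, Double> buildModel(List<String> columnValues) {
        Map<String, Double> model = new LinkedHashMap<>();
        for (String value : columnValues) {
            model.merge(value, 1.0, Double::sum); // count occurrences
        }
        int total = columnValues.size();
        model.replaceAll((value, count) -> count / total); // counts -> frequencies
        return model;
    }

    // Generates `size` values by sampling from the model's distribution.
    static List<String> generate(Map<String, Double> model, int size, Random random) {
        List<String> generated = new ArrayList<>(size);
        for (int i = 0; i < size; i++) {
            double roll = random.nextDouble();
            double cumulative = 0.0;
            String chosen = null;
            for (Map.Entry<String, Double> entry : model.entrySet()) {
                chosen = entry.getKey();
                cumulative += entry.getValue();
                if (roll < cumulative) {
                    break;
                }
            }
            generated.add(chosen);
        }
        return generated;
    }

    public static void main(String[] args) {
        // Assumed sample data standing in for one production column.
        List<String> production = List.of("BE", "BE", "BE", "NL");
        Map<String, Double> model = buildModel(production);

        // Scale up: generate twice as many values as the origin contained.
        List<String> testData = generate(model, production.size() * 2, new Random(42));
        System.out.println(model);
        System.out.println(testData.size());
    }
}
```

Because generation only depends on the model and a target size, the same model can be replayed at the origin size or at a larger one.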

Image showcasing the description visually

## Limitations
Vardøgr currently uses the open source edition of jOOQ, which only supports connecting to open source databases.
It has only been tested with MariaDB, MySQL and PostgreSQL.

## Commands
> build

Start with this command. It builds the distribution model from the production database and asks you for read-only credentials.
On later runs, it reuses a valid configuration file and skips asking for credentials. The password is stored encrypted.

> generate --factor 2 --clean

This takes a distribution model and applies it to a lower environment database. It asks for credentials with write access.
There are two parameters:
- factor: Scales the model by the given factor. For example, generate --factor 2 generates data twice the size of the production data.
- clean: By default, generation appends. For example, if a production table contains 100 records and the same table contains 25 records in test, only 75 new records are added. With the clean option, the table is truncated first and 100 brand new records are created.

Alternatively, you can use:
> generate --fill 3000

This also takes a distribution model and applies it to a lower environment database. It asks for credentials with write access.
- fill: Scales the model up to a given number of records. For example, generate --fill 100 generates data until the table holds 100 records. Existing records are kept. You can add --clean to truncate the table before new data is added.
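The arithmetic behind the factor, clean and fill options described above can be sketched as follows. This is a hypothetical helper, not code from Vardøgr. It rests on two assumptions that the text does not confirm: with a factor, the existing record count is subtracted from the scaled target, and nothing is added when a table already holds more records than the target.

```java
// Hypothetical helper for illustration; not part of Vardøgr's code base.
class GenerationPlanSketch {

    // Records to insert for: generate --factor <factor> [--clean]
    static long forFactor(long productionCount, long existingCount, int factor, boolean clean) {
        long target = productionCount * factor; // scaled table size
        return toInsert(target, existingCount, clean);
    }

    // Records to insert for: generate --fill <fill> [--clean]
    static long forFill(long fill, long existingCount, boolean clean) {
        return toInsert(fill, existingCount, clean);
    }

    private static long toInsert(long target, long existingCount, boolean clean) {
        if (clean) {
            return target; // table is truncated first, so every record is new
        }
        // Append mode: only top up to the target (assumed to never go negative).
        return Math.max(0L, target - existingCount);
    }

    public static void main(String[] args) {
        // The example from the text: 100 records in production, 25 in test.
        System.out.println(forFactor(100, 25, 1, false)); // 75
        System.out.println(forFactor(100, 25, 1, true));  // 100
        System.out.println(forFill(100, 25, false));      // 75
    }
}
```

The first two calls reproduce the numbers given for the clean option: 75 new records when appending and 100 when the table is truncated first.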

> help

You can always use help to get up-to-date documentation on the available commands.

Image showcasing the usage visually

## How to build / run
This project uses Spring Shell and Maven. Build it with:
> mvn clean install

and run it with:
> mvn spring-boot:run
