{"id":13576633,"url":"https://github.com/GreenmaskIO/greenmask","last_synced_at":"2025-04-05T08:32:43.248Z","repository":{"id":216313744,"uuid":"726261658","full_name":"GreenmaskIO/greenmask","owner":"GreenmaskIO","description":"PostgreSQL database anonymization and synthetic data generation tool","archived":false,"fork":false,"pushed_at":"2024-10-22T06:38:33.000Z","size":23532,"stargazers_count":1030,"open_issues_count":30,"forks_count":18,"subscribers_count":4,"default_branch":"main","last_synced_at":"2024-10-23T09:33:26.444Z","etag":null,"topics":["anonymization","deterministic","dump","golang","masking","obfuscation","obfuscator","postgresql","restore","s3","security","security-tools","staging","synthetic-data","transform"],"latest_commit_sha":null,"homepage":"https://greenmask.io","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/GreenmaskIO.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-12-01T22:19:27.000Z","updated_at":"2024-10-23T09:30:22.000Z","dependencies_parsed_at":"2024-01-09T15:43:11.330Z","dependency_job_id":"d096cd6f-0b8a-444a-8fb6-5a9fca4b055d","html_url":"https://github.com/GreenmaskIO/greenmask","commit_stats":null,"previous_names":["greenmaskio/greenmask"],"tags_count":21,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/GreenmaskIO%2Fgreenmask","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/GreenmaskIO%2Fgreenmask/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositorie
s/GreenmaskIO%2Fgreenmask/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/GreenmaskIO%2Fgreenmask/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/GreenmaskIO","download_url":"https://codeload.github.com/GreenmaskIO/greenmask/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":223176488,"owners_count":17100638,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["anonymization","deterministic","dump","golang","masking","obfuscation","obfuscator","postgresql","restore","s3","security","security-tools","staging","synthetic-data","transform"],"created_at":"2024-08-01T15:01:12.248Z","updated_at":"2025-04-05T08:32:43.227Z","avatar_url":"https://github.com/GreenmaskIO.png","language":"Go","funding_links":[],"categories":["Go","Utilities","Data"],"sub_categories":["Generation/Masking/Subsetting"],"readme":"# [Greenmask](https://greenmask.io)\n\n## Dump anonymization and synthetic data generation tool\n\n**Greenmask** is a powerful open-source utility that is designed for logical database backup dumping,\nanonymization, synthetic data generation and restoration. It has ported PostgreSQL libraries, making it reliable.\nIt is stateless and does not require any changes to your database schema. 
It is designed to be highly customizable and\nbackward-compatible with existing PostgreSQL utilities, fast and reliable.\n\n[![Discord](https://img.shields.io/discord/1179422525294399488?label=Discord\u0026logo=discord)](https://discord.com/invite/rKBKvDECfd)\n[![Telegram](https://img.shields.io/badge/Telegram-Join%20Chat-blue.svg?logo=telegram)](https://t.me/greenmask_ru)\n[![X (formerly Twitter) Follow](https://img.shields.io/twitter/follow/GreenmaskIO)](https://twitter.com/GreenmaskIO)\n\n[![Documentation](https://img.shields.io/badge/docs-latest-blue)](https://docs.greenmask.io)\n[![License](https://img.shields.io/github/license/greenmaskio/greenmask)](https://github.com/greenmaskio/greenmask/blob/main/LICENSE)\n[![GitHub Release](https://img.shields.io/github/v/release/greenmaskio/greenmask)](https://github.com/greenmaskio/greenmask/releases/latest)\n[![GitHub Downloads (all assets, all releases)](https://img.shields.io/github/downloads/greenmaskio/greenmask/total)](https://somsubhra.github.io/github-release-stats/?username=greenmaskio\u0026repository=greenmask\u0026page=1\u0026per_page=5)\n[![Docker pulls](https://img.shields.io/docker/pulls/greenmask/greenmask)](https://hub.docker.com/r/greenmask/greenmask)\n[![Go Report Card](https://goreportcard.com/badge/github.com/greenmaskio/greenmask)](https://goreportcard.com/report/github.com/greenmaskio/greenmask)\n\n![schema.png](docs/assets/schema.png)\n\n## Getting started\n\nGreenmask has a [Playground](https://docs.greenmask.io/latest/playground/) - it is a sandbox environment in Docker with\nsample databases included to help you try Greenmask without any additional actions\n\n1. Clone the `greenmask` repository and navigate to its directory by running the following commands:\n\n    ```shell\n    git clone git@github.com:GreenmaskIO/greenmask.git \u0026\u0026 cd greenmask\n    ```\n\n2. 
Once you have cloned the repository, start the environment by running Docker Compose:\n\n    ```shell\n    docker-compose run greenmask\n    ```\n\n## Features\n\n* **[Database subset](https://docs.greenmask.io/latest/database_subset/)** - One of the most advanced subset systems \n  on the market. It supports **virtual references**, nullable columns, polymorphic references, and can subset even the \n  most complex schemas with **cyclic references**.\n* **[Deterministic transformers](https://docs.greenmask.io/latest/built_in_transformers/transformation_engines/#hash-engine)** — Uses hash functions to ensure consistent output for the same input. Most transformers support both `random` and\n  `hash` engines, offering flexibility for various use cases.\n* **[Dynamic parameters](https://docs.greenmask.io/latest/built_in_transformers/dynamic_parameters/)** — most\n  transformers support dynamic parameters, allowing them to adapt based on table column values. This feature helps\n  manage dependencies between columns and meet constraints effectively.\n* **[Transformation Condition](https://docs.greenmask.io/latest/built_in_transformers/transformation_condition/)** —\n  applies the transformation only when a specified condition is met, making it useful for targeting specific rows.\n* **[Transformation validation and easy maintenance](https://docs.greenmask.io/latest/commands/validate/)** — Greenmask\n  provides validation warnings, data transformation diffs, and schema diffs during configuration, enabling effective\n  monitoring and maintenance of transformations. The schema diff feature helps prevent data leakage when the schema\n  changes.\n* **[Transformation inheritance](https://docs.greenmask.io/latest/built_in_transformers/transformation_inheritance/)**\n  — transformation inheritance for partitioned tables and tables with foreign keys. 
Define once and apply to all.\n* **Stateless** — Greenmask operates as a logical dump and does not impact your existing database schema.\n* **Cross-platform** — Can be easily built and executed on any platform, thanks to its Go-based architecture,\n  which eliminates platform dependencies.\n* **Database type safe** — Ensures data integrity by validating data and using the database driver for encoding and\n  decoding operations, preserving accurate data formats.\n* **Backward compatible** — Fully supports the same features and protocols as standard PostgreSQL utilities. Dumps\n  created by Greenmask can be seamlessly restored using the `pg_restore` utility.\n* **Extensible** — Users have the flexibility\n  to [implement domain-based transformations](https://docs.greenmask.io/latest/built_in_transformers/standard_transformers/cmd/)\n  in any programming language or\n  use [predefined templates](https://docs.greenmask.io/latest/built_in_transformers/advanced_transformers/).\n* **Parallel execution** — Enables parallel dumping and restoration to significantly speed up results.\n* **Variety of storages** — Supports both local and remote storage, including directories and S3-compatible solutions.\n* **[Pgzip support for faster compression](https://docs.greenmask.io/latest/commands/dump/?h=pgzip#pgzip-compression)** — Speeds up dump and restoration processes with parallel compression \n  by setting `--pgzip`.\n\n## Use Cases\n\nGreenmask is ideal for various scenarios, including:\n\n* **Backup and Restoration**. Use Greenmask for your daily routines involving logical backup dumping and restoration. It\n  seamlessly handles tasks like table restoration after truncation. Its functionality closely mirrors that of pg_dump\n  and pg_restore, making it a straightforward replacement.\n* **Anonymization, Transformation, and Data Masking**. 
Employ Greenmask for anonymizing, transforming, and masking\n  backups, especially when setting up a staging environment or for analytical purposes. It simplifies the deployment of\n  a pre-production environment with consistently anonymized data, facilitating faster time-to-market in the development\n  lifecycle.\n\n### General Information\n\nThe best approach for logical backup dumping and restoration is to use core PostgreSQL utilities, specifically pg_dump\nand pg_restore. Greenmask is designed to align with these native tools, ensuring full compatibility. It independently\nmanages data dumping while delegating schema dumping and restoration to `pg_dump` and `pg_restore`, ensuring smooth\nintegration with PostgreSQL’s standard workflow.\n\nGreenmask utilizes the directory format of `pg_dump` and `pg_restore`, ideal for parallel execution and partial restoration.\nThis format includes metadata files to guide backup and restoration steps.\n\n#### Storage Options\n\n* **[s3](https://docs.greenmask.io/latest/configuration/#__tabbed_1_2)** - Supports any S3-compatible storage system,\n  including AWS S3, offering flexibility across different cloud storage solutions.\n* **[directory](https://docs.greenmask.io/latest/configuration/#__tabbed_1_1)** - This is the default option,\n  representing a standard filesystem directory for local storage.\n\n#### Data Anonymization and Validation\n\nGreenmask works with **COPY lines**, collects schema metadata using the Golang driver, and employs this driver in the\nencoding and decoding process. The **validate command** offers a way to assess the impact on both schema\n(**validation warnings**) and data (**transformation and displaying differences**). 
This command allows you to validate\nthe schema and data transformations, ensuring the desired outcomes during the anonymization process.\n\n#### Customization\n\nIf your table schema relies on functional dependencies between columns, you can address this challenge using\n[Dynamic parameters](https://docs.greenmask.io/latest/built_in_transformers/dynamic_parameters/). By setting dynamic\nparameters, you can handle cases such as `created_at` and `updated_at`, where\n`updated_at` must be greater than or equal to `created_at`.\n\nIf you need to implement custom logic imperatively,\nuse the [Cmd](https://docs.greenmask.io/latest/built_in_transformers/standard_transformers/cmd/),\n[TemplateRecord](https://docs.greenmask.io/latest/built_in_transformers/advanced_transformers/template_record/), or\n[Template](https://docs.greenmask.io/latest/built_in_transformers/advanced_transformers/template/) transformers.\n\n#### PostgreSQL Version Compatibility\n\n**Greenmask** is compatible with PostgreSQL versions **11 and higher**.\n\n## Links\n\n* [Documentation](https://docs.greenmask.io)\n* Email: **support@greenmask.io**\n* [Twitter](https://twitter.com/GreenmaskIO)\n* [Discord](https://discord.com/invite/rKBKvDECfd)\n* [Telegram [RU]](https://t.me/greenmask_ru)\n* [DockerHub](https://hub.docker.com/r/greenmask/greenmask)\n\n## References\n\n* Utilized the [Demo database](https://postgrespro.com/community/demodb), provided by PostgresPro, for integration\n  testing purposes.\n* Employed the [adventureworks database](https://github.com/morenoh149/postgresDBSamples) created\n  by `morenoh149/postgresDBSamples`, in the Docker Compose playground.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FGreenmaskIO%2Fgreenmask","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FGreenmaskIO%2Fgreenmask","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FGreenmaskIO%2Fgreenmask/lists"}