{"id":24344328,"url":"https://github.com/aiven/aiven-db-migrate","last_synced_at":"2025-04-09T17:15:11.169Z","repository":{"id":37783660,"uuid":"252657379","full_name":"aiven/aiven-db-migrate","owner":"aiven","description":null,"archived":false,"fork":false,"pushed_at":"2025-03-20T10:39:49.000Z","size":200,"stargazers_count":21,"open_issues_count":12,"forks_count":10,"subscribers_count":67,"default_branch":"main","last_synced_at":"2025-04-09T17:15:01.929Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/aiven.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":".github/CODEOWNERS","security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-04-03T07:06:59.000Z","updated_at":"2025-03-10T11:11:30.000Z","dependencies_parsed_at":"2024-01-09T16:01:24.205Z","dependency_job_id":"0992f509-f094-42cb-a6b1-a3e40579128f","html_url":"https://github.com/aiven/aiven-db-migrate","commit_stats":null,"previous_names":[],"tags_count":13,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aiven%2Faiven-db-migrate","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aiven%2Faiven-db-migrate/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aiven%2Faiven-db-migrate/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aiven%2Faiven-db-migrate/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/aiven","download_url":"https://codeload.github.com/ai
ven/aiven-db-migrate/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248074922,"owners_count":21043490,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-01-18T09:33:48.706Z","updated_at":"2025-04-09T17:15:11.162Z","avatar_url":"https://github.com/aiven.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# aiven-db-migrate\n\nAiven database migration tool. This tool is meant for easy migration of databases from a database service\nprovider, such as AWS RDS, or from an on-premises data center, to [Aiven Database as a Service](https://aiven.io/).\nHowever, it's not limited to Aiven services and it might be useful as a generic database migration tool.\n\nUsually database service providers, such as Aiven, AWS RDS, and the like, don't allow superuser/root access.\nInstead, the service's master/admin user is granted permissions for the most common DBA tasks, see e.g.\nhttps://help.aiven.io/en/articles/489557-postgresql-superuser-access.\nIn addition, the service provider's web console/API can be used for performing some DBA tasks requiring more privileges than\nthose granted to the master/admin user. However, the missing superuser access makes some existing database migration tools,\nsuch as `pg_dumpall`, unusable when migrating databases to or from a service provider. 
\n\nCurrently this tool supports only PostgreSQL but we aim to add support for other databases, such as MySQL.\n\nRequires Python 3.10 or newer.\n\n## Usage\n\nRunning library module:\n```\n$ python3 -m aiven_db_migrate.migrate -h\nAvailable commands: pg\n```\n\nInstalling in virtualenv:\n```\n$ python3 -m venv venv\n$ . venv/bin/activate\n$ ## Run make to set the proper version\n$ make\n$ pip install .\n```\n\nThis installs console scripts which have the same interface as the library module:\n * `pg_migrate`: PostgreSQL migration\n\n## PostgreSQL\n\nRequirements:\n * `pg_dump`: from any PostgreSQL version between the source and target versions\n * `psql`: any modern version should work\n\nRun library module:\n```\n$ python3 -m aiven_db_migrate.migrate pg -h\n```\nor, if installed:\n```\n$ pg_migrate -h\n```\n\nMigrating is supported to the same or newer PostgreSQL version starting from PostgreSQL 10 to PostgreSQL 14.\nMigrating to an older version is not supported.\n\nBy default the tool searches for `pg_dump` under `/usr/`. When PostgreSQL is installed in a different directory, such as on macOS, use the `--pgbin` parameter to specify the PostgreSQL directory, e.g.:\n```\n--pgbin /Applications/Postgres.app/Contents/Versions/14/bin\n```\n\nSupports regular data dump (`pg_dump`) and [logical replication](https://www.postgresql.org/docs/current/logical-replication.html) (PostgreSQL 10 or newer).\nIf logical replication is not available or privileges/requirements are missing, migration falls back to\ndata dump.\n\n### CLI example\n\nMigrating from AWS RDS to Aiven for PostgreSQL. 
Logical replication is enabled in source AWS RDS PostgreSQL\nserver and `aiven-extras` extension is installed in target database.\n\n```\n$ pg_migrate -s \"postgres://postgres:\u003cpassword\u003e@jappja-pg1.chfhzaircbpb.eu-west-1.rds.amazonaws.com:5432/defaultdb\" -t \"postgres://avnadmin:\u003cpassword\u003e@pg1-test-jappja-test.avns.net:26192/defaultdb?sslmode=require\"\n\n# Or:\n$ SOURCE_SERVICE_URI=\"postgres://postgres:\u003cpassword\u003e@jappja-pg1.chfhzaircbpb.eu-west-1.rds.amazonaws.com:5432/defaultdb\" TARGET_SERVICE_URI=\"postgres://avnadmin:\u003cpassword\u003e@pg1-test-jappja-test.avns.net:26192/defaultdb?sslmode=require\" pg_migrate\n...\n\nRoles:\n  rolname: 'rdsadmin', rolpassword: None, status: 'failed', message: 'must be superuser to create superusers'\n  rolname: 'rds_password', rolpassword: None, status: 'created', message: 'role created'\n  rolname: 'rds_superuser', rolpassword: None, status: 'created', message: 'role created'\n  rolname: 'test_user1', rolpassword: 'placeholder_kfbqrvmdhgrpgpvy', status: 'created', message: 'role created'\n  rolname: 'rds_ad', rolpassword: None, status: 'created', message: 'role created'\n  rolname: 'rds_iam', rolpassword: None, status: 'created', message: 'role created'\n  rolname: 'rds_replication', rolpassword: None, status: 'created', message: 'role created'\n  rolname: 'rdsrepladmin', rolpassword: None, status: 'failed', message: 'must be superuser to create replication users'\n  rolname: 'postgres', rolpassword: None, status: 'exists', message: 'role already exists'\n  rolname: 'test_user2', rolpassword: None, status: 'created', message: 'role created'\n\nDatabases:\n  dbaname: 'rdsadmin', method: None, status: 'failed', message: 'FATAL:  pg_hba.conf rejects connection for host \"80.220.195.174\", user \"postgres\", database \"rdsadmin\", SSL on\\nFATAL:  pg_hba.conf rejects connection for host \"80.220.195.174\", user \"postgres\", database \"rdsadmin\", SSL off\\n'\n  dbaname: 'defaultdb', method: 
'replication', status: 'running', message: 'migrated to existing database'\n```\n\nBy default logical replication is left running and the created pub/sub objects need to be cleaned up once workloads have been\nmoved to the new server. Objects created by this tool are named like `aiven_db_migrate_\u003cdbname\u003e_\u003csub|pub|slot\u003e`.\n\nStarting from the target (using `aiven-extras` extension), get first the subscription name:\n```\ndefaultdb= \u003e SELECT * FROM aiven_extras.pg_list_all_subscriptions();\n```\nand then drop it:\n```\ndefaultdb= \u003e SELECT * FROM aiven_extras.pg_drop_subscription('aiven_db_migrate_defaultdb_sub');\n```\n\nNote that with `aiven-extras` dropping subscription in target also drops replication slot in source (`dblink`).\n\nIn the source get first the publication name:\n```\ndefaultdb=\u003e SELECT * FROM pg_publication;\n```\nand then drop it:\n```\ndefaultdb=\u003e DROP PUBLICATION aiven_db_migrate_defaultdb_pub;\n```\n\nIn case that `aiven-extras` is not used clean up replication slot too:\n```\ndefaultdb=\u003e SELECT * FROM pg_replication_slots;\ndefaultdb=\u003e SELECT * FROM pg_drop_replication_slot('aiven_db_migrate_defaultdb_slot');\n```\n\nUsing `--max-replication-lag` waits until replication lag in bytes is less than/equal to given max replication lag. This\ncan be used together with `--stop-replication` to clean up all created pub/sub objects when replication is done.\n\nWith `--validate` only best effort validation is run. This checks e.g. PL/pgSQL languages, extensions etc. installed\nin source are also installed/available in target.\n\nUse `--no-replicate-extension-tables` to skip extension tables.  By default it attempts to replicate all extension tables during logical replication.\n\nWith `--force-method` you can specify if you wish to use either replication or dump method. 
Otherwise the most suitable method is chosen automatically.\n\nUsing `--dbs-max-total-size` together with `--validate` you can check if the size of the source database is below some threshold.\n\n### API example\n\nMigrating from AWS RDS to Aiven for PostgreSQL. Logical replication is enabled in source AWS RDS PostgreSQL\nserver but `aiven-extras` extension is not installed in target database so migrating falls back to data dump.\n\n```\n\u003e\u003e\u003e from aiven.migrate import PGMigrate, PGMigrateResult\n\u003e\u003e\u003e pg_mig = PGMigrate(source_conn_info=\"postgres://postgres:\u003cpassword\u003e@jappja-pg1.chfhzaircbpb.eu-west-1.rds.amazonaws.com:5432/defaultdb\", target_conn_info=\"postgres://avnadmin:\u003cpassword\u003e@pg2-test-jappja-test.avns.net:26192/defaultdb?sslmode=require\")\n\u003e\u003e\u003e result: PGMigrateResult = pg_mig.migrate()\n...\nLogical replication failed with error: 'must be superuser to create subscriptions', fallback to dump\n\u003e\u003e\u003e result\nPGMigrateResult(pg_databases={'rdsadmin': {'dbname': 'rdsadmin', 'message': 'FATAL:  pg_hba.conf rejects connection for host \"80.220.195.174\", user \"postgres\", database \"rdsadmin\", SSL on\\nFATAL:  pg_hba.conf rejects connection for host \"80.220.195.174\", user \"postgres\", database \"rdsadmin\", SSL off\\n', 'method': None, 'status': 'failed'}, 'defaultdb': {'dbname': 'defaultdb', 'message': 'migrated to existing database', 'method': 'dump', 'status': 'done'}}, pg_roles={'rdsadmin': {'message': 'must be superuser to create superusers', 'rolname': 'rdsadmin', 'rolpassword': None, 'status': 'failed'}, 'rds_password': {'message': 'role created', 'rolname': 'rds_password', 'rolpassword': None, 'status': 'created'}, 'rds_superuser': {'message': 'role created', 'rolname': 'rds_superuser', 'rolpassword': None, 'status': 'created'}, 'test_user1': {'message': 'role created', 'rolname': 'test_user1', 'rolpassword': 'placeholder_qkdryldfsrdaocio', 'status': 'created'}, 'rds_ad': 
{'message': 'role created', 'rolname': 'rds_ad', 'rolpassword': None, 'status': 'created'}, 'rds_iam': {'message': 'role created', 'rolname': 'rds_iam', 'rolpassword': None, 'status': 'created'}, 'rds_replication': {'message': 'role created', 'rolname': 'rds_replication', 'rolpassword': None, 'status': 'created'}, 'rdsrepladmin': {'message': 'must be superuser to create replication users', 'rolname': 'rdsrepladmin', 'rolpassword': None, 'status': 'failed'}, 'postgres': {'message': 'role already exists', 'rolname': 'postgres', 'rolpassword': None, 'status': 'exists'}, 'test_user2': {'message': 'role created', 'rolname': 'test_user2', 'rolpassword': None, 'status': 'created'}})\n```\n\n### Logical replication\n * requires PostgreSQL 10 or newer\n * `wal_level` needs to be `logical`\n * currently supports only FOR ALL TABLES publication in source\n * [aiven-extras](https://github.com/aiven/aiven-extras) extension installed in both source and target database, or\n * superuser or superuser-like privileges, such as `rds_replication` role in AWS RDS, in both source and target\n * [AWS RDS additional settings/info](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/CHAP_PostgreSQL.html#PostgreSQL.Concepts.General.FeatureSupport.LogicalReplication)\n\n#### Warning\n\n⚠️ Large objects are [unable to be replicated using logical replication](https://www.postgresql.org/docs/15/logical-replication-restrictions.html), up to and including PostgreSQL 15.\n\n### Schemas\n * schemas are migrated without object ownership; the user used for migration is given all object ownership\n * NOTE: schema changes break logical replication\n\n### Roles\n * roles with `LOGIN` attribute are migrated using placeholder passwords: `placeholder_\u003c16 char random string\u003e`\n * migrating superuser or replication roles requires superuser privileges\n\n### Extensions\n * requires whitelisting the extension in target with [pgextwlist](https://github.com/dimitri/pgextwlist), or\n * superuser or 
superuser-like privileges\n * [Aiven for PostgreSQL supported extensions](https://help.aiven.io/en/articles/489561-supported-postgresql-extensions)\n\n## Development\n\nInstall build depends (Fedora):\n```\n$ make build-dep-fedora\n```\n\nStyle checks:\n```\n$ make validate-style\n```\n\nFix style errors with:\n```\n$ make isort\n$ make yapf\n```\n\nStatic checks (`flake8`, `pylint` and `mypy`):\n```\n$ make static-checks\n```\n\nTests (`pytest`):\n```\n$ make test\n```\n\nRunning whole test set takes time since all supported migration paths are tested. During development it's usually enough\nto run tests only for a certain PostgreSQL version, e.g.:\n```\n$ PG_VERSION=\"12\" make test\n```\n\nIt's also possible to test migration from one PostgreSQL version to another, e.g.:\n```\n$ PG_SOURCE_VERSION=\"10\" PG_TARGET_VERSION=\"12\" make test\n```\n\nTest set can be targeted even further by invoking `pytest`, e.g.:\n```\n$ PG_SOURCE_VERSION=\"10\" PG_TARGET_VERSION=\"12\" python3 -m pytest -s test/test_pg_migrate.py::Test_PGMigrate::test_migrate\n```\n\n# TODO\n\n * JSON output with CLI (for automation)\n   * Hard to make pg_dump silent for outputting JSON to stdout\n   * Output json to file instead?\n * More options\n   * --dump-only, --repl-only\n   * --include-databases, --exclude-databases\n   * --include-tables, --exclude-tables\n   * --role-passwords (role/passwords file for creating roles with real passwords instead of placeholders)\n * More tests\n   * Notably error/corner cases\n * Schema changes break logical replication\n   * While logical replication is running dump schema periodically and check if has changed,\n     e.g. by calculating hash of the schema dump\n   * How to continue if schema has changed? 
Stop replication, dump schema and restart replication?\n * Proper README + API doc\n * RPM build recipe for aiven-core/prune integration\n * Test automation: Jenkins/Github Actions\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faiven%2Faiven-db-migrate","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Faiven%2Faiven-db-migrate","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faiven%2Faiven-db-migrate/lists"}