{"id":14063375,"url":"https://github.com/chimpler/postgres-aws-s3","last_synced_at":"2025-09-01T20:13:44.358Z","repository":{"id":37484589,"uuid":"184955020","full_name":"chimpler/postgres-aws-s3","owner":"chimpler","description":"aws_s3 postgres extension to import/export data from/to s3 (compatible with aws_s3 extension on AWS RDS)","archived":false,"fork":false,"pushed_at":"2024-05-10T01:07:10.000Z","size":37,"stargazers_count":168,"open_issues_count":17,"forks_count":49,"subscribers_count":7,"default_branch":"master","last_synced_at":"2025-07-29T16:24:02.070Z","etag":null,"topics":["aws","aws-rds","boto3","export","extension","import","postgres","postgresql","postgresql-extension","rds","s3"],"latest_commit_sha":null,"homepage":"","language":"PLpgSQL","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/chimpler.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-05-04T23:47:39.000Z","updated_at":"2025-07-28T18:42:31.000Z","dependencies_parsed_at":"2024-08-13T07:14:28.920Z","dependency_job_id":null,"html_url":"https://github.com/chimpler/postgres-aws-s3","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/chimpler/postgres-aws-s3","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/chimpler%2Fpostgres-aws-s3","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/chimpler%2Fpostgres-aws-s3/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/chimpler%2Fpostgres-aws-s3/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/chimpler%2Fpostgres-aws-s3/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/chimpler","download_url":"https://codeload.github.com/chimpler/postgres-aws-s3/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/chimpler%2Fpostgres-aws-s3/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":273183228,"owners_count":25059812,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-09-01T02:00:09.058Z","response_time":120,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["aws","aws-rds","boto3","export","extension","import","postgres","postgresql","postgresql-extension","rds","s3"],"created_at":"2024-08-13T07:03:18.514Z","updated_at":"2025-09-01T20:13:44.334Z","avatar_url":"https://github.com/chimpler.png","language":"PLpgSQL","readme":"# postgres-aws-s3\n\nStarting on Postgres 
## Using aws_s3

### Importing data using table_import_from_s3

Let's create a table that will receive the data imported from S3:
```postgresql
psql> CREATE TABLE animals (
    name TEXT,
    age INT
);
```

Let's suppose the following file is present in S3 at `s3://test-bucket/animals.csv`:
```csv
name,age
dog,12
cat,15
parrot,103
tortoise,205
```

The function `aws_s3.table_import_from_s3` has 2 signatures that can be used.

#### Using s3_uri and aws_credentials objects

```postgresql
aws_s3.table_import_from_s3 (
   table_name text,
   column_list text,
   options text,
   s3_info aws_commons._s3_uri_1,
   credentials aws_commons._aws_credentials_1,
   endpoint_url text default null
)
```

Using this signature, the `s3_uri` and `aws_credentials` objects need to be created first:

Parameter | Description
----------|------------
table_name | the name of the table
column_list | list of columns to copy
options | options passed to the COPY command in Postgres
s3_info | an aws_commons._s3_uri_1 composite type containing the bucket, file path and region of the S3 object
credentials | an aws_commons._aws_credentials_1 composite type containing the access key, secret key and session token
endpoint_url | optional endpoint to use (e.g., `http://localhost:4566`)

##### Example
```postgresql
psql> SELECT aws_commons.create_s3_uri(
   'test-bucket',
   'animals.csv',
   'us-east-1'
) AS s3_uri \gset

psql> \echo :s3_uri
(test-bucket,animals.csv,us-east-1)

psql> SELECT aws_commons.create_aws_credentials(
   '<my_access_id>',
   '<my_secret_key>',
   '<session_token>'
) AS credentials \gset

psql> \echo :credentials
(<my_access_id>,<my_secret_key>,<session_token>)

psql> SELECT aws_s3.table_import_from_s3(
   'animals',
   '',
   '(FORMAT CSV, DELIMITER '','', HEADER true)',
   :'s3_uri',
   :'credentials'
);

 table_import_from_s3
----------------------
                    4
(1 row)

psql> select * from animals;
   name   | age
----------+-----
 dog      |  12
 cat      |  15
 parrot   | 103
 tortoise | 205
(4 rows)
```

You can also call the function as:
```
psql> SELECT aws_s3.table_import_from_s3(
   'animals',
   '',
   '(FORMAT CSV, DELIMITER '','', HEADER true)',
   aws_commons.create_s3_uri(
      'test-bucket',
      'animals.csv',
      'us-east-1'
   ),
   aws_commons.create_aws_credentials(
      '<my_access_id>',
      '<my_secret_key>',
      '<session_token>'
   )
);
```
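The examples above pass an empty `column_list`, which copies into every column of the target table. As a hedged sketch (assuming `column_list` takes the same comma-separated form as a COPY column list, and that the listed columns match the columns present in the file), you could make the column mapping explicit:
```postgresql
-- Hypothetical variant: name the target columns explicitly instead of passing ''.
-- The listed columns must correspond to the columns in the CSV file.
psql> SELECT aws_s3.table_import_from_s3(
   'animals',
   'name,age',
   '(FORMAT CSV, DELIMITER '','', HEADER true)',
   :'s3_uri',
   :'credentials'
);
```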
#### Using the function table_import_from_s3 with all the parameters

```postgresql
aws_s3.table_import_from_s3 (
   table_name text,
   column_list text,
   options text,
   bucket text,
   file_path text,
   region text,
   access_key text,
   secret_key text,
   session_token text,
   endpoint_url text default null
)
```

Parameter | Description
----------|------------
table_name | the name of the table
column_list | list of columns to copy
options | options passed to the COPY command in Postgres
bucket | S3 bucket
file_path | S3 path to the file
region | S3 region (e.g., `us-east-1`)
access_key | AWS access key id
secret_key | AWS secret key
session_token | optional session token
endpoint_url | optional endpoint to use (e.g., `http://localhost:4566`)

##### Example
```postgresql
psql> SELECT aws_s3.table_import_from_s3(
    'animals',
    '',
    '(FORMAT CSV, DELIMITER '','', HEADER true)',
    'test-bucket',
    'animals.csv',
    'us-east-1',
    '<my_access_id>',
    '<my_secret_key>',
    '<session_token>'
);

 table_import_from_s3
----------------------
                    4
(1 row)

psql> select * from animals;

   name   | age
----------+-----
 dog      |  12
 cat      |  15
 parrot   | 103
 tortoise | 205
(4 rows)
```

If you use localstack, you can set `endpoint_url` to point to the localstack S3 endpoint:
```
psql> SET aws_s3.endpoint_url TO 'http://localstack:4566';
```

You can also set the AWS credentials:
```
psql> SET aws_s3.access_key_id TO 'dummy';
psql> SET aws_s3.secret_key TO 'dummy';
psql> SET aws_s3.session_token TO 'dummy';
```
and then omit them from the function calls.

For example:
```
psql> SELECT aws_s3.table_import_from_s3(
    'animals',
    '',
    '(FORMAT CSV, DELIMITER '','', HEADER true)',
    'test-bucket',
    'animals.csv',
    'us-east-1'
);
```

You can also pass them as optional named parameters. For example:
```
psql> SELECT aws_s3.table_import_from_s3(
    'animals',
    '',
    '(FORMAT CSV, DELIMITER '','', HEADER true)',
    'test-bucket',
    'animals.csv',
    'us-east-1',
    endpoint_url := 'http://localstack:4566'
);
```

#### Support for gzip files

If the file has the metadata `Content-Encoding=gzip` in S3, it will automatically be unzipped before being copied to the table.
You can update the metadata in S3 by following the instructions described [here](https://docs.aws.amazon.com/AmazonS3/latest/user-guide/add-object-metadata.html).
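As a sketch, assuming a hypothetical object `animals.csv.gz` uploaded with `Content-Encoding=gzip` and the credentials set as above, the import call itself is unchanged; decompression is driven purely by the object's metadata:
```postgresql
-- Sketch: s3://test-bucket/animals.csv.gz is a hypothetical object whose
-- Content-Encoding metadata is set to gzip; the call is otherwise identical.
psql> SELECT aws_s3.table_import_from_s3(
    'animals',
    '',
    '(FORMAT CSV, DELIMITER '','', HEADER true)',
    'test-bucket',
    'animals.csv.gz',
    'us-east-1'
);
```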
### Exporting data using query_export_to_s3

Documentation: https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/postgresql-s3-export.html

Similarly to the import functions, you can export the data using different methods.

#### Using s3_uri and aws_credentials objects

```
aws_s3.query_export_to_s3(
    query text,
    s3_info aws_commons._s3_uri_1,
    credentials aws_commons._aws_credentials_1 default null,
    options text default null,
    endpoint_url text default null
)
```

Using this signature, the `s3_uri` and optionally the `aws_credentials` objects need to be created first:

Parameter | Description
----------|------------
query | query that returns the data to export
s3_info | an aws_commons._s3_uri_1 composite type containing the bucket, file path and region of the S3 object
credentials | an aws_commons._aws_credentials_1 composite type containing the access key, secret key and session token
options | options passed to the COPY command in Postgres
endpoint_url | optional endpoint to use (e.g., `http://localhost:4566`)

##### Example
```postgresql
psql> SELECT * FROM aws_s3.query_export_to_s3(
   'select * from animals',
   aws_commons.create_s3_uri(
      'test-bucket',
      'animals2.csv',
      'us-east-1'
   ),
   aws_commons.create_aws_credentials(
      '<my_access_id>',
      '<my_secret_key>',
      '<session_token>'
   ),
   options := 'FORMAT CSV, DELIMITER '','', HEADER true'
);
```

If you set the AWS credentials:
```
psql> SET aws_s3.access_key_id TO 'dummy';
psql> SET aws_s3.secret_key TO 'dummy';
psql> SET aws_s3.session_token TO 'dummy';
```

you can omit the credentials.

##### Example
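A minimal sketch, assuming the settings above are in place: it reuses the same `s3_uri` as the previous example and lets `credentials` fall back to its default.
```postgresql
-- Sketch: credentials are omitted and fall back to the values set above.
psql> SELECT * FROM aws_s3.query_export_to_s3(
   'select * from animals',
   aws_commons.create_s3_uri(
      'test-bucket',
      'animals2.csv',
      'us-east-1'
   ),
   options := 'FORMAT CSV, DELIMITER '','', HEADER true'
);
```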
#### Using the function query_export_to_s3 with all the parameters

```
aws_s3.query_export_to_s3(
    query text,
    bucket text,
    file_path text,
    region text default null,
    access_key text default null,
    secret_key text default null,
    session_token text default null,
    options text default null,
    endpoint_url text default null
)
```

Parameter | Description
----------|------------
query | query that returns the data to export
bucket | S3 bucket
file_path | S3 path to the file
region | S3 region (e.g., `us-east-1`)
access_key | AWS access key id
secret_key | AWS secret key
session_token | optional session token
options | options passed to the COPY command in Postgres
endpoint_url | optional endpoint to use (e.g., `http://localhost:4566`)

##### Example
```postgresql
psql> SELECT * FROM aws_s3.query_export_to_s3(
   'select * from animals',
   'test-bucket',
   'animals.csv',
   'us-east-1',
   '<my_access_id>',
   '<my_secret_key>',
   '<session_token>',
   options := 'FORMAT CSV, HEADER true'
);

 rows_uploaded | files_uploaded | bytes_uploaded
---------------+----------------+----------------
             5 |              1 |             47
```

If you set the AWS credentials:
```
psql> SET aws_s3.access_key_id TO 'dummy';
psql> SET aws_s3.secret_key TO 'dummy';
psql> SET aws_s3.session_token TO 'dummy';
```

you can omit the credential fields.

### Docker Compose

We provide a docker compose config to run localstack and postgres in docker containers. To start it:
```
$ docker-compose up
```

It will initialize an S3 server on port 4566 with a bucket `test-bucket`:
```
aws s3 --endpoint-url=http://localhost:4566 ls s3://test-bucket
```

You can connect to the postgres server:
```
$ psql -h localhost -p 15432 -U test test
(password: test)
```

Initialize the extensions:
```
psql> CREATE EXTENSION plpython3u;
psql> CREATE EXTENSION aws_s3;
```

Set the endpoint url and the AWS keys to use S3 (in localstack you can set the AWS credentials to any non-empty string):
```
psql> SET aws_s3.endpoint_url TO 'http://localstack:4566';
psql> SET aws_s3.aws_access_key_id TO 'dummy';
psql> SET aws_s3.secret_access_key TO 'dummy';
```

Create a table `animals`:
```
psql> CREATE TABLE animals (
    name TEXT,
    age INT
);

psql> INSERT INTO animals (name, age) VALUES
('dog', 12),
('cat', 15),
('parrot', 103),
('tortoise', 205);
```

Export it to S3:
```
psql> select * from aws_s3.query_export_to_s3('select * from animals', 'test-bucket', 'animals.csv', 'us-east-1', options:='FORMAT CSV, HEADER true');
 rows_uploaded | files_uploaded | bytes_uploaded
---------------+----------------+----------------
             5 |              1 |             47
```

Import it back into another table:
```
psql> CREATE TABLE new_animals (LIKE animals);

psql> SELECT aws_s3.table_import_from_s3(
    'new_animals',
    '',
    '(FORMAT CSV, HEADER true)',
    'test-bucket',
    'animals.csv', 'us-east-1'
);
 table_import_from_s3
----------------------
                    4
(1 row)

psql> SELECT * FROM new_animals;
   name   | age
----------+-----
 dog      |  12
 cat      |  15
 parrot   | 103
 tortoise | 205
(4 rows)
```

## Contributors

* Oleksandr Yarushevskyi ([@oyarushe](https://github.com/oyarushe))
* Stephan Huiser ([@huiser](https://github.com/huiser))
* Jan Griesel ([@phileon](https://github.com/phileon))
* Matthew Painter ([@mjgp2](https://github.com/mjgp2))
* Justin Leto ([@jleto](https://github.com/jleto))

## Thanks

* Thomas Gordon Lowrey IV ([@gordol](https://github.com/gordol))