{"id":14384093,"url":"https://github.com/aa8y/docker-dataset","last_synced_at":"2025-05-03T16:32:04.360Z","repository":{"id":79893859,"uuid":"97658551","full_name":"aa8y/docker-dataset","owner":"aa8y","description":"Docker database images with pre-populated data for testing and/or practice.","archived":false,"fork":false,"pushed_at":"2019-01-22T04:22:40.000Z","size":15,"stargazers_count":36,"open_issues_count":0,"forks_count":5,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-04-07T18:25:45.423Z","etag":null,"topics":["alpine","databases-populated","dataset","docker-image","dummy","lightweight","postgresql","thin"],"latest_commit_sha":null,"homepage":"https://hub.docker.com/r/aa8y/postgres-dataset/","language":"Dockerfile","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/aa8y.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2017-07-19T01:19:24.000Z","updated_at":"2024-06-06T16:24:13.000Z","dependencies_parsed_at":null,"dependency_job_id":"5bed5109-5b36-4659-8940-dcc9f2997228","html_url":"https://github.com/aa8y/docker-dataset","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aa8y%2Fdocker-dataset","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aa8y%2Fdocker-dataset/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aa8y%2Fdocker-dataset/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aa8y%2Fdocker-dataset/manifests","owner_url":"https://repos.ecosy
ste.ms/api/v1/hosts/GitHub/owners/aa8y","download_url":"https://codeload.github.com/aa8y/docker-dataset/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":252216094,"owners_count":21713098,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["alpine","databases-populated","dataset","docker-image","dummy","lightweight","postgresql","thin"],"created_at":"2024-08-28T18:01:06.739Z","updated_at":"2025-05-03T16:32:04.070Z","avatar_url":"https://github.com/aa8y.png","language":"Dockerfile","readme":"# Docker Dataset\n\n[![Build Status](https://travis-ci.org/aa8y/docker-dataset.svg?branch=master)](https://travis-ci.org/aa8y/docker-dataset)\n\nHave you ever wanted to access pre-populated databases with dummy but valid data? The use case can be anything from practicing SQL queries to running tests against a database. In such cases, you either have to create dummy data yourself or search the internet for data to populate your database. I think this is a common enough requirement that the solution can be Dockerized for reuse. So here is a Docker image for [PostgreSQL](https://www.postgresql.org/) with databases populated with sample data.\n\n## Datasets\n\nThe images currently use the following datasets:\n* [Postgres Sample Databases](https://wiki.postgresql.org/wiki/Sample_Databases): The datasets being used from here are `dellstore2` (tagged `dellstore`), `iso3166`, `sportsdb`, `usda` and `world`. pgFoundry has been down for a few days now. 
Therefore, we have switched the URLs to their FTP sources [here](https://www.postgresql.org/ftp/projects/pgFoundry/dbsamples/).\n\n## Databases\n\nThe only database supported so far is [PostgreSQL](https://www.postgresql.org/). We use the `alpine` version of the official image as the base image to keep our image slim.\n\n## Tags\n\nAvailable tags are `dellstore`, `iso3166`, `sportsdb`, `usda`, `world`, `all` and `latest`. `all` and `latest` refer to the same image, which contains all the datasets, each loaded into its own database. Each of the remaining tags refers to an image containing a single dataset.\n\n### `pagila` has been removed\n\nThe `pagila` tag has been removed because it had been broken for a while and also broke the `all` and `latest` tags. The Pagila dataset we were using had a change which was not compatible with any version of Postgres (see [#1](https://github.com/aa8y/docker-dataset/issues/1) and [this issue](https://github.com/devrimgunduz/pagila/issues/6) for context).\n\n## Usage\n\nYou can start the container by running:\n```\ndocker run -d --name pg-ds-\u003ctag\u003e aa8y/postgres-dataset:\u003ctag\u003e\n```\nand access it by running:\n```\ndocker exec -it pg-ds-\u003ctag\u003e psql -d \u003cdb_name\u003e\n```\nwhere `\u003ctag\u003e` is one of the tags mentioned [here](#tags) and `\u003cdb_name\u003e` is the database name, which is one of the dataset names mentioned [here](#datasets). You can also use the images with `docker-compose`. 
See [this example](https://github.com/aa8y/data-dude/blob/master/docker-compose.yml) for information on how to use them.\n\n## Custom images\n\nIf you want to build a custom image containing only some of the datasets, rather than one or all of them, you can do so with:\n```\ndocker build -t aa8y/postgres-dataset:some --build-arg DATASETS=dellstore,world .\n```\nand then follow the [aforementioned](#usage) steps to use your custom image.\n\n## Future Work\n\n* Images for other popular databases like [MySQL](https://www.mysql.com/).\n* Add the `french-towns-communes-francais` dataset from [Postgres Sample Databases](https://wiki.postgresql.org/wiki/Sample_Databases).\n* Find and add more free data sources.\n","funding_links":[],"categories":["Dockerfile"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faa8y%2Fdocker-dataset","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Faa8y%2Fdocker-dataset","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faa8y%2Fdocker-dataset/lists"}