https://github.com/marcw/dgtools
Tools to work with Discogs data dumps
discogs discogs-dump parquet postgresql
- Host: GitHub
- URL: https://github.com/marcw/dgtools
- Owner: marcw
- License: mit
- Created: 2025-09-01T10:15:46.000Z (9 months ago)
- Default Branch: main
- Last Pushed: 2025-09-12T15:52:33.000Z (9 months ago)
- Last Synced: 2025-09-12T18:25:40.198Z (9 months ago)
- Topics: discogs, discogs-dump, parquet, postgresql
- Language: Go
- Homepage:
- Size: 33.2 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE.md
# dgtools
A command line utility to work with the Discogs data dumps.
It makes it super easy to:
- List data dumps
- Download a specific dump
- Convert dumps to ndjson or parquet
- Import a dump into a PostgreSQL database
## Usage
```
dgtools [global options] command [command options] [arguments...]
```
### Global Options
- `--discogs-bucket` - The URL of the Discogs data dumps (default: "https://discogs-data-dumps.s3.us-west-2.amazonaws.com")
## Commands
### dump
Work with Discogs data dump files.
#### dump list
List the files in the Discogs data dumps.
```
dgtools dump list [options]
```
**Options:**
- `--year` - Filter by year
- `--month` - Filter by month
- `--type` - Filter by data type
- `--no-table` - Don't print the table (output filenames only)
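The `--year`, `--month` and `--type` filters can be pictured as a match on fields encoded in each file name. The following is a minimal sketch, not the dgtools implementation: it assumes dump keys look like `data/2025/discogs_20250901_artists.xml.gz`, and `DumpFile`, `parseDumpKey` and `filterDumps` are hypothetical names introduced for illustration.

```go
package main

import (
	"fmt"
	"path"
	"regexp"
	"strconv"
)

// DumpFile holds the fields encoded in a dump file name (hypothetical type).
type DumpFile struct {
	Key   string
	Year  int
	Month int
	Type  string
}

// dumpName matches names such as "discogs_20250901_artists.xml.gz".
// This naming scheme is an assumption, not taken from the dgtools source.
var dumpName = regexp.MustCompile(`^discogs_(\d{4})(\d{2})\d{2}_([a-z]+)\.xml\.gz$`)

// parseDumpKey extracts year, month and data type from an object key.
func parseDumpKey(key string) (DumpFile, bool) {
	m := dumpName.FindStringSubmatch(path.Base(key))
	if m == nil {
		return DumpFile{}, false
	}
	year, _ := strconv.Atoi(m[1]) // digits are guaranteed by the regexp
	month, _ := strconv.Atoi(m[2])
	return DumpFile{Key: key, Year: year, Month: month, Type: m[3]}, true
}

// filterDumps keeps the files that match every criterion that is set.
// A zero year or month, or an empty type, means "do not filter on this field".
func filterDumps(keys []string, year, month int, typ string) []DumpFile {
	var out []DumpFile
	for _, key := range keys {
		f, ok := parseDumpKey(key)
		if !ok {
			continue // not a dump file, for example a checksum listing
		}
		if (year != 0 && f.Year != year) ||
			(month != 0 && f.Month != month) ||
			(typ != "" && f.Type != typ) {
			continue
		}
		out = append(out, f)
	}
	return out
}

func main() {
	keys := []string{
		"data/2025/discogs_20250801_labels.xml.gz",
		"data/2025/discogs_20250901_artists.xml.gz",
		"data/2025/discogs_20250901_CHECKSUM.txt",
	}
	// Roughly what --year 2025 --month 9 --no-table might print.
	for _, f := range filterDumps(keys, 2025, 9, "") {
		fmt.Println(f.Key)
	}
}
```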
#### dump structure
Dump the structure of an XML file.
```
dgtools dump structure [options] <file>
```
**Arguments:**
- `file` - The file to dump the structure of
**Options:**
- `--stop-after X` - Stop analysis after X records
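One way to derive the structure of a large XML file without loading it into memory is to stream tokens and count each element path. The sketch below uses Go's `encoding/xml` and is only an illustration of the idea, not the dgtools code: it assumes a "record" is a direct child of the root element, and `structure` and `structureOf` are hypothetical names.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"sort"
	"strings"
)

// structure streams an XML document and counts how often each element path occurs.
// A record is assumed to be a direct child of the root element.
// stopAfter <= 0 means "read the whole document".
func structure(r io.Reader, stopAfter int) (map[string]int, error) {
	dec := xml.NewDecoder(r)
	counts := map[string]int{}
	var stack []string
	records := 0
	for {
		tok, err := dec.Token()
		if err == io.EOF {
			return counts, nil
		}
		if err != nil {
			return nil, err
		}
		switch t := tok.(type) {
		case xml.StartElement:
			stack = append(stack, t.Name.Local)
			counts[strings.Join(stack, "/")]++
		case xml.EndElement:
			stack = stack[:len(stack)-1]
			if len(stack) == 1 { // a direct child of the root was just closed
				records++
				if stopAfter > 0 && records >= stopAfter {
					return counts, nil
				}
			}
		}
	}
}

// structureOf is a convenience wrapper for in-memory documents.
func structureOf(doc string, stopAfter int) (map[string]int, error) {
	return structure(strings.NewReader(doc), stopAfter)
}

func main() {
	// Made-up sample data, shaped like a tiny dump file.
	doc := `<artists><artist><id>1</id><name>A</name></artist>` +
		`<artist><id>2</id><name>B</name><aliases><name>C</name></aliases></artist></artists>`
	counts, err := structureOf(doc, 0)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	paths := make([]string, 0, len(counts))
	for p := range counts {
		paths = append(paths, p)
	}
	sort.Strings(paths)
	for _, p := range paths {
		fmt.Printf("%s (%d)\n", p, counts[p])
	}
}
```

Stopping early is useful here because the first few thousand records usually reveal most of the element paths, while a full pass over a multi-gigabyte dump takes much longer.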
#### dump download
Download a Discogs data dump.
```
dgtools dump download [options] <name>
```
**Arguments:**
- `name` - The file to download
**Options:**
- `--out-dir` - The output directory (default: ".")
- `--overwrite` - Force the download even if the file already exists
- `--checksum` - Check the checksum of the file after downloading (default: true)
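The checksum step can be sketched as hashing the downloaded file and comparing the digest with a published listing. This example assumes SHA-256 digests in `sha256sum`-style lines (`<hex>  <file>` or `<hex> *<file>`); both the algorithm and the listing format are assumptions, and `sha256Hex`, `expectedDigest`, `verify` and `verifyBytes` are hypothetical helpers, not dgtools functions.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"strings"
)

// sha256Hex streams r through SHA-256 and returns the lowercase hex digest.
func sha256Hex(r io.Reader) (string, error) {
	h := sha256.New()
	if _, err := io.Copy(h, r); err != nil {
		return "", err
	}
	return hex.EncodeToString(h.Sum(nil)), nil
}

// expectedDigest looks up the digest listed for name in sha256sum-style text.
func expectedDigest(checksums, name string) (string, bool) {
	for _, line := range strings.Split(checksums, "\n") {
		fields := strings.Fields(line)
		if len(fields) == 2 && strings.TrimPrefix(fields[1], "*") == name {
			return strings.ToLower(fields[0]), true
		}
	}
	return "", false
}

// verify reports whether data hashes to the digest listed for name.
func verify(data io.Reader, checksums, name string) (bool, error) {
	want, ok := expectedDigest(checksums, name)
	if !ok {
		return false, fmt.Errorf("no checksum listed for %q", name)
	}
	got, err := sha256Hex(data)
	if err != nil {
		return false, err
	}
	return got == want, nil
}

// verifyBytes is a convenience wrapper for in-memory data.
func verifyBytes(data []byte, checksums, name string) (bool, error) {
	return verify(bytes.NewReader(data), checksums, name)
}

func main() {
	// The digest below is the well-known SHA-256 of the ASCII string "abc".
	list := "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad *example.xml.gz\n"
	ok, err := verifyBytes([]byte("abc"), list, "example.xml.gz")
	fmt.Println(ok, err)
}
```

Streaming the file through `io.Copy` keeps memory use constant, which matters for dumps that are several gigabytes in size.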
#### dump convert
Convert a dump to a different format.
```
dgtools dump convert --out <file> [options] <name>
```
**Arguments:**
- `name` - The file to convert
**Options:**
- `--out` - The output file
- `--stop-after X` - Stop conversion after X records
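As an illustration of what an XML to NDJSON conversion involves, the sketch below decodes one record at a time and writes one JSON object per line. It is not the dgtools converter: the `Artist` struct is a deliberately small, assumed subset of an artist record, and `toNDJSON` and `convertString` are hypothetical names.

```go
package main

import (
	"encoding/json"
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// Artist is an assumed, minimal subset of an artist record.
type Artist struct {
	ID   int    `xml:"id" json:"id"`
	Name string `xml:"name" json:"name"`
}

// toNDJSON streams <artist> elements from r and writes one JSON object per line to w.
// It stops after stopAfter records when stopAfter > 0 and returns the number written.
func toNDJSON(r io.Reader, w io.Writer, stopAfter int) (int, error) {
	dec := xml.NewDecoder(r)
	enc := json.NewEncoder(w) // Encode appends a newline after each value
	n := 0
	for stopAfter <= 0 || n < stopAfter {
		tok, err := dec.Token()
		if err == io.EOF {
			break
		}
		if err != nil {
			return n, err
		}
		start, ok := tok.(xml.StartElement)
		if !ok || start.Name.Local != "artist" {
			continue
		}
		var a Artist
		// DecodeElement consumes the whole element, including nested children.
		if err := dec.DecodeElement(&a, &start); err != nil {
			return n, err
		}
		if err := enc.Encode(a); err != nil {
			return n, err
		}
		n++
	}
	return n, nil
}

// convertString is a convenience wrapper for in-memory documents.
func convertString(doc string, stopAfter int) (string, error) {
	var sb strings.Builder
	_, err := toNDJSON(strings.NewReader(doc), &sb, stopAfter)
	return sb.String(), err
}

func main() {
	// Made-up sample data.
	doc := `<artists><artist><id>1</id><name>A</name></artist>` +
		`<artist><id>2</id><name>B</name></artist></artists>`
	out, err := convertString(doc, 0)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(out)
}
```

Decoding record by record keeps only one record in memory at a time, and the `stopAfter` limit makes it cheap to test a conversion on a small sample first.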
### db
Work with a database.
**Options:**
- `--database-url` - The URL of the database to connect to (default: "postgres://$USER@localhost:5432/dgtools", can be set via DATABASE_URL environment variable)
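The lookup order for the database URL can be sketched as: explicit flag value, then the `DATABASE_URL` environment variable, then the documented default. Giving the flag precedence over the environment is an assumption based on common CLI conventions, and `resolveDatabaseURL` is a hypothetical helper, not part of dgtools.

```go
package main

import (
	"fmt"
	"net/url"
	"os"
)

// resolveDatabaseURL returns the flag value if set, else DATABASE_URL,
// else the documented default built from $USER.
// getenv is injected so the lookup can be exercised without touching the real environment.
func resolveDatabaseURL(flagValue string, getenv func(string) string) (string, error) {
	raw := flagValue
	if raw == "" {
		raw = getenv("DATABASE_URL")
	}
	if raw == "" {
		raw = "postgres://" + getenv("USER") + "@localhost:5432/dgtools"
	}
	u, err := url.Parse(raw)
	if err != nil {
		return "", err
	}
	if u.Scheme != "postgres" && u.Scheme != "postgresql" {
		return "", fmt.Errorf("unsupported database URL scheme %q", u.Scheme)
	}
	return raw, nil
}

func main() {
	dsn, err := resolveDatabaseURL("", os.Getenv)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(dsn)
}
```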
#### db prepare
Prepare the database for import by running migrations.
```
dgtools db prepare
```
#### db import
Import data from a dump file to the database.
```
dgtools db import <file>
```
**Arguments:**
- `file` - The file to import the data from
#### db nuke
Nuke the database by rolling back all migrations.
```
dgtools db nuke
```
## LICENSE
Please see [LICENSE.md](LICENSE.md)