{"id":50889481,"url":"https://github.com/echovisionlab/discogs-batch","last_synced_at":"2026-06-15T20:11:33.113Z","repository":{"id":47361762,"uuid":"242285880","full_name":"echovisionlab/discogs-batch","owner":"echovisionlab","description":"Discogs Data Dump Batch Module","archived":false,"fork":false,"pushed_at":"2022-12-02T03:02:41.000Z","size":7513,"stargazers_count":8,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2024-04-20T14:15:44.396Z","etag":null,"topics":["spring-batch"],"latest_commit_sha":null,"homepage":"","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/echovisionlab.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2020-02-22T05:32:17.000Z","updated_at":"2024-03-26T06:49:08.000Z","dependencies_parsed_at":"2023-01-22T19:45:55.888Z","dependency_job_id":null,"html_url":"https://github.com/echovisionlab/discogs-batch","commit_stats":null,"previous_names":["echovisionlab/discogs-batch"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/echovisionlab/discogs-batch","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/echovisionlab%2Fdiscogs-batch","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/echovisionlab%2Fdiscogs-batch/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/echovisionlab%2Fdiscogs-batch/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/echovisionlab%2Fdiscogs-batch/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/echovisionlab","download_url":"https://codeload.githu
b.com/echovisionlab/discogs-batch/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/echovisionlab%2Fdiscogs-batch/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34378461,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-15T02:00:07.085Z","response_time":63,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["spring-batch"],"created_at":"2026-06-15T20:11:32.033Z","updated_at":"2026-06-15T20:11:33.108Z","avatar_url":"https://github.com/echovisionlab.png","language":"Java","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Discogs Batch\n\n[![codecov](https://codecov.io/gh/state303/discogs-batch/branch/master/graph/badge.svg?token=SKVQUX2TKB)](https://codecov.io/gh/state303/discogs-batch)\n[![Build Status](https://www.travis-ci.com/state303/discogs-batch.svg?branch=master)](https://www.travis-ci.com/state303/discogs-batch)\n\n\n### ANNOUNCEMENT ⛑\nIt has been 4 years passed and yet I am too occupied.\nAll PRs are welcome, and I hope I will revisit this repository very soon.\n\nIf there's any issue, or some enhancements or flavours to be present in this software, raise an issue :)\n\nUpdated ERD: [Click Here](https://dbdocs.io/state303/OpenDiscogs)\n\n### ABOUT THIS PROJECT\n\nThe aim of the project is to replicate the entire dump data set given\nfrom 
[data.discogs.com](https://data.discogs.com).\n\nIn summary, the batch operates as following:\n\n    - Currently only supports postgresql for higher stability and maintainability\n    - One shot process with validations before firing jobs\n    - Idempotent actions; may run several times for same source without issue\n    - Supports dockerize, docker run with predefined batch commands\n\n### Built With\n\n[Spring-Batch](https://spring.io/projects/spring-batch)\n\n[Liquibase](https://www.liquibase.org)\n\n[JOOQ](https://www.jooq.org)\n\n[ProgressBar](https://github.com/ctongfei/progressbar)\n\n## Batch Commands\n\nCommands will be accepted regardless of -- mark \u003cb\u003eONLY IF\u003c/b\u003e gets arguments directly from jar\nfile. Also, there is no impact from giving arguments in certain order. However, it will NOT accept\nany duplicated arguments.\n\ni.e. --m will work, as well as -m, m will.\n\nBrief summary for the commands are as below...\n\n|    NAME    | SYNONYM  |      REQUIRED         | MIN | MAX | FORMAT    | DEFAULT |  NOTE |\n|------------|----------|-----------------------|-----|-----|-----------|---------|-------|\n| username   | user, u  | :heavy_check_mark:    | 1   | 1   | STRING    | NULL                                |\n| password   | pass, p  | :heavy_check_mark:    | 1   | 1   | STRING    | NULL                                |\n| url        |          | :heavy_check_mark:    | 1   | 1   | addr:port | jdbc:postgresql://localhost:5432/discogs |\n| type       | t        | :black_square_button: | 1   | 4   | a,b,...   
| ARTIST, MEMBER, LABEL, RELEASE_ITEM                     |\n| chunk_size | chunk, c | :black_square_button: | 1   | 1   | 0 \u003c N     | 3000    |\n| core_count | core     | :black_square_button: | 1   | 1   | 0 \u003c N     | 80% of core from runtime |\n| year       | y        | :black_square_button: | 1   | 1   | yyyy      | CURRENT | this or year_month.\n| year_month | ym       | :black_square_button: | 1   | 1   | yyyy-mm   | CURRENT | this or year.\n| etag       | e        | :black_square_button: | 1   | 4   | a,b,...   | MOST_RECENT | overrides type, date.\n| mount      | m        | :black_square_button: | 0   | 0   | NONE      | -       | keep dump file\n| strict     | s        | :black_square_button: | 0   | 0   | NONE      | -       | only perform specified type or ETag\n\n### Required Arguments\n\nIt is important to note that there are three required arguments.\n\n##### username\n\nUsername of the target database server. This will automatically be encoded to UTF-8. The user must\nhave sufficient permissions to create and modify the given schema or database.\n\n##### password\n\nPassword for the username given. This will automatically be encoded to UTF-8.\n\n##### url\n\nURL for the target database. The expected releaseFormat for the url would be...\n\n```text\n--url=jdbc://postgresql://{server_address}:{port}/{target_database}\n```\n\nIf you prefer to use specific database, please make sure to set it to the db prior to run batch,\notherwise the process will fail with messages.\n\nif target_database is missing, will be set to discogs as default.\n\nIt is important to note that if given schema or database is empty, this batch will automatically\ncreate tables via liquibase and sql.\n\n### Year, Year Month, Type and ETag.\n\nFirst and foremost, by specifying the ETag, any arguments given for year, year-month, type will be\nignored. 
This is intended behavior, as each dump relies on other dump types from the specified year and\nmonth.\n\nApart from ETag, note that providing both year and year-month at the same time is\nnot supported.\n\nFinally, types cannot be duplicated.\n\nIf you specify a year and a type, for example, the batch will automatically fetch and process the target\ndump INCLUDING the dumps it depends on.\n\n### Dependencies\n\nThe dependencies between dump types are as follows:\n\n|    TYPE   |       REQUIRES        |\n|-----------|-----------------------|\n| ARTIST    | -                     |\n| LABEL     | -                     |\n| MASTER    | ARTIST, LABEL         |\n| RELEASE   | ARTIST, LABEL, MASTER |\n\nThe jobs are always executed in the following order:\n\n```text\nARTIST \u003e LABEL \u003e MASTER \u003e RELEASE\n```\n\nIf you run the batch with the following arguments:\n\u003e url=[?] user=[?] pass=[?] year-month=2021-3 type=release\n\nThe batch will be executed with the artist, label, master, and release dumps from March 2021.\n\nIf you do not specify any options and simply call the batch with username, password, and url, then\nthe batch will be executed with the most recent artist, label, master, and release dumps.\n\n### Mount and Strict\n\n##### Mount\n\nIf the mount option is specified, the file downloaded from Discogs will not be removed. 
This\nmay be useful if you need to keep the downloaded dump.\n\n##### Strict\n\nThis option does not resolve any dependencies; the batch simply executes with the given ETag or type.\n\n### Concurrency\n\nBy default, the application uses 80% of the cores available on the running system.\nThe core count argument overrides this default, and its value is validated accordingly.\n\nThe core count cannot exceed 80% of the machine's cores, so a value above that limit\nwill simply be ignored.\n\nA negative core count is also ignored, and the\ncore count falls back to the default (80%).\n\n### Chunk Size\n\nThe default chunk-size is 500; however, in an average environment, I would recommend setting it to 100~200. This depends entirely on the I/O spec and postgres settings of the running client and database server, so feel free to experiment with it.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fechovisionlab%2Fdiscogs-batch","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fechovisionlab%2Fdiscogs-batch","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fechovisionlab%2Fdiscogs-batch/lists"}