{"id":18324289,"url":"https://github.com/maxwell175/stackexchangedumpconverter","last_synced_at":"2025-04-13T00:15:13.263Z","repository":{"id":250637286,"uuid":"834958177","full_name":"Maxwell175/StackExchangeDumpConverter","owner":"Maxwell175","description":"A tool to convert the Stack Exchange site dumps to various destinations.","archived":false,"fork":false,"pushed_at":"2024-08-19T02:42:54.000Z","size":59,"stargazers_count":6,"open_issues_count":0,"forks_count":2,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-04-13T00:15:06.180Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"C#","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"agpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Maxwell175.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-07-28T20:24:40.000Z","updated_at":"2025-02-18T14:30:06.000Z","dependencies_parsed_at":"2024-08-08T01:56:46.298Z","dependency_job_id":"8fd83e08-a34a-454f-8933-b5b68498d236","html_url":"https://github.com/Maxwell175/StackExchangeDumpConverter","commit_stats":null,"previous_names":["maxwell175/stackexchangedumpconverter"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Maxwell175%2FStackExchangeDumpConverter","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Maxwell175%2FStackExchangeDumpConverter/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Maxwell175%2FStackExchangeDumpConverter/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/
GitHub/repositories/Maxwell175%2FStackExchangeDumpConverter/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Maxwell175","download_url":"https://codeload.github.com/Maxwell175/StackExchangeDumpConverter/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248647274,"owners_count":21139086,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-05T18:33:33.241Z","updated_at":"2025-04-13T00:15:13.234Z","avatar_url":"https://github.com/Maxwell175.png","language":"C#","funding_links":[],"categories":[],"sub_categories":[],"readme":"# StackExchange Dump Converter\n\nThis is a tool that takes a StackExchange site data dump and imports it to various destinations.\n\n## Destinations\n\nAll relational database destinations include foreign keys.\n\n* Relational Databases\n  * PostgreSQL\n  * SQL Server\n  * SQLite\n\n## Data Fixes\n\nThis tool detects references to missing Posts or Users and adds dummy records to compensate.\n\n## Compiling\n\n1. Make sure you have .NET 8.0 SDK installed.\n2. 
Run `dotnet build` at the root of this repo.\n\n## How to Use\n\nSee the help output by running the tool with the `-h` option.\n\nHere is a sample command to import the dump of the Unix SE site to a local PostgreSQL instance with a large batch size:\n\n```shell\n./bin/Release/net8.0/StackExchangeDumpConverter \\\n    -d postgres --postgres-user testuser --postgres-pass testpass \\\n    --postgres-db unix --postgres-replace true --postgres-batch-size 1000000 unix.stackexchange.com.7z\n```\n\nThere is also a sample `convertall.sh` file that the reader can adapt to convert all \ndata dumps in an automated manner.\n\n## Performance\n\nTo improve the performance of the load, it is recommended to increase the batch size. However, this comes at \nthe cost of increased memory usage.\n\nLoading the StackOverflow data dump using a batch size of 1000000 takes 12 hr 6 min with a peak memory usage of 7.7 GB.\n\n## Pre-converted files\n\nThis is a list of magnet links to torrent downloads for pre-converted dumps using this tool.\n\n* April 2024 Data Dump\n  * PostgreSQL: `magnet:?xt=urn:btih:3e61815212358b0a677fa3ed11963d0bcdc0c31d\u0026dn=stackexchange_postgresql\u0026tr=http%3A%2F%2Fbt2.archive.org%3A6969%2Fannounce\u0026tr=http%3A%2F%2Fbt1.archive.org%3A6969%2Fannounce`\n  * SQLite: `magnet:?xt=urn:btih:a9396346c837c087184158b4dbe60f7dedb77e06\u0026dn=stackexchange_sqlite\u0026tr=http%3A%2F%2Fbt1.archive.org%3A6969%2Fannounce\u0026tr=http%3A%2F%2Fbt2.archive.org%3A6969%2Fannounce`\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmaxwell175%2Fstackexchangedumpconverter","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmaxwell175%2Fstackexchangedumpconverter","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmaxwell175%2Fstackexchangedumpconverter/lists"}