{"id":20250628,"url":"https://github.com/svick/incremental-dumps","last_synced_at":"2025-10-04T18:54:19.302Z","repository":{"id":9479810,"uuid":"11366935","full_name":"svick/Incremental-Dumps","owner":"svick","description":null,"archived":false,"fork":false,"pushed_at":"2017-03-09T19:58:33.000Z","size":720,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":3,"default_branch":"gsoc","last_synced_at":"2025-03-03T16:27:07.250Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"C++","has_issues":false,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/svick.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2013-07-12T11:09:25.000Z","updated_at":"2017-03-08T21:12:11.000Z","dependencies_parsed_at":"2022-09-15T22:12:35.946Z","dependency_job_id":null,"html_url":"https://github.com/svick/Incremental-Dumps","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/svick/Incremental-Dumps","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/svick%2FIncremental-Dumps","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/svick%2FIncremental-Dumps/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/svick%2FIncremental-Dumps/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/svick%2FIncremental-Dumps/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/svick","download_url":"https://codeload.github.com/svick/Incremental-Dumps/tar.gz/refs/heads/gsoc","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/svick%
2FIncremental-Dumps/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":278358489,"owners_count":25973949,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-04T02:00:05.491Z","response_time":63,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-14T09:59:21.589Z","updated_at":"2025-10-04T18:54:19.255Z","avatar_url":"https://github.com/svick.png","language":"C++","funding_links":[],"categories":[],"sub_categories":[],"readme":"[Incremental dumps][1] are an improved format for dumps of content from Wikimedia wikis.\n\n## Compiling\n\n### Linux\n\nYou will need CMake 2.8 and GCC 4.6 or newer.\nTo compile this project, run:\n\n    cmake .\n    make\n\n### Windows\n\nOpen the solution in Visual Studio 2012 and build it. Visual Studio Express 2012 might also work, but that is untested.\n\n## Running the application\n\nCompiling produces a command-line application `idumps` (or `idumps.exe` on Windows).\n\nRunning it without parameters prints a short usage message that explains the possible actions and the meaning of their parameters.\n\n### For dump readers\n\nThe following actions are useful for normal users of dumps, i.e. 
those who want to download and process them, not to create their own dumps.\nIn the future, there might be a special version of `idumps` that contains only these actions.\n\n#### Reading a dump\n\nThe `r` (or `read`) action is for reading a dump and converting it to XML.\nIt takes two parameters: the path to the dump file and the path to the generated XML file.\nIf the XML file already exists, it will be overwritten.\n\nExample:\n\n    idumps r dump.id dump.xml\n\n#### Applying a diff dump\n\nThe `a` (or `apply`) action is for applying a diff dump to an existing normal dump to update it.\nIt takes two parameters: the path to the dump file and the path to the diff dump file.\nIf the diff cannot be applied to the dump (because it is for a different wiki, or for a dump with a different timestamp), an error is printed.\nApplying the diff also updates the timestamp of the dump.\n\nExample:\n\n    idumps a dump.id diff.dd\n\n### For dump creators\n\nThe following actions are for dump creators, usually those who want to create a dump of their wiki.\n\n#### Updating a dump from a wiki\n\nThe `u` (or `update`) action is for updating or creating a dump by communicating with a wiki.\nThe first five parameters are:\n\n* Name of the wiki (e.g. `enwiki`). If the dump already exists, this has to match the name stored in it.\n\n* New timestamp for the updated dump. If the dump already exists, this has to be different from its current timestamp.\n\n Timestamps are used in diff dumps to make sure that only the right diff can be applied to the dump.\n\n* Path to the PHP interpreter. This can be simply `php`, if `php` is in `PATH`.\n\n* Path to the dumpBackup maintenance script, possibly with optional parameters (e.g. 
`--report`).\n\n The additional parameters `--full --stub` are appended automatically when the script is called.\n\n* Path to the fetchText maintenance script.\n\nThe remaining parameters specify which dumps to update or create; each group of two or three parameters represents a separate dump.\n\nThe parameters in each group are:\n\n* Dump specification.\n\n* Path to the dump file.\n\n* Path to the created diff dump (only if a diff was requested in the specification).\n\n If the file already exists, it will be overwritten.\n\nThe specification is a 2- to 4-letter string that describes what kind of dump to create:\n\n* 1st letter: `p` for a pages dump or `s` for a stub dump\n* 2nd letter: `h` for a history dump or `c` for a current dump\n* 3rd letter (optional): `a` for an articles dump (without talk and User namespaces)\n* 4th letter (optional): `d` to also create a diff dump\n\nExample:\n\n    idumps u enwiki 20130823 php \"/var/www/maintenance/dumpBackup.php --report=10000\" /var/www/maintenance/fetchText.php pca pages-articles.id shd stub-history.id stub-history.dd\n\nThis sets the name of the wiki to `enwiki` and the timestamp to `20130823`. The PHP interpreter is in `PATH`, MediaWiki is installed in `/var/www`, and dumpBackup will report progress every 10000 revisions.\n\nThe updated or created dumps are a pages-current-articles dump `pages-articles.id` and a stub-history dump `stub-history.id`, which also gets a diff dump `stub-history.dd`.\n\n#### Creating a dump from XML\n\nThe `c` (or `create`) action is for creating an incremental dump based on a pages-history XML dump.\n\nThe parameters are similar to those of the `update` action:\n\n* Name of the wiki.\n* Timestamp of the created dump.\n* Path to the source XML dump. If the path is `-`, the source XML is read from the standard input. 
This can be useful for reading a compressed XML dump without first decompressing it to an intermediate file.\n\nThe remaining parameters specify which dumps to create, just as in `update`.\n\nThere is also an optional parameter `--report`, which has to be followed by a number *n* specifying that progress should be reported every *n* revisions.\nIf *n* is 0, progress reporting is turned off. If this parameter is specified, it has to come right before the “name of the wiki” parameter.\n\nExample:\n\n    idumps c enwiki 20130823 enwiki-20130823-pages-meta-history.xml sc sc.id\n\n[1]: http://www.mediawiki.org/wiki/User:Svick/Incremental_dumps\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsvick%2Fincremental-dumps","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsvick%2Fincremental-dumps","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsvick%2Fincremental-dumps/lists"}