{"id":13588845,"url":"https://github.com/openzim/mwoffliner","last_synced_at":"2025-05-15T01:10:14.007Z","repository":{"id":37587633,"uuid":"51273903","full_name":"openzim/mwoffliner","owner":"openzim","description":"MediaWiki scraper: all your wiki articles in one highly compressed ZIM file","archived":false,"fork":false,"pushed_at":"2025-05-09T18:10:26.000Z","size":10771,"stargazers_count":351,"open_issues_count":200,"forks_count":88,"subscribers_count":18,"default_branch":"main","last_synced_at":"2025-05-10T07:41:55.793Z","etag":null,"topics":["archive","mediawiki","nodejs","offline","openzim","scraper","wikipedia","zim"],"latest_commit_sha":null,"homepage":"https://www.npmjs.com/package/mwoffliner","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/openzim.png","metadata":{"files":{"readme":"README.md","changelog":"Changelog","contributing":"CONTRIBUTING.md","funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null},"funding":{"github":"kiwix","patreon":null,"open_collective":null,"ko_fi":null,"tidelift":null,"community_bridge":null,"liberapay":null,"issuehunt":null,"otechie":null,"custom":null}},"created_at":"2016-02-08T01:01:27.000Z","updated_at":"2025-05-09T18:10:29.000Z","dependencies_parsed_at":"2024-05-16T13:30:50.000Z","dependency_job_id":"7d07810b-ad86-478c-be32-5e1dadc4360a","html_url":"https://github.com/openzim/mwoffliner","commit_stats":{"total_commits":2699,"total_committers":64,"mean_commits":42.171875,"dds":0.7669507224898111,"last_synced_commit":"cec75bde634da818242a829798b589529fe2bff2"},"previous_names":["kiwix/mwoffliner"],"tags_count":69,"template":false,"templ
ate_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/openzim%2Fmwoffliner","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/openzim%2Fmwoffliner/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/openzim%2Fmwoffliner/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/openzim%2Fmwoffliner/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/openzim","download_url":"https://codeload.github.com/openzim/mwoffliner/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253384053,"owners_count":21899926,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["archive","mediawiki","nodejs","offline","openzim","scraper","wikipedia","zim"],"created_at":"2024-08-01T15:06:58.703Z","updated_at":"2025-05-15T01:10:08.998Z","avatar_url":"https://github.com/openzim.png","language":"TypeScript","funding_links":["https://github.com/sponsors/kiwix"],"categories":["TypeScript"],"sub_categories":[],"readme":"# MWoffliner\n\nMWoffliner is a tool for making a local offline HTML snapshot of any\nonline [MediaWiki](https://mediawiki.org) instance. It goes through\nall online articles (or a selection if specified) and creates the\ncorresponding [ZIM](https://openzim.org) file. 
It has mainly been\ntested against Wikimedia projects like\n[Wikipedia](https://wikipedia.org) and\n[Wiktionary](https://wiktionary.org) --- but it should also work for\nany recent MediaWiki.\n\nRead [CONTRIBUTING.md](./CONTRIBUTING.md) to learn more about\nMWoffliner development.\n\nUser help is available in the form of a\n[FAQ](https://github.com/openzim/mwoffliner/wiki/Frequently-Asked-Questions).\n\n[![NPM](https://nodei.co/npm/mwoffliner.png)](https://www.npmjs.com/package/mwoffliner)\n\n[![npm](https://img.shields.io/npm/v/mwoffliner.svg)](https://www.npmjs.com/package/mwoffliner)\n[![node](https://img.shields.io/node/v/mwoffliner.svg)](https://www.npmjs.com/package/mwoffliner)\n[![Docker](https://ghcr-badge.egpl.dev/openzim/mwoffliner/latest_tag?label=container)](https://ghcr.io/openzim/mwoffliner)\n[![Build Status](https://github.com/openzim/mwoffliner/workflows/CI/badge.svg?query=branch%3Amain)](https://github.com/openzim/mwoffliner/actions/workflows/ci.yml?query=branch%3Amain)\n[![codecov](https://codecov.io/gh/openzim/mwoffliner/branch/main/graph/badge.svg)](https://codecov.io/gh/openzim/mwoffliner)\n[![CodeFactor](https://www.codefactor.io/repository/github/openzim/mwoffliner/badge)](https://www.codefactor.io/repository/github/openzim/mwoffliner)\n[![License](https://img.shields.io/npm/l/mwoffliner.svg)](LICENSE)\n[![Join Slack](https://img.shields.io/badge/Join%20us%20on%20Slack%20%23mwoffliner-2EB67D)](https://slack.kiwix.org)\n\n## Features\n\n- Scrape with or without image thumbnails\n- Scrape with or without audio/video multimedia content\n- S3 cache (optional)\n- Image size optimiser / Webp converter\n- Scrape all articles in namespaces or from a title list\n- Specify additional/non-main namespaces to scrape\n\nRun `mwoffliner --help` to get all the possible options.\n\n## Prerequisites\n\n- *NIX Operating System (GNU/Linux, macOS, ...)\n- [Redis](https://redis.io/)\n- [NodeJS](https://nodejs.org/en/) version 22 (we support only one single Node.JS 
version, other versions might work or not)\n- [Libzim](https://github.com/openzim/libzim) (On GNU/Linux \u0026 macOS we automatically download it)\n- Various build tools which are probably already installed on your\n  machine (packages `libjpeg-dev`, `libglu1`, `autoconf`, `automake`, `gcc` on\n  Debian/Ubuntu)\n\n... and an online MediaWiki with its API available.\n\n## Usage\n\nTo install MWoffliner globally:\n```bash\nnpm i -g mwoffliner\n```\n\nYou might need to run this command with the `sudo` command, depending\non how your `npm` is configured.\n\n`npm` permission checking can be a bit annoying for a\nnewcomer. Please read the documentation carefully if you hit\nproblems: https://docs.npmjs.com/cli/v7/using-npm/scripts#user\n\nThen to run it:\n```bash\nmwoffliner --help\n```\n\nTo install and run it locally:\n```bash\nnpm i\nnpm run mwoffliner -- --help\n```\n\nTo use MWoffliner with an S3 cache, you should provide an S3 URL like\nthis:\n```bash\n--optimisationCacheUrl=\"https://wasabisys.com/?bucketName=my-bucket\u0026keyId=my-key-id\u0026secretAccessKey=my-sac\"\n```\n\n## API\n\nMWoffliner also provides an API and can therefore be used as a NodeJS\nlibrary. 
Here is a stub example that could go in your index.mjs file:\n```javascript\nimport * as mwoffliner from 'mwoffliner';\n\nconst parameters = {\n    mwUrl: \"https://es.wikipedia.org\",\n    adminEmail: \"foo@bar.net\",\n    verbose: true,\n    format: \"nopic\",\n    articleList: \"./articleList\"\n};\nmwoffliner.execute(parameters); // returns a Promise\n```\n\n## Background\n\nComplementary information about MWoffliner:\n\n* MediaWiki software is used by thousands of wikis, the most\n  famous ones being the Wikimedia ones, including [Wikipedia](https://wikipedia.org).\n* MediaWiki is a PHP wiki runtime engine.\n* Wikitext is the name of the markup language that MediaWiki uses.\n* MediaWiki includes a parser that converts Wikitext into HTML, and this\n  parser creates the HTML pages displayed in your browser.\n\nLicense\n-------\n\n[GPLv3](https://www.gnu.org/licenses/gpl-3.0) or later, see\n[LICENSE](LICENSE) for more details.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fopenzim%2Fmwoffliner","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fopenzim%2Fmwoffliner","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fopenzim%2Fmwoffliner/lists"}