{"id":15696810,"url":"https://github.com/ironholds/mwparser","last_synced_at":"2025-05-08T23:28:47.281Z","repository":{"id":77017917,"uuid":"92435981","full_name":"Ironholds/mwparser","owner":"Ironholds","description":"A parser for Wikimarkup","archived":false,"fork":false,"pushed_at":"2017-08-02T01:51:30.000Z","size":30,"stargazers_count":7,"open_issues_count":0,"forks_count":1,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-03-31T19:21:19.550Z","etag":null,"topics":["parser","r","wikimarkup","wikimedia","wikipedia"],"latest_commit_sha":null,"homepage":null,"language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Ironholds.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-05-25T19:12:35.000Z","updated_at":"2021-07-16T15:32:04.000Z","dependencies_parsed_at":null,"dependency_job_id":"94a7422b-a532-4b2a-b9b6-e5acc742f002","html_url":"https://github.com/Ironholds/mwparser","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Ironholds%2Fmwparser","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Ironholds%2Fmwparser/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Ironholds%2Fmwparser/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Ironholds%2Fmwparser/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Ironholds","download_url":"https://codeload.github.com/Ironholds/
mwparser/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253162682,"owners_count":21863960,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["parser","r","wikimarkup","wikimedia","wikipedia"],"created_at":"2024-10-03T19:10:01.482Z","updated_at":"2025-05-08T23:28:47.263Z","avatar_url":"https://github.com/Ironholds.png","language":"R","funding_links":[],"categories":[],"sub_categories":[],"readme":"## Wikimarkup parsing in R\nA package for parsing, chunking and modifying wikimarkup in R.\n\n__Author:__ Oliver Keyes\u003cbr/\u003e\n__License:__ [MIT](http://opensource.org/licenses/MIT)\u003cbr/\u003e\n__Status:__ In development\n\n[![Travis-CI Build Status](https://travis-ci.org/Ironholds/mwparser.svg?branch=master)](https://travis-ci.org/Ironholds/mwparser)\n\n### Description\n\nWikimarkup is the language used on Wikipedia and similar projects, and as such contains\na lot of valuable data, both for scientists studying collaborative systems and for people\nstudying things documented on or in Wikipedia. 
`mwparser` parses wikimarkup, allowing a\nuser to filter down to specific types of tags such as links or templates, and then extract components of those tags.\n\n### Example\n\n```\nlibrary(mwparser)\nlibrary(magrittr)\nwikitext \u003c- \"this is wikitext with \\n [[a|link]] [[or|two]]\"\n\n# Parse the markup, keep only the wikilinks, then extract their paths\nlink_paths \u003c- parse_wikitext(wikitext) %\u003e%\n  get_wikilinks %\u003e%\n  wikilink_paths(text = TRUE)\n\nlink_paths\n#\u003e [1] \"a\"  \"or\"\n```\n\n### Installation\n\n`mwparser` depends on two things: the [reticulate](https://rstudio.github.io/reticulate/) R package and the Python library [mwparserfromhell](https://github.com/earwig/mwparserfromhell). To install the whole stack, assuming you have `pip`:\n\n```\n# In the terminal\npip install mwparserfromhell\n\n# In R\ninstall.packages(\"reticulate\")\ndevtools::install_github(\"ropenscilabs/mwparser\")\n```\n\nWith that, you're good to go!\n\n### Future work\nThe library currently has accessors to extract the most common types of tag and the components within them. The next step is exposing the rest of `mwparserfromhell`'s functionality, which includes:\n\n1. More accessors;\n2. The ability to modify wikimarkup pages and their component elements;\n3. The ability to write out the resulting, modified markup.\n\nSome time after that, the goal is to integrate MediaWiki's actual parser as a replacement for the `mwparserfromhell` dependency, using [piton](https://github.com/Ironholds/piton).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fironholds%2Fmwparser","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fironholds%2Fmwparser","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fironholds%2Fmwparser/lists"}