{"id":19703474,"url":"https://github.com/openrefine/commonsextension","last_synced_at":"2025-04-29T14:30:44.951Z","repository":{"id":37796799,"uuid":"452775514","full_name":"OpenRefine/CommonsExtension","owner":"OpenRefine","description":"An OpenRefine extension that helps with Wikimedia Commons editing: start projects from Wikimedia Commons categories; Commons-specific GREL functions.","archived":false,"fork":false,"pushed_at":"2025-03-31T09:02:03.000Z","size":351,"stargazers_count":16,"open_issues_count":33,"forks_count":10,"subscribers_count":25,"default_branch":"master","last_synced_at":"2025-04-05T17:51:08.568Z","etag":null,"topics":["extension","java","openrefine","sdc","wikicommons","wikimedia"],"latest_commit_sha":null,"homepage":"","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/OpenRefine.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":".github/FUNDING.yml","license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null},"funding":{"github":["openrefine"],"patreon":null,"open_collective":null,"ko_fi":null,"tidelift":null,"community_bridge":null,"liberapay":null,"issuehunt":null,"otechie":null,"lfx_crowdfunding":null,"custom":["https://donorbox.org/open-refine"]}},"created_at":"2022-01-27T17:18:34.000Z","updated_at":"2025-03-31T09:02:00.000Z","dependencies_parsed_at":"2024-12-26T09:19:08.972Z","dependency_job_id":"96f7ea62-a67c-46ee-a8ee-e273403eabdb","html_url":"https://github.com/OpenRefine/CommonsExtension","commit_stats":null,"previous_names":[],"tags_count":6,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/reposi
tories/OpenRefine%2FCommonsExtension","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/OpenRefine%2FCommonsExtension/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/OpenRefine%2FCommonsExtension/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/OpenRefine%2FCommonsExtension/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/OpenRefine","download_url":"https://codeload.github.com/OpenRefine/CommonsExtension/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":251518787,"owners_count":21602211,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["extension","java","openrefine","sdc","wikicommons","wikimedia"],"created_at":"2024-11-11T21:17:58.972Z","updated_at":"2025-04-29T14:30:42.748Z","avatar_url":"https://github.com/OpenRefine.png","language":"Java","funding_links":["https://github.com/sponsors/openrefine","https://donorbox.org/open-refine"],"categories":[],"sub_categories":[],"readme":"# Wikimedia Commons Extension for OpenRefine\n\u003cimg align=\"right\" width=\"160\" src=\"https://upload.wikimedia.org/wikipedia/commons/4/4a/Commons-logo.svg\"\u003e\n\nThis extension provides several helpful functionalities for OpenRefine users who want to edit (structured data of) **media files** (images, videos, PDFs...) on **[Wikimedia Commons](https://commons.wikimedia.org)**. 
For more info, documentation and how-tos about OpenRefine for Wikimedia Commons, see **https://commons.wikimedia.org/wiki/Commons:OpenRefine**.\n\nFeatures included in this extension:\n* Start an OpenRefine project by loading file names from one or more **Wikimedia Commons categories** (including category depth)\n* Add **columns** with Commons categories and/or M-ids of each file name\n* File names will already be **reconciled** when starting the project\n* A few dedicated **GREL commands** allow basic processing and extraction of Wikitext: `extractFromTemplate` and `value.extractCategories`\n* (In this extension's 0.1.1 release and later) Basic support for **file thumbnail previews** of existing Wikimedia Commons files. Thumbnails are displayed for some (but not all) file types/extensions. There is currently thumbnail support for jpeg, gif, png, djvu, pdf, svg, webm and ogv files.\n\nIt works with **OpenRefine 3.6.x and later versions of OpenRefine**. It is not compatible with OpenRefine 3.5.x or earlier. *(OpenRefine supports editing Wikimedia Commons from version 3.6; this is not possible in earlier versions.)*\n\n*This extension was first released in October 2022. It has been funded by a [Wikimedia project grant](https://meta.wikimedia.org/wiki/Grants:Project/CS%26S/Structured_Data_on_Wikimedia_Commons_functionalities_in_OpenRefine).*\n\n## How to use this extension\n\n### Install this extension in OpenRefine\n\nDownload the .zip file of the [latest release of this extension](https://github.com/OpenRefine/CommonsExtension/releases).\nUnzip this file and place the unzipped folder in your OpenRefine extensions folder. 
[Read more about installing extensions in OpenRefine's user manual](https://docs.openrefine.org/manual/installing#installing-extensions).\n\n\u003cimg width=\"600\" src=\"https://upload.wikimedia.org/wikipedia/commons/2/26/OpenRefine_-_Commons_Extension_-_location_to_install.png\"\u003e\n\nWhen this extension is installed correctly, you will now see the additional option 'Wikimedia Commons' when starting a new project in OpenRefine. \n\n### Start an OpenRefine project from one or more Wikimedia Commons categories\n\nAfter installing this extension, click the 'Wikimedia Commons' option to start a new project in OpenRefine. You will be prompted to add one or more [Wikimedia Commons categories](https://commons.wikimedia.org/wiki/Commons:Categories). \n\n\u003cimg src=\"https://upload.wikimedia.org/wikipedia/commons/5/53/OpenRefine_-_Commons_Extension_-_start_project_from_categories.png\"\u003e\n\nThere's no need to type the Category: prefix. \n\nYou can specify category depth by typing or selecting a number in the input field after each category. Depth `0` means only files from the current category level; depth `1` will retrieve files from one sub-category level down, etc.\n\nNext, in the project preview screen (`Configure parsing options`), you can choose to also include a column with each file's M-id (unique [MediaInfo identifier](https://www.mediawiki.org/wiki/Extension:WikibaseMediaInfo#MediaInfo_Entity)) and/or Commons categories.\n\nFile names will already be reconciled when your project starts.\n\nWhen you load larger categories (thousands of files) in a new project, OpenRefine will start slowly and will give you a memory warning. [This is a known issue](https://github.com/OpenRefine/CommonsExtension/issues/72). Wait for a bit; the project will eventually start. 
The Commons Extension has been tested with a project of more than 450,000 files.\n\n### GREL commands to extract data from Wikitext\n\nThe Wikimedia Commons Extension also enables two dedicated GREL commands, which help to extract specific information from the Wikitext of Wikimedia Commons files. *(GREL, General Refine Expression Language, is a dedicated scripting language used in OpenRefine for many flexible data operations. For a general reference on using GREL in OpenRefine, see https://docs.openrefine.org/manual/grelfunctions.)*\n \nFirstly, retrieve the Wikitext from a list of Commons files in your project. In the column menu of the reconciled file names' column, select `Edit column` \u003e `Add column from reconciled values...` and select `Wikitext` in the resulting dialog window.\n\nFrom this new column with Wikitext, you can now extract values and categories as described below. Start by selecting `Edit column` \u003e `Add column based on this column...` in the column menu. In the next dialog window, you can use various specific GREL commands:\n\n#### Extract values from template parameters: `extractFromTemplate`\n\n\u003cimg width=\"600\" src=\"https://upload.wikimedia.org/wikipedia/commons/b/be/OpenRefine_-_Commons_Extension_-_GREL_extractFromTemplate.png\"\u003e\n\nUse the following syntax:\n\n```\nextractFromTemplate(value, \"BHL\", \"source\")[0]\n```\n\nwhere you replace `BHL` with the name of the template (without curly brackets) and `source` with the parameter from which you want to extract the value. This GREL syntax will return the first (and usually the only) value of said parameter, e.g. 
`https://www.flickr.com/photos/biodivlibrary/10329116385`.\n\n#### Extract Wikimedia Commons categories: `value.extractCategories`\n\n\u003cimg width=\"600\" src=\"https://upload.wikimedia.org/wikipedia/commons/0/0d/OpenRefine_-_Commons_Extension_-_GREL_value.extractCategories.png\"\u003e\n\nUse the following syntax:\n\n```\nvalue.extractCategories().join('#')\n```\n\nThis GREL syntax will return all categories mentioned in the Wikitext, separated by the `#` character, which you can then use to split the resulting cell further as needed.\n\n## Development\n\n### Building from source\n\nRun:\n\n```\nmvn package\n```\n\nThis creates a zip file in the `target` folder, which can then be [installed in OpenRefine](https://docs.openrefine.org/manual/installing#installing-extensions).\n\n### Developing it\n\nTo avoid having to unzip the extension in the corresponding directory every time you want to test it, you can also use another setup: simply create a symbolic link from your extensions folder in OpenRefine to the local copy of this repository. 
With this setup, you do not need to run `mvn package` when making changes to the extension, but you will still need to compile it with `mvn compile` if you are making changes to Java files, and restart OpenRefine if you make changes to any files.\n\n### Releasing it\n\n- Make sure you are on the `master` branch and it is up to date (`git pull`)\n- Open `pom.xml` and set the version to the desired version number, such as `\u003cversion\u003e0.1.0\u003c/version\u003e`\n- Commit and push those changes to master\n- Add a corresponding git tag, with `git tag -a v0.1.0 -m \"Version 0.1.0\"` (when working from GitHub Desktop, you can follow [this process](https://docs.github.com/en/desktop/contributing-and-collaborating-using-github-desktop/managing-commits/managing-tags) and manually add the `v0.1.0` tag with the description `Version 0.1.0`)\n- Push the tag to GitHub: `git push --tags` (in GitHub Desktop, just push again)\n- Create a new release on GitHub at https://github.com/OpenRefine/CommonsExtension/releases/new, providing a release title (such as \"Commons extension 0.1.0\") and a description of the features in this release.\n- Open `pom.xml` and set the version to the expected next version number, followed by `-SNAPSHOT`. For instance, if you just released 0.1.0, you could set `\u003cversion\u003e0.1.1-SNAPSHOT\u003c/version\u003e`\n- Commit and push those changes.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fopenrefine%2Fcommonsextension","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fopenrefine%2Fcommonsextension","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fopenrefine%2Fcommonsextension/lists"}