https://github.com/jerboaburrow/getimagesfromwikimediawithattributions
Extract images from a query to Wikimedia, with attributions
- Host: GitHub
- URL: https://github.com/jerboaburrow/getimagesfromwikimediawithattributions
- Owner: JerboaBurrow
- License: gpl-3.0
- Created: 2024-09-14T08:53:13.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2024-09-14T10:12:20.000Z (almost 2 years ago)
- Last Synced: 2025-02-04T15:44:36.676Z (over 1 year ago)
- Topics: downloader, images, web, web-scraping, wikimedia, wikimedia-commons, wikipedia
- Language: Python
- Homepage:
- Size: 22.5 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
### Usage
To get 3 images of cheese,
```bash
python3 script/get_images_from_wikimedia.py cheese -take 3
```
will create a directory ```images/cheese``` containing the image files ```1.jpg```, ```2.jpg```, and ```3.jpg```,
and the ```attributions.json```,
```json
{
    "1": {
        "Attribution": "",
        "LicenseShortName": "Public domain",
        "LicenseUrl": "",
        "Artist": "Clara Peeters\n"
    },
    "2": {
        "Attribution": "",
        "LicenseShortName": "Public domain",
        "LicenseUrl": "",
        "Artist": "Original photo by John Sullivan. Edit own work."
    },
    "3": {
        "Attribution": "This file is not in the public domain. Therefore you are requested to use the following next to the image if you reuse this file: \u00a9 Yann Forget\u00a0/\u00a0Wikimedia Commons",
        "LicenseShortName": "CC BY-SA 4.0",
        "LicenseUrl": "https://creativecommons.org/licenses/by-sa/4.0",
        "Artist": "Yann Forget"
    }
}
```
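A minimal sketch of consuming this file, assuming the layout shown above: a JSON object keyed by image number, where each entry has ```Attribution```, ```LicenseShortName```, ```LicenseUrl```, and ```Artist```. The helpers ```credit_line``` and ```load_credits``` are hypothetical and not part of the repository. The sketch returns the explicit ```Attribution``` text verbatim when one is present, and otherwise falls back to the artist and license name.

```python
import json


def credit_line(entry: dict) -> str:
    """Build a one-line credit from a single attributions.json entry."""
    # Prefer the attribution text requested by the rights holder, if any.
    attribution = entry.get("Attribution", "").strip()
    if attribution:
        return attribution

    # Otherwise fall back to "<artist> (<license>)", tolerating empty fields.
    artist = entry.get("Artist", "").strip() or "Unknown artist"
    license_name = entry.get("LicenseShortName", "").strip() or "unknown license"
    license_url = entry.get("LicenseUrl", "").strip()

    credit = f"{artist} ({license_name})"
    if license_url:
        credit += f" <{license_url}>"
    return credit


def load_credits(path: str) -> dict:
    """Map each image key in attributions.json to its credit line."""
    with open(path, encoding="utf-8") as handle:
        attributions = json.load(handle)
    return {name: credit_line(entry) for name, entry in attributions.items()}
```

For the example above, ```load_credits("images/cheese/attributions.json")``` would return one credit string per image, keyed by ```"1"```, ```"2"```, and ```"3"```.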
### Rights
- (Output) Downloaded files are subject to the licenses listed in the ```{path}/{query}/attributions.json``` file.
- The source code is licensed under GPLv3.
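
Because each downloaded file carries its own license, it can help to summarise the licenses before reusing anything. The following is a minimal sketch, assuming the ```attributions.json``` layout shown under Usage. ```files_by_license``` is a hypothetical helper, not part of the repository, and the sample data is trimmed from the example above.

```python
import json
from collections import defaultdict


def files_by_license(attributions: dict) -> dict:
    """Group image keys by their LicenseShortName value."""
    grouped = defaultdict(list)
    for name, entry in attributions.items():
        # Entries without a license name are grouped under "unknown".
        license_name = entry.get("LicenseShortName", "").strip() or "unknown"
        grouped[license_name].append(name)
    return dict(grouped)


# Sample data trimmed from the attributions.json example (assumed layout).
sample = json.loads("""
{
    "1": {"LicenseShortName": "Public domain"},
    "2": {"LicenseShortName": "Public domain"},
    "3": {"LicenseShortName": "CC BY-SA 4.0"}
}
""")
summary = files_by_license(sample)
```

Here ```summary``` maps each license name to the image keys released under it, which makes it easier to spot files that require attribution.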