{"id":21685878,"url":"https://github.com/sodascience/artscraper","last_synced_at":"2025-04-12T08:13:26.175Z","repository":{"id":82756083,"uuid":"472750211","full_name":"sodascience/artscraper","owner":"sodascience","description":"Python package for downloading art and metadata of WikiArt and Google Arts \u0026 Culture","archived":false,"fork":false,"pushed_at":"2024-04-15T09:41:33.000Z","size":1231,"stargazers_count":9,"open_issues_count":5,"forks_count":1,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-04-12T08:13:20.571Z","etag":null,"topics":["art","download","google-arts-and-culture","odissei","wikiart"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sodascience.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null}},"created_at":"2022-03-22T12:07:08.000Z","updated_at":"2024-05-31T21:57:34.000Z","dependencies_parsed_at":null,"dependency_job_id":"ec74866d-0ccd-4343-b5af-37c2cd67bf92","html_url":"https://github.com/sodascience/artscraper","commit_stats":{"total_commits":41,"total_committers":3,"mean_commits":"13.666666666666666","dds":"0.41463414634146345","last_synced_commit":"63a9d1871b33de5e828df677343dc615658e2adc"},"previous_names":[],"tags_count":5,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sodascience%2Fartscraper","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sodascience%2Fartscraper/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sodascience%2Fartscra
per/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sodascience%2Fartscraper/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/sodascience","download_url":"https://codeload.github.com/sodascience/artscraper/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248537144,"owners_count":21120711,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["art","download","google-arts-and-culture","odissei","wikiart"],"created_at":"2024-11-25T16:23:30.567Z","updated_at":"2025-04-12T08:13:26.154Z","avatar_url":"https://github.com/sodascience.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\n# ArtScraper\n\nArtScraper is a tool to download images and metadata for artworks available on\nWikiArt (www.wikiart.org/) and Google Arts \u0026 Culture\n(artsandculture.google.com/).\n\nFunctionality:\n- `WikiArt` and `Google Arts \u0026 Culture`: Download images and metadata from a list of artworks' urls\n- `Google Arts \u0026 Culture`: Download all images and metadata in the site, or from specific artists\n\n## 1. Installation and setup\n\nThe ArtScraper package can be installed with pip, which automatically installs\nthe python dependencies:\n\n```\npip install artscraper\n```\n\n\n## 2. Downloading art from WikiArt\n\nTo download data from WikiArt it is necessary to obtain\n[API](https://www.wikiart.org/en/App/GetApi) keys. After obtaining them, you\ncan put them in a file called `.wiki_api` in the working directory for your\nscript. 
The format is: the API access key, a new line, the API secret key, and\na new line, e.g.:\n\n```\n7e57a60844\n3defc62d8f\n```\n\nAlternatively, when ArtScraper doesn't detect the file `.wiki_api`, it will\nask for the API keys.\n\nAn example of fetching data is shown below and in the [notebook](examples/example_artscraper.ipynb). \n\n```python\n\nfrom artscraper import WikiArtScraper\n\nart_url = \"https://www.wikiart.org/en/edvard-munch/anxiety-1894\"\n\nwith WikiArtScraper(output_dir=\"data\") as scraper:\n    scraper.load_link(art_url)\n    scraper.save_metadata() \n    scraper.save_image()\n\n```\n\nThis will store both the image itself and the metadata in separate folders. If\nyou use ArtScraper in this way, it will skip images and metadata that are already\npresent. Remove the output directory to force a new download. \n\nResults:\n\n[\u003cimg src=\"https://uploads5.wikiart.org/images/edvard-munch/anxiety-1894.jpg\" weight=\"20\"\u003e](https://www.wikiart.org/en/edvard-munch/anxiety-1894)\n\n\n## 3. Downloading art from Google Arts \u0026 Culture\n\nTo download data from Google Arts \u0026 Culture it is necessary to install \n[Firefox](https://www.mozilla.org/en-US/firefox/new/).\n\nArtScraper will open a new Firefox window, navigate to the image, zoom in on it, and take a screenshot of it. This takes a few seconds. Do not minimize that browser window, and do not let the screensaver start.\n\n\n### 3.1 Downloading art from Google Arts \u0026 Culture using artworks' urls\n\nAn example of fetching data is shown below and in the [notebook](examples/example_artscraper.ipynb). 
\n\n```python\nfrom artscraper import GoogleArtScraper\n\nart_url = \"https://artsandculture.google.com/asset/anxiety-edvard-munch/JgE_nwHHS7wTPw\"\n\nwith GoogleArtScraper() as scraper:\n    scraper.load_link(art_url)\n    metadata = scraper.get_metadata()  # or scraper.save_metadata()\n    scraper.save_image(\"data/anxiety_munch.jpg\")\n    print(metadata) \n\n```\n\n\n### 3.2 Downloading all art from Google Arts \u0026 Culture \n\nSee [example notebook](examples/example_collect_all_artworks.ipynb).\n\nThe final structure of the results will be\n- data\n  - artist_links.txt (All artists, with one url per line) \n  - Artist_1\n    - description.txt (Description of artist, from Wikidata)\n    - metadata.json (Metadata of artist, from Wikidata)\n    - works.txt (All artworks, with one url per line)\n    - works \n      - work1\n        - artwork.png (Artwork image)\n        - metadata.json (Metadata of artwork, from Google Arts \u0026 Culture)\n      - work2\n        - ...\n  - Artist_2\n    - ... 
\n\n\nA full example (but please check the [example notebook](examples/example_collect_all_artworks.ipynb) to add retries):\n\n```python\nfrom artscraper import GoogleArtScraper\nfrom artscraper.find_artists import get_artist_links\n# Assumed import path for FindArtworks; check the example notebook if it differs\nfrom artscraper.find_artworks import FindArtworks\n\noutput_dir = \"data\"\nmin_wait_time = 1  # Same value is passed to every scraper below\n\n# Get links for all artists, as a list\nartist_urls = get_artist_links(min_wait_time=min_wait_time, output_file=f'{output_dir}/artist_links.txt')\n\n# Find artworks for each artist\nfor artist_url in artist_urls:\n    with FindArtworks(artist_link=artist_url, output_dir=output_dir, \n                      min_wait_time=min_wait_time) as scraper:\n        # Save list of artworks, the description, and metadata for an artist\n        scraper.save_artist_information()\n        # Find artist directory\n        artist_dir = output_dir + '/' + scraper.get_artist_name() \n\n    # Scrape artworks\n    with GoogleArtScraper(artist_dir + '/' + 'works', min_wait=min_wait_time) as subscraper:\n        # Get list of links to this artist's works \n        with open(artist_dir+'/'+'works.txt', 'r') as file:\n            artwork_links = [line.rstrip() for line in file]  \n        # Download every artwork link (slow)\n        for url in artwork_links:\n            print(f'artwork URL: {url}')\n            subscraper.save_artwork_information(url)\n```\n\n\n## Contributing\n\nContributions are what make the open source community an amazing place\nto learn, inspire, and create. Any contributions you make are **greatly\nappreciated**.\n\nPlease refer to the\n[CONTRIBUTING](https://github.com/sodascience/artscraper/blob/main/CONTRIBUTING.md)\nfile for more information on issues and pull requests.\n\n## License and citation\n\nThe package `artscraper` is published under an MIT license. When using `artscraper` for academic work, please cite:\n\n    Schram, Raoul, Mitra, Modhurita, Garcia-Bernardo, Javier, van Kesteren, Erik-Jan, de Bruin, Jonathan, \u0026 Stamkou, Eftychia. (2022). \n    ArtScraper: A Python library to scrape online artworks (0.1.1). Zenodo. 
https://doi.org/10.5281/zenodo.7129975\n\n\n## Contact\n\nThis project is developed and maintained by the [ODISSEI Social Data\nScience (SoDa)](https://odissei-data.nl/nl/soda/) team.\n\n\u003cimg src=\"soda_logo.png\" alt=\"SoDa logo\" width=\"250px\"/\u003e\n\nDo you have questions, suggestions, or remarks? File an issue in the issue\ntracker or feel free to contact the team via\nhttps://odissei-data.nl/en/using-soda/.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsodascience%2Fartscraper","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsodascience%2Fartscraper","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsodascience%2Fartscraper/lists"}