{"id":23201214,"url":"https://github.com/codedotjs/moma","last_synced_at":"2025-06-16T02:03:18.113Z","repository":{"id":179563466,"uuid":"662744704","full_name":"CodeDotJS/MoMA","owner":"CodeDotJS","description":"Museum of Modern Art Dataset  ","archived":false,"fork":false,"pushed_at":"2023-07-26T17:30:19.000Z","size":73986,"stargazers_count":4,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-04-05T09:30:08.950Z","etag":null,"topics":["api","art","artists","artwork-collection","collection","csv","dataset","flask","json","moma","moma-museum","python","scraper"],"latest_commit_sha":null,"homepage":"https://moma.org/collection","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/CodeDotJS.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2023-07-05T19:45:48.000Z","updated_at":"2025-02-08T14:10:48.000Z","dependencies_parsed_at":"2023-11-23T07:45:47.215Z","dependency_job_id":null,"html_url":"https://github.com/CodeDotJS/MoMA","commit_stats":{"total_commits":95,"total_committers":1,"mean_commits":95.0,"dds":0.0,"last_synced_commit":"123f3f81a270dabae130043ecf611dbab299388a"},"previous_names":["codedotjs/moma"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/CodeDotJS/MoMA","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CodeDotJS%2FMoMA","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CodeDotJS%2FMoMA/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CodeDotJS%2FMoMA/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CodeDotJS%2FMoMA/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/CodeDotJS","download_url":"https://codeload.github.com/CodeDotJS/MoMA/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CodeDotJS%2FMoMA/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":260083839,"owners_count":22956407,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["api","art","artists","artwork-collection","collection","csv","dataset","flask","json","moma","moma-museum","python","scraper"],"created_at":"2024-12-18T15:14:28.239Z","updated_at":"2025-06-16T02:03:18.072Z","avatar_url":"https://github.com/CodeDotJS.png","language":"Python","readme":"\u003ch1 align=\"center\"\u003e\u003cimg src=\"media/moma.png\"\u003e\u003c/h1\u003e\n\n## Description\n\nMoMA Artworks Scraper and Dataset Builder is designed to collect and organize data about artworks and artists from the Museum of Modern Art (MoMA). This project gathers essential details about Artworks and Artists. It then saves the collected data in both JSON and CSV formats, making it easy to use for further analysis and research.\n\n__Artworks Data__\n\nThe [artworks.json](artworks.json) file contains a vast collection of __100,110 artworks__ from the Museum of Modern Art (MoMA). Each artwork entry is represented as a JSON object with comprehensive details, including the artist's name, title, year, medium, dimensions, publisher, edition, credit, object number, copyright, portfolio, department, and more.\n\n__Example__\n\n```JSON\n{\n    \"Artist\": \"Frida Kahlo\",\n    \"Title\": \"My Grandparents, My Parents, and I (Family Tree)\",\n    \"Year\": \"1936\",\n    \"ObjectID\": 78784,\n    \"Work\": \"https://www.moma.org/collection/works/78784\",\n    \"Thumbnail\": \"https://www.moma.org/media/W1siZiIsIjQ3N...M2UiXV0.jpg?sha=c411357c15216300\",\n    \"Details\": {\n        \"Medium\": \"Oil and tempera on zinc\",\n        \"Dimensions\": \"12 1/8 x 13 5/8\\\" (30.7 x 34.5 cm)\",\n        \"Credit\": \"Gift of Allan Roos, M. D., and B. Mathieu Roos\",\n        \"Object number\": \"102.1976\",\n        \"Copyright\": \"\\u00a9 2023 Banco de M\\u00e9xico Diego Rivera Frida Kahlo Museums Trust, Mexico, D.F. / Artists Rights Society (ARS), New York\",\n        \"Department\": \"Painting and Sculpture\"\n    },\n    \"Profile\": \"https://www.moma.org/artists/2963\"\n}\n ```\n\n__Artists Data__\n\nThe [artists.json](artists.json) file is a dataset containing information about various artists represented in the Museum of Modern Art. With __27,385 entries__, each artist's data is represented as a JSON object, providing details such as the artist's page URL, ID, name, and bio. Additionally, the dataset includes extended details (if available on MoMA's site) about each artist, including an introduction, Wikidata ID, nationality, gender, roles, alternative names, Ulan ID, and more.\n\n__Example__\n\n```JSON\n{\n    \"page\": \"https://www.moma.org/artists/2963\",\n    \"ID\": 2963,\n    \"name\": \"Frida Kahlo\",\n    \"bio\": \"Mexican, 1907–1954\",\n    \"details\": {\n        \"Introduction\": \"Mexican fantasy painter known as much for her turbulent personal life as her fanciful self-portraits. ... Her work received notoriety in the 1970's, becoming popular with feminist art historians and Latin Americans living in the United States.\",\n        \"Wikidata\": \"Q5588\",\n        \"Nationality\": \"Mexican\",\n        \"Gender\": \"Female\",\n        \"Roles\": \"Artist, Painter\",\n        \"Names\": \"Frida Kahlo, Frida Kahlo de Rivera, Frida Rivera, De Rivera Kahlo.. Frida Rivera-Kahlo\",\n        \"Ulan\": \"500030701\"\n    }\n}\n\n```\n## Motivation\n\nThe primary motivation behind this project is to build an extensive dataset of MoMA's artists and art collection and create an accessible API. As the original API is restricted to MoMA staff and partners, this project seeks to provide a publicly available alternative.\n\n## Features\n\n- Uses concurrent/asynchronous programming for faster data collection and processing.\n- Gathers Artwork information from  more than __100k pages__ in MoMA's collection.\n- Gathers Artist information from  more than __27k pages__ in MoMA's collection.\n- Saves and extends the artwork/artists data in JSON format for each page.\n- Sorts and converts the collected data to CSV for easier analysis.\n\n## Build\n\nBuilding datasets for both Artists and Artworks requires different scripts. The complete steps are mentioned __[here](docs/workground.md)__.\n\n## Contributing\n\nIf you're interested in contributing, you can start by forking the repository. After that, create a separate branch to work on your changes, and once you're done, submit a pull request with your modifications.\n\n## Future Work\n\n__Build__\n\n- Flask-based API for MoMA Artworks/Artists dataset.\n- Automation tools for dataset's monthly checkups and updates.\n\n__Add__\n\n- In MoMA's original dataset, images are not included, but I'd like to further extend and add images as well. The task is fairly simple, so the changes will be pushed soon.\n\n__Improvements__\n\n- Fix redundancies in various scripts.\n- Strict structure for JSON datasets.\n\n## License\n\nMIT License \u0026copy; Rishi Giri\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcodedotjs%2Fmoma","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcodedotjs%2Fmoma","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcodedotjs%2Fmoma/lists"}