{"id":14037052,"url":"https://github.com/alltheplaces/alltheplaces","last_synced_at":"2025-07-27T04:34:10.233Z","repository":{"id":37270505,"uuid":"61166935","full_name":"alltheplaces/alltheplaces","owner":"alltheplaces","description":"A set of spiders and scrapers to extract location information from places that post their location on the internet.","archived":false,"fork":false,"pushed_at":"2024-10-29T14:24:30.000Z","size":29983,"stargazers_count":624,"open_issues_count":791,"forks_count":213,"subscribers_count":27,"default_branch":"master","last_synced_at":"2024-10-29T17:27:48.363Z","etag":null,"topics":["geojson","hacktoberfest","python","scrapers","scrapy","spider"],"latest_commit_sha":null,"homepage":"https://www.alltheplaces.xyz","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/alltheplaces.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2016-06-15T01:09:18.000Z","updated_at":"2024-10-29T14:24:34.000Z","dependencies_parsed_at":"2023-09-23T19:49:14.213Z","dependency_job_id":"3596ea95-0d68-43d0-aa43-70ef8a45edfc","html_url":"https://github.com/alltheplaces/alltheplaces","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alltheplaces%2Falltheplaces","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alltheplaces%2Falltheplaces/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alltheplaces%2Falltheplaces/releases","manif
ests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alltheplaces%2Falltheplaces/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/alltheplaces","download_url":"https://codeload.github.com/alltheplaces/alltheplaces/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":227762440,"owners_count":17816026,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["geojson","hacktoberfest","python","scrapers","scrapy","spider"],"created_at":"2024-08-12T03:02:26.643Z","updated_at":"2025-07-27T04:34:10.227Z","avatar_url":"https://github.com/alltheplaces.png","language":"Python","readme":"# All the Places\n\nA project to generate [point of interest (POI)](https://en.wikipedia.org/wiki/Point_of_interest) data sourced [from websites](docs/WHY_SPIDER.md) with 'store location' pages. The project uses [`scrapy`](https://scrapy.org/), a popular Python-based web scraping framework, to execute individual site [spiders](https://doc.scrapy.org/en/latest/topics/spiders.html) that retrieve POI data, publishing the results in a [standard format](DATA_FORMAT.md). 
There are various `scrapy` tutorials on the Internet; [this series on YouTube](https://www.youtube.com/watch?v=s4jtkzHhLzY) is a reasonable place to start.\n\n## Getting started\n\n### Development setup\n\nWindows users may need to follow some extra steps; see the [scrapy docs](https://docs.scrapy.org/en/latest/intro/install.html#windows) for up-to-date details.\n\n#### Ubuntu\n\nThese instructions were tested with Ubuntu 24.04 LTS on 2024-02-21.\n\n1. Install `uv`:\n\n   ```\n   curl -LsSf https://astral.sh/uv/install.sh | sh\n   source $HOME/.local/bin/env\n   ```\n\n1. Clone a copy of the project from the [All the Places](https://github.com/alltheplaces/alltheplaces/) repo (or your own fork if you are considering contributing to the project):\n\n   ```\n   git clone git@github.com:alltheplaces/alltheplaces.git\n   ```\n\n1. Use `uv` to install the project dependencies:\n\n   ```\n   cd alltheplaces\n   uv sync\n   ```\n\n1. Test for successful project installation:\n\n   ```\n   uv run scrapy\n   ```\n\n   If the above runs without complaint, then you have a functional installation and are ready to run and write spiders.\n\n#### macOS\n\nThese instructions were tested with macOS 15.3.2 on 2025-04-01.\n\n1. Install `uv`:\n\n   ```\n   brew install uv\n   ```\n\n1. Clone a copy of the project from the [All the Places](https://github.com/alltheplaces/alltheplaces/) repo (or your own fork if you are considering contributing to the project):\n\n   ```\n   git clone git@github.com:alltheplaces/alltheplaces.git\n   ```\n\n1. Use `uv` to install the project dependencies:\n\n   ```\n   cd alltheplaces\n   uv sync\n   ```\n\n1. Test for successful project installation:\n\n   ```\n   uv run scrapy\n   ```\n\n   If the above runs without complaint, then you have a functional installation and are ready to run and write spiders.\n\n#### Codespaces\n\nYou can use GitHub Codespaces to run the project. 
This is a cloud-based development environment that is created from the project's repository and includes a pre-configured environment with all the tools you need to develop the project. To use Codespaces, click the button below:\n\n   [![Open in GitHub Codespaces](https://github.com/codespaces/badge.svg)](https://codespaces.new/alltheplaces/alltheplaces)\n\n#### Docker\n\nYou can use Docker to run the project. This is a container-based development environment that is created from the project's repository and includes a pre-configured environment with all the tools you need to develop the project.\n\n1. Clone a copy of the project from the [All the Places](https://github.com/alltheplaces/alltheplaces/) repo (or your own fork if you are considering contributing to the project):\n\n   ```\n   git clone git@github.com:alltheplaces/alltheplaces.git\n   ```\n\n1. Build the Docker image:\n\n   ```\n   cd alltheplaces\n   docker build -t alltheplaces .\n   ```\n\n1. Run the Docker container:\n\n   ```\n   docker run --rm -it alltheplaces\n   ```\n\n### Contributing code\n\nMany of the sites provide their data in a [standard format](docs/STRUCTURED_DATA.md). Others export their data [via simple APIs](docs/API_SPIDER.md). We have a number of guides to help you develop spiders:\n\n* [What should I call my spider?](docs/SPIDER_NAMING.md)\n* [Using Wikidata and the Name Suggestion Index](docs/WIKIDATA.md)\n* [Sitemaps make finding POI pages easier](docs/SITEMAP.md)\n* [Data from many POI pages can be extracted without writing code](docs/STRUCTURED_DATA.md)\n* [What is expected in a pull request?](docs/PULL_REQUEST.md)\n* [What we do behind the scenes](docs/PIPELINES.md)\n\n### The weekly run\n\nThe output from running the project is [published on a regular cadence](docs/WEEKLY_RUN.md) to our website: [alltheplaces.xyz](https://www.alltheplaces.xyz/). 
You should not run all the spiders to pick up the output: the less the project \"bothers\" a website the more we will be tolerated.\n\n## Contact us\n\nCommunication is primarily through tickets on the project GitHub [issue tracker](https://github.com/alltheplaces/alltheplaces/issues). Many contributors are also present on [OSM US Slack](https://slack.openstreetmap.us/), which has an [#alltheplaces](https://osmus.slack.com/archives/C07EY4Y3M6F) channel.\n\n## License\n\nThe data generated by our spiders is provided [on our website](https://alltheplaces.xyz/) and released under [Creative Commons’ CC-0 waiver](https://creativecommons.org/publicdomain/zero/1.0/).\n\nThe [spider software that produces this data](https://github.com/alltheplaces/alltheplaces) (this repository) is licensed under the [MIT license](https://github.com/alltheplaces/alltheplaces/blob/master/LICENSE).\n","funding_links":[],"categories":["Python"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Falltheplaces%2Falltheplaces","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Falltheplaces%2Falltheplaces","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Falltheplaces%2Falltheplaces/lists"}