https://github.com/maxhalford/bike-sharing-history

🚲 Git scraping for bike sharing APIs

# bike-sharing-history

***📝 [See blog post](https://maxhalford.github.io/blog/bike-sharing-forecasting-training-set/)***

This repo tracks the status of bike stations from various bike-sharing providers. The data is fetched every 15 minutes. The results are stored and versioned as [GeoJSON](https://www.wikiwand.com/en/GeoJSON) files. This is done using the [git scraping](https://simonwillison.net/2020/Oct/9/git-scraping/) technique.
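As a rough illustration of the git scraping idea (not this repo's actual script), the core step is to write each fetched payload in a stable serialization, so that git only sees a change when the data really changed. The function name `write_snapshot`, the target path, and the sample payload are all hypothetical:

```python
import json
import tempfile
from pathlib import Path


def write_snapshot(path: Path, payload: dict) -> bool:
    """Write payload as stable JSON; return True if the file content changed."""
    # Sorted keys and fixed indentation keep diffs small and deterministic.
    text = json.dumps(payload, sort_keys=True, indent=2, ensure_ascii=False) + "\n"
    if path.exists() and path.read_text(encoding="utf-8") == text:
        return False  # nothing new, so there is nothing to commit
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(text, encoding="utf-8")
    return True


# Hypothetical payload; a real run would fetch it from a provider's API.
sample = {"type": "FeatureCollection", "features": []}
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "stations" / "toulouse" / "jcdecaux.geojson"
    first = write_snapshot(target, sample)
    # Same data with keys in a different order: no change is reported.
    second = write_snapshot(target, dict(reversed(list(sample.items()))))
```

A scheduled job would then run `git add` and `git commit` only when the function returns `True`.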

The weather forecast for the next 24 hours is also collected every 15 minutes, for each city.

Everyone is welcome to add new cities. To add one, put the necessary details in [`scripts/systems.py`](scripts/systems.py) and open a pull request.

## Live data

| # | Country | City | Provider | Stations | Weather |
|---|---------|------|----------|----------|---------|
| 001 | 🇦🇪 | Dubai | Careem BIKE | [`dubai/careem-bike.geojson`](data/stations/dubai/careem-bike.geojson) | [`dubai.json`](data/weather/dubai.json) |
| 002 | 🇦🇷 | Buenos Aires | Ecobici | [`buenos-aires/ecobici.geojson`](data/stations/buenos-aires/ecobici.geojson) | [`buenos-aires.json`](data/weather/buenos-aires.json) |
| 003 | 🇦🇹 | Vienna | Nextbike | [`vienna/nextbike.geojson`](data/stations/vienna/nextbike.geojson) | [`vienna.json`](data/weather/vienna.json) |
| 004 | 🇧🇪 | Antwerp | Blue-bike | [`antwerp/blue-bike.geojson`](data/stations/antwerp/blue-bike.geojson) | [`antwerp.json`](data/weather/antwerp.json) |
| 005 | 🇧🇪 | Antwerp | Velo Antwerpen | [`antwerp/velo-antwerpen.geojson`](data/stations/antwerp/velo-antwerpen.geojson) | [`antwerp.json`](data/weather/antwerp.json) |
| 006 | 🇧🇪 | Brussels | JCDecaux | [`brussels/jcdecaux.geojson`](data/stations/brussels/jcdecaux.geojson) | [`brussels.json`](data/weather/brussels.json) |
| 007 | 🇧🇪 | Namur | JCDecaux | [`namur/jcdecaux.geojson`](data/stations/namur/jcdecaux.geojson) | [`namur.json`](data/weather/namur.json) |
| 008 | 🇧🇷 | Porto Alegre | Bike Itaú | [`porto-alegre/bike-itau.geojson`](data/stations/porto-alegre/bike-itau.geojson) | [`porto-alegre.json`](data/weather/porto-alegre.json) |
| 009 | 🇧🇷 | Rio de Janeiro | Bike Itaú | [`rio-de-janeiro/bike-itau.geojson`](data/stations/rio-de-janeiro/bike-itau.geojson) | [`rio-de-janeiro.json`](data/weather/rio-de-janeiro.json) |
| 010 | 🇧🇷 | Salvador | Bike Itaú | [`salvador/bike-itau.geojson`](data/stations/salvador/bike-itau.geojson) | [`salvador.json`](data/weather/salvador.json) |
| 011 | 🇧🇷 | Sampa | Bike Itaú | [`sampa/bike-itau.geojson`](data/stations/sampa/bike-itau.geojson) | [`sampa.json`](data/weather/sampa.json) |
| 012 | 🇨🇦 | Montréal | BIXI | [`montreal/bixi.geojson`](data/stations/montreal/bixi.geojson) | [`montreal.json`](data/weather/montreal.json) |
| 013 | 🇨🇦 | Québec City | àVélo | [`quebec-city/avelo.geojson`](data/stations/quebec-city/avelo.geojson) | [`quebec-city.json`](data/weather/quebec-city.json) |
| 014 | 🇨🇦 | Toronto | Bike Share Toronto | [`toronto/bike-share-toronto.geojson`](data/stations/toronto/bike-share-toronto.geojson) | [`toronto.json`](data/weather/toronto.json) |
| 015 | 🇨🇦 | Vancouver | Mobi Bike Share | [`vancouver/mobi-bike-share.geojson`](data/stations/vancouver/mobi-bike-share.geojson) | [`vancouver.json`](data/weather/vancouver.json) |
| 016 | 🇨🇴 | Bogotá | Tembici | [`bogota/tembici.geojson`](data/stations/bogota/tembici.geojson) | [`bogota.json`](data/weather/bogota.json) |
| 017 | 🇨🇿 | Brno | Nextbike | [`brno/nextbike.geojson`](data/stations/brno/nextbike.geojson) | [`brno.json`](data/weather/brno.json) |
| 018 | 🇨🇿 | Olomouc | Nextbike | [`olomouc/nextbike.geojson`](data/stations/olomouc/nextbike.geojson) | [`olomouc.json`](data/weather/olomouc.json) |
| 019 | 🇨🇿 | Ostrava | Nextbike | [`ostrava/nextbike.geojson`](data/stations/ostrava/nextbike.geojson) | [`ostrava.json`](data/weather/ostrava.json) |
| 020 | 🇨🇿 | Prague | Nextbike | [`prague/nextbike.geojson`](data/stations/prague/nextbike.geojson) | [`prague.json`](data/weather/prague.json) |
| 021 | 🇩🇪 | Berlin | Nextbike | [`berlin/nextbike.geojson`](data/stations/berlin/nextbike.geojson) | [`berlin.json`](data/weather/berlin.json) |
| 022 | 🇩🇪 | Düsseldorf | Nextbike | [`dusseldorf/nextbike.geojson`](data/stations/dusseldorf/nextbike.geojson) | [`dusseldorf.json`](data/weather/dusseldorf.json) |
| 023 | 🇩🇪 | Frankfurt | Nextbike | [`frankfurt/nextbike.geojson`](data/stations/frankfurt/nextbike.geojson) | [`frankfurt.json`](data/weather/frankfurt.json) |
| 024 | 🇩🇪 | Freiburg | Frelo Freiburg | [`freiburg/frelo-freiburg.geojson`](data/stations/freiburg/frelo-freiburg.geojson) | [`freiburg.json`](data/weather/freiburg.json) |
| 025 | 🇩🇪 | Leipzig | Nextbike | [`leipzig/nextbike.geojson`](data/stations/leipzig/nextbike.geojson) | [`leipzig.json`](data/weather/leipzig.json) |
| 026 | 🇪🇸 | Barcelona | Bicing | [`barcelona/bicing.geojson`](data/stations/barcelona/bicing.geojson) | [`barcelona.json`](data/weather/barcelona.json) |
| 027 | 🇪🇸 | Madrid | bicimad | [`madrid/bicimad.geojson`](data/stations/madrid/bicimad.geojson) | [`madrid.json`](data/weather/madrid.json) |
| 028 | 🇪🇸 | Santander | JCDecaux | [`santander/jcdecaux.geojson`](data/stations/santander/jcdecaux.geojson) | [`santander.json`](data/weather/santander.json) |
| 029 | 🇪🇸 | Sevilla | JCDecaux | [`sevilla/jcdecaux.geojson`](data/stations/sevilla/jcdecaux.geojson) | [`sevilla.json`](data/weather/sevilla.json) |
| 030 | 🇪🇸 | Valencia | JCDecaux | [`valencia/jcdecaux.geojson`](data/stations/valencia/jcdecaux.geojson) | [`valencia.json`](data/weather/valencia.json) |
| 031 | 🇫🇷 | Amiens | JCDecaux | [`amiens/jcdecaux.geojson`](data/stations/amiens/jcdecaux.geojson) | [`amiens.json`](data/weather/amiens.json) |
| 032 | 🇫🇷 | Besançon | JCDecaux | [`besancon/jcdecaux.geojson`](data/stations/besancon/jcdecaux.geojson) | [`besancon.json`](data/weather/besancon.json) |
| 033 | 🇫🇷 | Cergy-Pontoise | JCDecaux | [`cergy-pontoise/jcdecaux.geojson`](data/stations/cergy-pontoise/jcdecaux.geojson) | [`cergy-pontoise.json`](data/weather/cergy-pontoise.json) |
| 034 | 🇫🇷 | Clermont-Ferrand | C-Vélo | [`clermont-ferrand/c-velo.geojson`](data/stations/clermont-ferrand/c-velo.geojson) | [`clermont-ferrand.json`](data/weather/clermont-ferrand.json) |
| 035 | 🇫🇷 | Créteil | JCDecaux | [`creteil/jcdecaux.geojson`](data/stations/creteil/jcdecaux.geojson) | [`creteil.json`](data/weather/creteil.json) |
| 036 | 🇫🇷 | Lyon | JCDecaux | [`lyon/jcdecaux.geojson`](data/stations/lyon/jcdecaux.geojson) | [`lyon.json`](data/weather/lyon.json) |
| 037 | 🇫🇷 | Marseille | JCDecaux | [`marseille/jcdecaux.geojson`](data/stations/marseille/jcdecaux.geojson) | [`marseille.json`](data/weather/marseille.json) |
| 038 | 🇫🇷 | Mulhouse | JCDecaux | [`mulhouse/jcdecaux.geojson`](data/stations/mulhouse/jcdecaux.geojson) | [`mulhouse.json`](data/weather/mulhouse.json) |
| 039 | 🇫🇷 | Nancy | JCDecaux | [`nancy/jcdecaux.geojson`](data/stations/nancy/jcdecaux.geojson) | [`nancy.json`](data/weather/nancy.json) |
| 040 | 🇫🇷 | Nantes | JCDecaux | [`nantes/jcdecaux.geojson`](data/stations/nantes/jcdecaux.geojson) | [`nantes.json`](data/weather/nantes.json) |
| 041 | 🇫🇷 | Paris | Smovengo | [`paris/smovengo.geojson`](data/stations/paris/smovengo.geojson) | [`paris.json`](data/weather/paris.json) |
| 042 | 🇫🇷 | Toulouse | JCDecaux | [`toulouse/jcdecaux.geojson`](data/stations/toulouse/jcdecaux.geojson) | [`toulouse.json`](data/weather/toulouse.json) |
| 043 | 🇭🇺 | Budapest | MOL Bubi | [`budapest/mol-bubi.geojson`](data/stations/budapest/mol-bubi.geojson) | [`budapest.json`](data/weather/budapest.json) |
| 044 | 🇮🇪 | Dublin | JCDecaux | [`dublin/jcdecaux.geojson`](data/stations/dublin/jcdecaux.geojson) | [`dublin.json`](data/weather/dublin.json) |
| 045 | 🇮🇹 | Milan | Bikemi | [`milan/bikemi.geojson`](data/stations/milan/bikemi.geojson) | [`milan.json`](data/weather/milan.json) |
| 046 | 🇯🇵 | Tokyo | Docomo Bike Sharing | [`tokyo/docomo-bike-sharing.geojson`](data/stations/tokyo/docomo-bike-sharing.geojson) | [`tokyo.json`](data/weather/tokyo.json) |
| 047 | 🇯🇵 | Toyama | JCDecaux | [`toyama/jcdecaux.geojson`](data/stations/toyama/jcdecaux.geojson) | [`toyama.json`](data/weather/toyama.json) |
| 048 | 🇱🇹 | Vilnius | JCDecaux | [`vilnius/jcdecaux.geojson`](data/stations/vilnius/jcdecaux.geojson) | [`vilnius.json`](data/weather/vilnius.json) |
| 049 | 🇱🇺 | Luxembourg | JCDecaux | [`luxembourg/jcdecaux.geojson`](data/stations/luxembourg/jcdecaux.geojson) | [`luxembourg.json`](data/weather/luxembourg.json) |
| 050 | 🇲🇽 | Guadalajara | Mibici | [`guadalajara/mibici.geojson`](data/stations/guadalajara/mibici.geojson) | [`guadalajara.json`](data/weather/guadalajara.json) |
| 051 | 🇲🇽 | Mexico City | Ecobici | [`mexico-city/ecobici.geojson`](data/stations/mexico-city/ecobici.geojson) | [`mexico-city.json`](data/weather/mexico-city.json) |
| 052 | 🇳🇴 | Bergen | Bergen Bysykkel | [`bergen/bergen-bysykkel.geojson`](data/stations/bergen/bergen-bysykkel.geojson) | [`bergen.json`](data/weather/bergen.json) |
| 053 | 🇳🇴 | Lillestrøm | JCDecaux | [`lillestrom/jcdecaux.geojson`](data/stations/lillestrom/jcdecaux.geojson) | [`lillestrom.json`](data/weather/lillestrom.json) |
| 054 | 🇳🇴 | Oslo | Oslo Bysykkel | [`oslo/oslo-bysykkel.geojson`](data/stations/oslo/oslo-bysykkel.geojson) | [`oslo.json`](data/weather/oslo.json) |
| 055 | 🇳🇴 | Stavanger | Kolumbus Bysykkel | [`stavanger/kolumbus-bysykkel.geojson`](data/stations/stavanger/kolumbus-bysykkel.geojson) | [`stavanger.json`](data/weather/stavanger.json) |
| 056 | 🇸🇪 | Gothenburg | Styr & Ställ | [`gothenburg/styr--stall.geojson`](data/stations/gothenburg/styr--stall.geojson) | [`gothenburg.json`](data/weather/gothenburg.json) |
| 057 | 🇸🇪 | Lund | JCDecaux | [`lund/jcdecaux.geojson`](data/stations/lund/jcdecaux.geojson) | [`lund.json`](data/weather/lund.json) |
| 058 | 🇸🇮 | Ljubljana | JCDecaux | [`ljubljana/jcdecaux.geojson`](data/stations/ljubljana/jcdecaux.geojson) | [`ljubljana.json`](data/weather/ljubljana.json) |
| 059 | 🇸🇮 | Maribor | JCDecaux | [`maribor/jcdecaux.geojson`](data/stations/maribor/jcdecaux.geojson) | [`maribor.json`](data/weather/maribor.json) |
| 060 | 🇺🇸 | Boulder | BCycle | [`boulder/bcycle.geojson`](data/stations/boulder/bcycle.geojson) | [`boulder.json`](data/weather/boulder.json) |
| 061 | 🇺🇸 | Chattanooga | Bike Chattanooga | [`chattanooga/bike-chattanooga.geojson`](data/stations/chattanooga/bike-chattanooga.geojson) | [`chattanooga.json`](data/weather/chattanooga.json) |
| 062 | 🇺🇸 | Chicago | Divvy | [`chicago/divvy.geojson`](data/stations/chicago/divvy.geojson) | [`chicago.json`](data/weather/chicago.json) |
| 063 | 🇺🇸 | Honolulu | Biki | [`honolulu/biki.geojson`](data/stations/honolulu/biki.geojson) | [`honolulu.json`](data/weather/honolulu.json) |
| 064 | 🇺🇸 | Milwaukee | Bublr Bikes | [`milwaukee/bublr-bikes.geojson`](data/stations/milwaukee/bublr-bikes.geojson) | [`milwaukee.json`](data/weather/milwaukee.json) |
| 065 | 🇺🇸 | New York City | citibike | [`new-york-city/citibike.geojson`](data/stations/new-york-city/citibike.geojson) | [`new-york-city.json`](data/weather/new-york-city.json) |
| 066 | 🇺🇸 | Philadelphia | Indego | [`philadelphia/indego.geojson`](data/stations/philadelphia/indego.geojson) | [`philadelphia.json`](data/weather/philadelphia.json) |
| 067 | 🇺🇸 | San Francisco Bay Area | Bay Wheels | [`san-francisco-bay-area/bay-wheels.geojson`](data/stations/san-francisco-bay-area/bay-wheels.geojson) | [`san-francisco-bay-area.json`](data/weather/san-francisco-bay-area.json) |
| 068 | 🇺🇸 | Santa Cruz | BCycle | [`santa-cruz/bcycle.geojson`](data/stations/santa-cruz/bcycle.geojson) | [`santa-cruz.json`](data/weather/santa-cruz.json) |
| 069 | 🇺🇸 | Washington D.C. | Capital Bikeshare | [`washington-d-c/capital-bikeshare.geojson`](data/stations/washington-d-c/capital-bikeshare.geojson) | [`washington-d-c.json`](data/weather/washington-d-c.json) |
| 070 | 🏴󠁧󠁢󠁥󠁮󠁧󠁿 | Brighton | Beryl | [`brighton/beryl.geojson`](data/stations/brighton/beryl.geojson) | [`brighton.json`](data/weather/brighton.json) |
| 071 | 🏴󠁧󠁢󠁥󠁮󠁧󠁿 | Manchester | Beryl | [`manchester/beryl.geojson`](data/stations/manchester/beryl.geojson) | [`manchester.json`](data/weather/manchester.json) |
| 072 | 🏴󠁧󠁢󠁥󠁮󠁧󠁿 | Norwich | Beryl | [`norwich/beryl.geojson`](data/stations/norwich/beryl.geojson) | [`norwich.json`](data/weather/norwich.json) |
| 073 | 🏴󠁧󠁢󠁥󠁮󠁧󠁿 | Plymouth | Beryl | [`plymouth/beryl.geojson`](data/stations/plymouth/beryl.geojson) | [`plymouth.json`](data/weather/plymouth.json) |
| 074 | 🏴󠁧󠁢󠁥󠁮󠁧󠁿 | Portsmouth | Beryl | [`portsmouth/beryl.geojson`](data/stations/portsmouth/beryl.geojson) | [`portsmouth.json`](data/weather/portsmouth.json) |
| 075 | 🏴󠁧󠁢󠁥󠁮󠁧󠁿 | Southampton | Beryl | [`southampton/beryl.geojson`](data/stations/southampton/beryl.geojson) | [`southampton.json`](data/weather/southampton.json) |
| 076 | 🏴󠁧󠁢󠁳󠁣󠁴󠁿 | Glasgow | Nextbike | [`glasgow/nextbike.geojson`](data/stations/glasgow/nextbike.geojson) | [`glasgow.json`](data/weather/glasgow.json) |
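
The station files can be read with nothing more than the standard library. The sketch below assumes each file is a GeoJSON `FeatureCollection` of `Point` features, which is the usual layout but is not confirmed here. The helper name `station_rows` and the property names in the sample are hypothetical:

```python
import json


def station_rows(geojson_text: str) -> list[dict]:
    """Flatten a GeoJSON FeatureCollection of points into plain dicts."""
    collection = json.loads(geojson_text)
    rows = []
    for feature in collection.get("features", []):
        # GeoJSON stores point coordinates as [longitude, latitude].
        lon, lat = feature["geometry"]["coordinates"][:2]
        # "properties" may be null in GeoJSON, hence the `or {}`.
        rows.append({"lon": lon, "lat": lat, **(feature.get("properties") or {})})
    return rows


# Hypothetical sample; the property names in the real files may differ.
sample = """
{"type": "FeatureCollection", "features": [
  {"type": "Feature",
   "geometry": {"type": "Point", "coordinates": [1.4442, 43.6047]},
   "properties": {"station": "example", "bikes": 3}}
]}
"""
rows = station_rows(sample)
```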

## Archives

The git history contains the state of each station and of the weather forecast at successive points in time. The `archive.py` script turns this history into Parquet files for easy consumption. These files are stored in a GCP bucket, [here](https://console.cloud.google.com/storage/browser?forceOnBucketsSortingFiltering=true&project=bike-sharing-407017&prefix=&forceOnObjectsSortingFiltering=false).
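
The sketch below is not the `archive.py` script itself, only an illustration of how one file's history could be replayed with plain git commands. It assumes `git` is available on the `PATH`. The helper names `parse_log`, `file_versions` and `file_at` are hypothetical:

```python
import subprocess
from datetime import datetime


def parse_log(output: str) -> list[tuple[str, datetime]]:
    """Parse `git log --format='%H %cI'` output into (commit, timestamp) pairs."""
    versions = []
    for line in output.splitlines():
        if not line.strip():
            continue
        commit, stamp = line.split(" ", 1)
        # Some git versions print UTC as "Z", which older Pythons do not parse.
        stamp = stamp.strip().replace("Z", "+00:00")
        versions.append((commit, datetime.fromisoformat(stamp)))
    return versions


def file_versions(repo: str, path: str) -> list[tuple[str, datetime]]:
    """List the commits that touched `path`, in git log order (newest first)."""
    out = subprocess.run(
        ["git", "-C", repo, "log", "--format=%H %cI", "--", path],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_log(out)


def file_at(repo: str, commit: str, path: str) -> str:
    """Return the content of `path` (relative to the repo root) at `commit`."""
    return subprocess.run(
        ["git", "-C", repo, "show", f"{commit}:{path}"],
        check=True, capture_output=True, text=True,
    ).stdout
```

Looping over `file_versions(...)` and calling `file_at(...)` for each commit yields one snapshot per scrape, which can then be flattened into rows and written to any tabular format.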

An easy way to query these files is to use [DuckDB](https://duckdb.org/). The following Python snippet shows how to fetch all the bike station updates for the city of Toulouse:

```py
import duckdb

with duckdb.connect(":memory:") as con:
    # Point DuckDB's S3 client at Google Cloud Storage
    con.execute("SET s3_endpoint='storage.googleapis.com'")
    # The glob matches every Parquet file exported for this city and provider
    updates = con.execute("""
        SELECT *
        FROM READ_PARQUET('s3://bike-sharing-history/toulouse/jcdecaux/*/*.parquet');
    """).fetch_df()
```

And here's a snippet to fetch the 24-hour weather forecast at different points in time for the city of Toulouse:

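```py
import duckdb

with duckdb.connect(":memory:") as con:
    # Point DuckDB's S3 client at Google Cloud Storage
    con.execute("SET s3_endpoint='storage.googleapis.com'")
    # The glob matches every Parquet file exported for this city
    weather = con.execute("""
        SELECT *
        FROM READ_PARQUET('s3://weather-forecast-history/toulouse/*/*.parquet');
    """).fetch_df()
```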

If these exports do not fit your needs, feel free to reach out. Because the git history is the source of truth, the exports can easily be adapted.