{"id":16679572,"url":"https://github.com/mfelsche/dwd_weather_data","last_synced_at":"2025-10-05T01:39:31.476Z","repository":{"id":34245977,"uuid":"38125938","full_name":"mfelsche/dwd_weather_data","owner":"mfelsche","description":"A command line tool to download german weather data from the DWD CDC FTP server","archived":false,"fork":false,"pushed_at":"2015-09-19T20:01:05.000Z","size":224,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-03-13T08:16:11.554Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":"adafruit/Adafruit_AM2315","license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mfelsche.png","metadata":{"files":{"readme":"README.rst","changelog":"CHANGES.rst","contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2015-06-26T18:02:12.000Z","updated_at":"2023-01-28T16:03:21.000Z","dependencies_parsed_at":"2022-09-14T02:30:58.758Z","dependency_job_id":null,"html_url":"https://github.com/mfelsche/dwd_weather_data","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/mfelsche/dwd_weather_data","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mfelsche%2Fdwd_weather_data","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mfelsche%2Fdwd_weather_data/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mfelsche%2Fdwd_weather_data/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mfelsche%2Fdwd_weather_data/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mfelsche","download_url":"https://codelo
ad.github.com/mfelsche/dwd_weather_data/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mfelsche%2Fdwd_weather_data/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":278399609,"owners_count":25980329,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-04T02:00:05.491Z","response_time":63,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-12T13:36:11.487Z","updated_at":"2025-10-05T01:39:31.460Z","avatar_url":"https://github.com/mfelsche.png","language":"Python","readme":"============\nWeather Data\n============\n\nDownload hourly weather measurements from all over Germany from the\nDWD (Deutscher Wetterdienst) and convert them to gzipped JSON files suitable\nfor importing into your `crate`_ cluster.\n\nRequirements\n============\n\n * ``python3``\n\nInitial Setup for Development\n=============================\n\nIt is suggested to use a virtualenv for development.\n\nBootstrap with python::\n\n    $ /path/to/python bootstrap.py\n    $ bin/buildout -N\n\nDownload Source Data from DWD\n=============================\n\nUse the ``bin/download`` command to download the sources from the DWD FTP server\nto a folder of your choice.\n\nTo download all files to the folder ``/tmp/weather``::\n\n    $ bin/download --download-dir /tmp/weather all\n\nIt is also possible to download separate 
kinds of data only. Instead of ``all``\nuse one or more of these categories:\n\n    * precipitation\n    * sun\n    * air_temperature\n    * solar\n    * soil_temperature\n    * cloudiness\n    * pressure\n    * wind\n\n\nConvert Data to Gzipped JSON\n============================\n\nTo convert the downloaded files, use the ``bin/parse_data`` command.\nThe downloaded data contains one CSV file per station and measurement category;\nthe command merges them into one file per station with all available data.\n\nThe data can be converted into either a normalized or a denormalized dataset.\nBoth variants create one file per station that contains all of its measurement data.\n\nConverting to a normalized dataset creates two additional files:\n\n * ``stations.json.gz`` containing the station ids and names\n * ``station_locations.json.gz`` containing the location and height of the station\n   for a specific period, because a station's location might have changed over time.\n\nUsage\n-----\n\nTo convert all available data in ``/tmp/weather`` into ``json.gz`` files in the ``/tmp/out`` folder\ncontaining a denormalized dataset::\n\n    $ bin/parse_data --download-dir /tmp/weather --out-dir /tmp/out\n\nFor normalized output, use the ``--normalized`` flag.\n\nIt is also possible to only convert data for single stations by giving their\nstation ids (which can be obtained from the names of the downloaded files)::\n\n    $ bin/parse_data --download-dir /tmp/weather --out-dir /tmp/out --station 00003 00044\n\n.. note::\n\n    When converting only single stations for a normalized dataset, the ``stations``\n    and ``station_locations`` files will contain only data for the converted stations.\n    Old data will be overwritten.\n\nData\n====\n\nThe dataset contains about 236 million rows (235603810), each containing one or more measurements at one station\nat one point in time. 
The amount of data might increase once the DWD adds new data.\n\nMeasurements were taken at 1455 different stations. Stations might have changed location\nover time, so the dataset has 3946 distinct locations for those stations across different time periods.\n\nThe earliest measurements were taken at ``Thu Jan 01 03:00:00 UTC 1891``\nin Marburg-Cappel, Kiel-Kronshagen and Kassel-Harleshausen.\n\nSchemas\n=======\n\nDenormalized\n------------\n\nSee ``german_climate_denormalized.sql``.\n\n.. code-block:: sql\n\n    CREATE TABLE german_climate_denormalized (\n      date timestamp,\n      station_id string,\n      station_name string,\n      position geo_point, -- position of the weather station\n      station_height int, -- height of the weather station\n      temp float, -- temperature in °C\n      humility double, -- relative humidity in percent\n      cloudiness int,  -- 0 (cloudless)\n                       -- 1 or less (nearly cloudless)\n                       -- 2 (less cloudy)\n                       -- 3\n                       -- 4 (cloudy)\n                       -- 5\n                       -- 6 (more cloudy)\n                       -- 7 or more (nearly overcast)\n                       -- 8 (overcast)\n                       -- -1 not available\n      rainfall_fallen boolean, -- if some precipitation happened this hour\n      rainfall_height double,  -- precipitation height in mm\n      rainfall_form int, -- 0 - no precipitation\n                         -- 1 - only \"distinct\" (German: \"abgesetzte\") precipitation\n                         -- 2 - only liquid \"distinct\" precipitation (e.g. dew)\n                         -- 3 - only solid \"distinct\" precipitation (e.g. 
frost)\n                         -- 6 - liquid\n                         -- 7 - solid\n                         -- 8 - solid and liquid\n                         -- 9 - no measurement\n      air_pressure double,  -- air pressure (Pa)\n      air_pressure_station_height double, -- air pressure at station height (Pa)\n      ground_temp array(float), -- soil temperature in °C at 2 cm, 5 cm, 10 cm, 20 cm and 50 cm depth\n      sunshine_duration double, -- sum of sunshine duration in that hour in minutes\n      diffuse_sky_radiation double, -- sum of diffuse short-wave sky-radiation in J/cm² for that hour\n      global_radiation double, -- sum of global short-wave radiation in J/cm² for that hour\n      sun_zenith float, -- solar zenith angle (https://en.wikipedia.org/wiki/Solar_zenith_angle) in degrees\n      wind_speed double, -- wind speed in m/s\n      wind_direction int -- wind direction given on a 36-part wind rose\n    ) clustered by (station_id) with (number_of_replicas=0, refresh_interval=0);\n\n\nNormalized\n----------\n\nThis example schema uses a custom schema name.\n\nSee ``german_climate_normalized.sql``.\n\n.. 
code-block:: sql\n\n    -- a weather station\n    CREATE TABLE german_climate.stations (\n      id string primary key,\n      name string\n    ) with (number_of_replicas=0, refresh_interval=0); -- settings for import purposes only\n\n    -- the location of a weather station which might have changed over time\n    CREATE TABLE german_climate.station_locations (\n      station_id string,\n      position geo_point,\n      height int, -- height in m\n      from_date timestamp, -- station has been at this location from this point in time (inclusive)\n      to_date timestamp    -- station has been at this location up to that point in time (inclusive)\n    ) clustered by (station_id)\n    with (number_of_replicas=0, refresh_interval=0); -- settings for import purposes only\n\n\n    -- the actual measurement\n    -- might not contain data for every possible column\n    CREATE TABLE german_climate.data (\n      date timestamp primary key,\n      station_id string primary key,\n      temp float, -- temperature in °C\n      humility double, -- relative humidity in percent\n      cloudiness int,  -- 0 (cloudless)\n                       -- 1 or less (nearly cloudless)\n                       -- 2 (less cloudy)\n                       -- 3\n                       -- 4 (cloudy)\n                       -- 5\n                       -- 6 (more cloudy)\n                       -- 7 or more (nearly overcast)\n                       -- 8 (overcast)\n                       -- -1 not available\n      rainfall_fallen boolean, -- if some precipitation happened this hour\n      rainfall_height double,  -- precipitation height in mm\n      rainfall_form int, -- 0 - no precipitation\n                         -- 1 - only \"distinct\" (German: \"abgesetzte\") precipitation\n                         -- 2 - only liquid \"distinct\" precipitation (e.g. dew)\n                         -- 3 - only solid \"distinct\" precipitation (e.g. 
frost)\n                         -- 6 - liquid\n                         -- 7 - solid\n                         -- 8 - solid and liquid\n                         -- 9 - no measurement\n      air_pressure double,  -- air pressure (Pa)\n      air_pressure_station_height double, -- air pressure at station height (Pa)\n      ground_temp array(float), -- soil temperature in °C at 2 cm, 5 cm, 10 cm, 20 cm and 50 cm depth\n      sunshine_duration double, -- sum of sunshine duration in that hour in minutes\n      diffuse_sky_radiation double, -- sum of diffuse short-wave sky-radiation in J/cm² for that hour\n      global_radiation double, -- sum of global short-wave radiation in J/cm² for that hour\n      sun_zenith float, -- solar zenith angle (https://en.wikipedia.org/wiki/Solar_zenith_angle) in degrees\n      wind_speed double, -- wind speed in m/s\n      wind_direction int -- wind direction given on a 36-part wind rose\n    ) clustered by (station_id) with (number_of_replicas=0, refresh_interval=0);\n\n\n.. _crate: https://crate.io\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmfelsche%2Fdwd_weather_data","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmfelsche%2Fdwd_weather_data","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmfelsche%2Fdwd_weather_data/lists"}