{"id":28913898,"url":"https://github.com/glencrawford/australia_rain_tomorrow_binary_classification_prediction","last_synced_at":"2026-05-01T08:31:40.335Z","repository":{"id":138946885,"uuid":"236455045","full_name":"GlenCrawford/australia_rain_tomorrow_binary_classification_prediction","owner":"GlenCrawford","description":"Binary classification model to predict whether or not it will rain tomorrow with a Tensorflow/Keras and scikit-learn neural network.","archived":false,"fork":false,"pushed_at":"2020-02-04T11:19:37.000Z","size":3839,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-06-21T21:11:44.558Z","etag":null,"topics":["binary-classification","classification","keras","machine-learning","neural-network","python","scikit-learn","tensorflow"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/GlenCrawford.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2020-01-27T09:21:21.000Z","updated_at":"2020-07-11T09:16:25.000Z","dependencies_parsed_at":null,"dependency_job_id":"f169ed67-c866-46ff-b504-9639a3aa1b05","html_url":"https://github.com/GlenCrawford/australia_rain_tomorrow_binary_classification_prediction","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/GlenCrawford/australia_rain_tomorrow_binary_classification_prediction","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/GlenCrawford%2Faustral
ia_rain_tomorrow_binary_classification_prediction","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/GlenCrawford%2Faustralia_rain_tomorrow_binary_classification_prediction/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/GlenCrawford%2Faustralia_rain_tomorrow_binary_classification_prediction/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/GlenCrawford%2Faustralia_rain_tomorrow_binary_classification_prediction/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/GlenCrawford","download_url":"https://codeload.github.com/GlenCrawford/australia_rain_tomorrow_binary_classification_prediction/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/GlenCrawford%2Faustralia_rain_tomorrow_binary_classification_prediction/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32490810,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-30T13:12:12.517Z","status":"online","status_checked_at":"2026-05-01T02:00:05.856Z","response_time":64,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["binary-classification","classification","keras","machine-learning","neural-network","python","scikit-learn","tensorflow"],"created_at":"2025-06-21T21:11:04.461Z","updated_at":"2026-05-01T08:31:40.330Z","avatar_url":"https://github.com/GlenCrawford.png","languag
e":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Binary classification machine learning model to predict whether it will rain tomorrow in Australia.\n\nThis is a neural network that performs binary classification: given one day's meteorological observations from a weather station in Australia, it predicts whether it will rain there the next day. The model is trained and tested on a dataset containing about 10 years of daily weather observations from numerous Australian weather stations.\n\nThere are two separate implementations in this project: one using TensorFlow 2 and Keras, and another using scikit-learn.\n\nThe model currently has an accuracy of approximately 87%. Because it rains on fewer than half of the days, the dataset has many more rows where the target \"RainTomorrow\" column is \"No\" than \"Yes\". This means that even a naive guess can be right about 70% of the time, so my goal was to get the model accuracy to somewhere around 90%.\n\nHere is the structure of the dataset used for training and testing, showing the header and two data rows:\n\n| Date       | Location | MinTemp | MaxTemp | Rainfall | Evaporation | Sunshine | WindGustDir | WindGustSpeed | WindDir9am | WindDir3pm | WindSpeed9am | WindSpeed3pm | Humidity9am | Humidity3pm | Pressure9am | Pressure3pm | Cloud9am | Cloud3pm | Temp9am | Temp3pm | RainToday | RISK_MM | RainTomorrow |\n|:----------:|:--------:|:-------:|:-------:|:--------:|:-----------:|:--------:|:-----------:|:-------------:|:----------:|:----------:|:------------:|:------------:|:-----------:|:-----------:|:-----------:|:-----------:|:--------:|:--------:|:-------:|:-------:|:---------:|:-------:|:------------:|\n| 2010-10-20 | Sydney   | 12.9    | 20.3    | 0.2      | 3           | 10.9     | ENE         | 37            | W          | E          | 11           | 26           | 70          | 57          | 1028.8      | 1025.6      | 3        
| 1        | 16.9    | 19.8    | No        | 0       | No           |\n| 2017-06-25 | Brisbane | 11      | 24.2    | 0        | 2.2         | 9.8      | ENE         | 20            | SSW        | NNE        | 2            | 7            | 68          | 53          | 1020.5      | 1017.3      | 6        | 3        | 15.9    | 22.6    | No        | 0       | Yes          |\n\nThe data was sourced from this [Kaggle dataset](https://www.kaggle.com/jsphyg/weather-dataset-rattle-package) compiled by Joe Young and Adam Young, which was in turn sourced from [http://www.bom.gov.au/climate/data](http://www.bom.gov.au/climate/data) and [http://www.bom.gov.au/climate/dwo/](http://www.bom.gov.au/climate/dwo/). This data is available under a Creative Commons (CC) Attribution 3.0 licence. For details on the meaning of each observation, see [this page](http://www.bom.gov.au/climate/dwo/IDCJDW0000.shtml). Copyright Commonwealth of Australia, Bureau of Meteorology.\n\n## Requirements\n\n* Python (developed with version 3.7.4).\n\n* See dependencies.txt for packages and versions (and below to install).\n\n## Data preprocessing\n\nData preprocessing combines Pandas (to drop rows containing NaN values and map Yes/No strings to 1/0 integers), scikit-learn (to standardize numeric features by converting each value to its z-score), and TensorFlow (to one-hot encode categorical features). The model's input layer is thus a combination of pre-normalized numeric features and one-hot encoded categorical features.\n\nThe following columns are not used as features for the model; all others are used:\n\n* __Date:__ Not relevant.\n\n* __RainToday:__ This is just a boolean representation of the numeric column \"Rainfall\". I experimented with adding this feature to the model, but it had no effect on accuracy.\n\n* __RISK_MM:__ This is the amount of rain recorded for the following day, which was used to create the label/target column \"RainTomorrow\". 
This would be used as the target if the model were performing regression rather than classification.\n\n* __RainTomorrow:__ Used as the training label/target.\n\nThe output of the model is a single sigmoid-activated neuron that predicts the target variable \"RainTomorrow\".\n\n## Setup\n\n* Clone the Git repository.\n\n* Install the dependencies:\n\n```bash\npip install -r dependencies.txt\n```\n\n## Run\n\n```bash\npython -W ignore tensor_flow.py\n```\n\nor\n\n```bash\npython -W ignore scikit_learn.py\n```\n\nNote that there is a [current bug in TensorFlow](https://github.com/tensorflow/tensorflow/issues/30609) where deprecation warnings are printed wherever feature columns are used, even though the new feature column API is being used. It has been fixed, and the fix will ship in a future release of TensorFlow. Until then, you will have to live with the warning output.\n\n## Monitoring/logging\n\nAfter training, run:\n\n```\n$ tensorboard --logdir logs/fit\nServing TensorBoard on localhost; to expose to the network, use a proxy or pass --bind_all\nTensorBoard 2.1.0 at http://localhost:6006/ (Press CTRL+C to quit)\n```\n\nThen open the above URL in your browser to view the model in TensorBoard.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fglencrawford%2Faustralia_rain_tomorrow_binary_classification_prediction","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fglencrawford%2Faustralia_rain_tomorrow_binary_classification_prediction","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fglencrawford%2Faustralia_rain_tomorrow_binary_classification_prediction/lists"}