https://github.com/aka-buccia/progetto-statistica
EDA, Classification, and Linear Regression on a Weather Dataset for the NS-25 project
- Host: GitHub
- URL: https://github.com/aka-buccia/progetto-statistica
- Owner: aka-buccia
- License: mit
- Created: 2025-06-05T09:50:33.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2025-07-17T09:29:20.000Z (12 months ago)
- Last Synced: 2025-07-17T13:09:18.383Z (12 months ago)
- Topics: classification, mle, python, regression, statistics
- Language: Jupyter Notebook
- Homepage:
- Size: 12.9 MB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Numerical Statistics 25 - Weather Dataset analysis

## General
- `/scripts/classification.py` contains the code for EDA and classification
- `/scripts/linear_regression.py` contains the code for linear regression

The linear regression analysis requires a processed (cleaned) dataset, so run `classification.py` at least once before executing `linear_regression.py`.
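
The run order above can be sketched as a small helper that decides which scripts still need to be executed. This is only an illustration: the function name `scripts_to_run` and the path `data/cleaned_dataset.csv` are hypothetical, since this README does not state where `classification.py` writes the cleaned dataset.

```python
from pathlib import Path


def scripts_to_run(cleaned_dataset):
    """List the scripts to execute, in order, for the regression analysis."""
    steps = []
    # The cleaned dataset is assumed to be produced by classification.py,
    # so that script is scheduled first whenever the file is missing.
    if not Path(cleaned_dataset).exists():
        steps.append("scripts/classification.py")
    steps.append("scripts/linear_regression.py")
    return steps


# Hypothetical location of the cleaned dataset; adjust to the real one.
for script in scripts_to_run("data/cleaned_dataset.csv"):
    print("python3", script)
```

The helper only prints the commands instead of running them, so it is safe to try before touching any data.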
## Report
The `/reports` directory contains:
- **report.pdf** and **report.ipynb**: a detailed report of the analysis
- **project_presentation.pdf**: a general presentation of the project for exam evaluation
## Change city
The dataset contains data for 18 European cities. My analysis was conducted on Oslo, but the code is designed to work with other cities as well: you only need to change the value of the variable `citta` in both scripts. The selected city must have all the weather parameters recorded. Among the 18 cities, only the following meet this requirement:
- Budapest
- Dusseldorf
- Maastricht
- Munchen
- Oslo
Feel free to modify the `citta` variable to explore the data for any of these cities.
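
The requirement that a city has every weather parameter recorded can be checked programmatically. The sketch below is not taken from the project scripts: it assumes that column names follow a `CITY_parameter` pattern, and the helper `cities_with_all_parameters`, the toy column names, and the parameter list are all hypothetical.

```python
from collections import defaultdict


def cities_with_all_parameters(columns, parameters):
    """Return the cities that have one column for every parameter."""
    found = defaultdict(set)
    for column in columns:
        for parameter in parameters:
            suffix = "_" + parameter
            if column.endswith(suffix):
                # Everything before the parameter suffix is the city name.
                city = column[: -len(suffix)]
                found[city].add(parameter)
                break
    required = set(parameters)
    return sorted(city for city, params in found.items() if params == required)


# Hypothetical column names, not copied from the real dataset.
toy_columns = [
    "DATE",
    "OSLO_temp_mean", "OSLO_humidity", "OSLO_wind_speed",
    "ROMA_temp_mean", "ROMA_humidity",
]
toy_parameters = ["temp_mean", "humidity", "wind_speed"]

citta = "OSLO"  # same variable name as in the scripts; the value format is assumed
complete = cities_with_all_parameters(toy_columns, toy_parameters)
print(complete)
print(citta in complete)
```

With the real dataset, `columns` would be the header of the data file and `parameters` the full list of weather parameters used by the analysis.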
## Dependencies
Install the dependencies with:
```bash
pip install -r requirements.txt
```
Optionally, you can create a virtual environment first:
```bash
python3 -m venv weather_project
source weather_project/bin/activate
pip install -r requirements.txt
```
To deactivate the environment when you're done:
```bash
deactivate
```
## Reference
- Klein Tank, A.M.G. and Coauthors, 2002. Daily dataset of 20th-century surface air temperature and precipitation series for the European Climate Assessment. Int. J. of Climatol., 22, 1441-1453.
Data and metadata available at
- Florian Huber, Dafne van Kuppevelt, Peter Steinbach, Colin Sauze, Yang Liu, Berend Weel, "Will the sun shine? – An accessible dataset for teaching machine learning and deep learning", DOI TO BE ADDED!
Data and metadata available at
- Dataset available on Kaggle at