https://github.com/anas436/weather-station-clustering-using-dbscan-and-scikit-learn-with-python


Host: GitHub
URL: https://github.com/anas436/weather-station-clustering-using-dbscan-and-scikit-learn-with-python
Owner: Anas436
Created: 2022-09-08T17:44:47.000Z (almost 4 years ago)
Default Branch: main
Last Pushed: 2022-09-08T17:52:55.000Z (almost 4 years ago)
Last Synced: 2025-03-27T10:48:07.724Z (over 1 year ago)
Topics: basemap, cluster, csv, dbscan, density-based-clustering, jupyterlab, matplotlib, numpy, pandas, pylab, python3, rcparams, scikit-learn, sklearn, warnings
Language: Jupyter Notebook
Homepage:
Size: 1.57 MB
Stars: 0
Watchers: 2
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

README

# Weather-Station-Clustering-using-DBSCAN-and-Scikit-learn-with-Python

## Objectives

After completing this lab you will be able to:

* Use DBSCAN to perform density-based clustering
* Use Matplotlib to plot clusters

Most of the traditional clustering techniques, such as k-means, hierarchical and fuzzy clustering, can be used to group data without supervision.

However, when applied to tasks with arbitrarily shaped clusters, or clusters nested within other clusters, the traditional techniques may not achieve good results. That is, elements in the same cluster might not share enough similarity, or the performance may be poor.
In contrast, density-based clustering locates regions of high density that are separated from one another by regions of low density. Density, in this context, is defined as the number of points within a specified radius.
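
To make the definition of density concrete, here is a minimal sketch that counts, for each point, how many points fall within a given radius. The points, the `radius` value, the `min_points` threshold, and the helper name `neighbor_counts` are all made up for illustration; this is not code from the lab notebook.

```python
import math

def neighbor_counts(points, radius):
    """For each point, count the points (itself included) within `radius`."""
    counts = []
    for p in points:
        n = sum(1 for q in points if math.dist(p, q) <= radius)
        counts.append(n)
    return counts

# Hypothetical 2-D points: a tight group of four plus one isolated point.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (8.0, 8.0)]
radius = 1.5     # neighborhood radius (assumed value)
min_points = 3   # assumed threshold for calling a neighborhood "dense"

counts = neighbor_counts(points, radius)
is_dense = [c >= min_points for c in counts]

print(counts)    # [4, 4, 4, 4, 1]
print(is_dense)  # [True, True, True, True, False]
```

The four grouped points each see four points within the radius, so they sit in a high-density region; the isolated point sees only itself and would be treated as low density.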

In this section, the main focus will be manipulating the data and properties of DBSCAN and observing the resulting clustering.

## Weather Station Clustering using DBSCAN & scikit-learn

DBSCAN is particularly well suited to tasks such as class identification in a spatial context. A key strength of the algorithm is that it can find clusters of arbitrary shape without being affected by noise. The example in this lab clusters the locations of weather stations in Canada.

DBSCAN can be used here, for instance, to find groups of stations that show the same weather conditions. It not only finds clusters of arbitrary shape, but can also isolate the denser parts of the data while ignoring less dense areas and noise.
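
As a rough illustration of arbitrary shapes and noise handling, the sketch below runs scikit-learn's `DBSCAN` on synthetic points: a ring, a small group inside the ring, and one far-away point. The data, the helper `circle`, and the `eps` and `min_samples` values are assumptions chosen for this toy example, not settings taken from the lab.

```python
import numpy as np
from sklearn.cluster import DBSCAN

def circle(radius, n):
    """Return n evenly spaced 2-D points on a circle of the given radius."""
    angles = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    return np.column_stack([radius * np.cos(angles), radius * np.sin(angles)])

ring = circle(5.0, 40)              # outer ring-shaped cluster
inner = circle(0.5, 10)             # small cluster nested inside the ring
outlier = np.array([[20.0, 20.0]])  # isolated point far from everything
X = np.vstack([ring, inner, outlier])

# eps and min_samples are assumed values tuned to this toy data.
labels = DBSCAN(eps=1.0, min_samples=3).fit_predict(X)

# scikit-learn marks noise points with the label -1.
n_clusters = len(set(labels)) - (1 if -1 in labels else 0)
print(n_clusters)   # 2
print(labels[-1])   # -1
```

The ring and the inner group come out as two separate clusters even though one encloses the other, and the isolated point is labeled as noise rather than being forced into a cluster.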

Let's start playing with the data. We will be working according to the following workflow:

1. Loading data
2. Data overview
3. Data cleaning
4. Data selection
5. Clustering

| Name in the table | Meaning |
| --- | --- |
| Stn_Name | Station Name |
| Lat | Latitude (North +, degrees) |
| Long | Longitude (West -, degrees) |
| Prov | Province |
| Tm | Mean Temperature (°C) |
| DwTm | Days without Valid Mean Temperature |
| D | Mean Temperature difference from Normal (1981-2010) (°C) |
| Tx | Highest Monthly Maximum Temperature (°C) |
| DwTx | Days without Valid Maximum Temperature |
| Tn | Lowest Monthly Minimum Temperature (°C) |
| DwTn | Days without Valid Minimum Temperature |
| S | Snowfall (cm) |
| DwS | Days without Valid Snowfall |
| S%N | Percent of Normal (1981-2010) Snowfall |
| P | Total Precipitation (mm) |
| DwP | Days without Valid Precipitation |
| P%N | Percent of Normal (1981-2010) Precipitation |
| S_G | Snow on the ground at the end of the month (cm) |
| Pd | Number of days with Precipitation 1.0 mm or more |
| BS | Bright Sunshine (hours) |
| DwBS | Days without Valid Bright Sunshine |
| BS% | Percent of Normal (1981-2010) Bright Sunshine |
| HDD | Degree Days below 18 °C |
| CDD | Degree Days above 18 °C |
| Stn_No | Climate station identifier (first 3 digits indicate drainage basin, last 4 characters are for sorting alphabetically) |
| NA | Not Available |
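
The workflow steps listed earlier can be sketched end to end with a few of the columns described above. Everything in this example is an assumption made for illustration: the station rows are invented, the choice of `Lat`, `Long`, and `Tm` as features is one possibility among several, and the scaling step and the `eps` and `min_samples` values are not taken from the lab notebook.

```python
import io

import pandas as pd
from sklearn.cluster import DBSCAN
from sklearn.preprocessing import StandardScaler

# Hypothetical rows using column names from the table; "NA" means Not Available.
csv_text = """Stn_Name,Lat,Long,Prov,Tm
STATION A1,49.0,-123.0,BC,8.0
STATION A2,49.1,-123.1,BC,8.2
STATION A3,49.2,-122.9,BC,7.9
STATION X,50.0,-110.0,AB,NA
STATION B1,45.0,-75.0,ON,-10.0
STATION B2,45.1,-75.2,ON,-10.3
STATION B3,45.2,-74.9,ON,-9.8
STATION O,70.0,-100.0,NU,-30.0
"""

# 1. Loading data (pandas reads the string "NA" as a missing value by default)
df = pd.read_csv(io.StringIO(csv_text))

# 2. Data overview
print(df.shape)
print(df.head())

# 3. Data cleaning: drop stations without a valid mean temperature
clean = df.dropna(subset=["Tm"]).reset_index(drop=True)

# 4. Data selection: location plus mean temperature (assumed feature set)
features = clean[["Lat", "Long", "Tm"]]

# 5. Clustering: standardize so no single column dominates, then run DBSCAN
scaled = StandardScaler().fit_transform(features)
clean["Cluster"] = DBSCAN(eps=0.3, min_samples=3).fit_predict(scaled)

print(clean[["Stn_Name", "Cluster"]])
```

With these invented rows, the two groups of nearby stations with similar temperatures end up in separate clusters, and the remote, much colder station is labeled `-1` (noise). On real data, `eps` and `min_samples` would need to be tuned rather than copied from this sketch.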
