Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/francesco-mannella/kmeans-kohonen

Last synced: 21 days ago
JSON representation

Host: GitHub
URL: https://github.com/francesco-mannella/kmeans-kohonen
Owner: francesco-mannella
Created: 2018-03-15T08:35:42.000Z (almost 7 years ago)
Default Branch: master
Last Pushed: 2020-08-10T17:27:22.000Z (over 4 years ago)
Last Synced: 2024-11-07T09:24:04.576Z (2 months ago)
Language: Python
Size: 68.2 MB
Stars: 0
Watchers: 3
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

Awesome Lists containing this project

README

        # kmeans-kohonen 

![](som.gif)

The [k-means](https://goo.gl/6qvLx2) algorithm and the [kohonen self organizing map](https://goo.gl/8bNsh) are very closely related:

## K-means

the k-means algorithm is based on optimizing the distance of the patterns in the dataset from k different prototypes.

k-means algorithm:

* **iterate**

  * **E-step** Each pattern is matched over all prototypes. The nearest prototipe indicate the cluster to which the pattern belongs 

  * **M-step** The mean of the patterns in a cluster replaces the prototypes for that cluster 

The cost function to minimize is just the sum of the squared distances of the dataset patterns from the related "winner" prototypes (nearest).

### Toy example [(code)](kmeans-toy.py):

    dataset

    

    # each row is a pattern 

    x =  [[1.0, 0.0, 0.0],

          [0.0, 1.0, 0.0],

          [0.0, 0.0, 1.0]]

           

    initial weights

    

    # each row is a prototype 

    c = [[0.02, 0.01, 0.03],

         [0.01, 0.04, 0.07],

         [0.09, 0.02, 0.02]]

**iter 0**

		

		squared_distances:

			   c0	   c1	   c2

		x0	0.961	0.987	0.829	

			    0	    0	    1	

		x1	0.981	0.927	0.969	

			    0	    1	    0	

		x2	0.941	0.867	0.969	

			    0	    1	    0	

		

		cost function:

		0.9104 + 0.9626 + 0.9309 = 2.8040

		

		optimized weights:

		[[0.02  0.01  0.015]

		 [0.01  0.52  0.07 ]

		 [0.09  0.01  0.02 ]]

		

**iter 1**

		

		squared_distances:

			   c0	   c1	   c2

		x0	0.961	1.255	0.829	

			    0	    0	    1	

		x1	0.981	0.235	0.989	

			    0	    1	    0	

		x2	0.971	1.135	0.969	

			    0	    0	    1	

		

		cost function:

		0.9103 + 0.4852 + 0.9842 = 2.3796

		

		optimized weights:

		[[0.02  0.01  0.007]

		 [0.01  0.76  0.07 ]

		 [0.09  0.01  0.51 ]]

		

**... ...**

**iter 9**

		

		squared_distances:

			   c0	   c1	   c2

		x0	0.000	1.981	1.821	

			    1	    0	    0	

		x1	1.965	0.005	1.981	

			    0	    1	    0	

		x2	1.970	1.861	0.008	

			    0	    0	    1	

		

		cost function:

		0.0147 + 0.0707 + 0.0906 = 0.1760

		

		optimized weights:

		[[0.996 0.01  0.007]

		 [0.01  0.999 0.07 ]

		 [0.09  0.01  0.998]]

## Kohonen map 

A kohonen map can be viewed as a version of the k-means algorithm in which the squared distance selected to compute the cost function for each input pattern is not only the closest one but also a neighborhood of it in the layer of prototypes.

### Toy example (same data of k-means - see [code](kmeans-toy.py)):

**iter 0**

		

		squared_distances:

			   c0	   c1	   c2

		x0	0.961	0.987	0.829	

			0.018	0.368	1.000	

		x1	0.981	0.927	0.969	

			0.368	1.000	0.368	

		x2	0.941	0.867	0.969	

			0.368	1.000	0.368	

		

		cost function:

		0.9805 + 0.3654 + 0.0167 + 0.3644 + 0.9626 + 0.3621 + 0.0178 + 0.3425 + 0.9843 = 4.3963

		

		optimized weights:

		[[0.029 0.008 0.015]

		 [0.008 0.52  0.057]

		 [0.073 0.01  0.2  ]]

		

 **iter 1**

		

		squared_distances:

			   c0	   c1	   c2

		x0	0.943	1.257	0.899	

			0.018	0.368	1.000	

		x1	0.985	0.234	1.026	

			0.368	1.000	0.368	

		x2	0.971	1.159	0.645	

			0.018	0.368	1.000	

		

		cost function:

		0.9712 + 0.4125 + 0.0174 + 0.3651 + 0.4835 + 0.3726 + 0.0180 + 0.3961 + 0.8032 = 3.8395

		

		optimized weights:

		[[0.038 0.007 0.007]

		 [0.007 0.76  0.047]

		 [0.073 0.008 0.6  ]]

		

**... ...**

		

**iter 9**

		

		squared_distances:

			   c0	   c1	   c2

		x0	0.000	1.993	1.862	

			1.000	0.368	0.018	

		x1	1.982	0.000	1.994	

			0.368	1.000	0.368	

		x2	1.971	1.974	0.005	

			0.018	0.368	1.000	

		

		cost function:

		0.0104 + 0.5194 + 0.0250 + 0.5179 + 0.0115 + 0.5195 + 0.0257 + 0.5169 + 0.0683 = 2.2146

		

		optimized weights:

		[[0.996 0.001 0.007]

		 [0.001 0.999 0.009]

		 [0.068 0.002 0.998]]

			

## Further codes 

* ***[kmeans.py](algs/kmeans.py)*** tensorflow implementation of kmeans. Used for the clustering of the [MNIST](https://en.wikipedia.org/wiki/MNIST_database) dataset

* ***[kohonen.py](algs/kohonen.py)*** tensorflow implementation of kohonen. Used for the clustering of the [MNIST](https://en.wikipedia.org/wiki/MNIST_database) dataset