https://github.com/cgorman/tensorflow-som

A multi-gpu implementation of the self-organizing map in TensorFlow

neural-networks self-organizing-map tensorflow unsupervised-learning

Host: GitHub
URL: https://github.com/cgorman/tensorflow-som
Owner: cgorman
License: mit
Created: 2018-02-11T02:32:58.000Z (almost 7 years ago)
Default Branch: master
Last Pushed: 2022-08-01T20:44:34.000Z (over 2 years ago)
Last Synced: 2024-10-12T16:35:40.126Z (3 months ago)
Topics: neural-networks, self-organizing-map, tensorflow, unsupervised-learning
Language: Python
Size: 99.6 KB
Stars: 79
Watchers: 9
Forks: 35
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

README

**NOTE:** *This package is no longer maintained and trained SOMs may have issues with stability. I will refrain from archiving this repository for the time being because I may end up releasing an updated version for PyTorch which I will link to before archiving. Like everything else, use this code at your own risk and please do some sanity checks. Thanks!*

# TensorFlow Self-Organizing Map

An implementation of the Kohonen self-organizing map¹ for TensorFlow 1.5 and Python 3.6.

**A TensorFlow V2 version has been contributed by [Dragan Avramovski](https://github.com/dragan-avramovski) and is available in the `tfv2` branch. (Thanks Dragan!)**

This was initially based on [Sachin Joglekar's](https://codesachin.wordpress.com/2015/11/28/self-organizing-maps-with-googles-tensorflow/) code but has a few key modifications:

 * Uses TensorFlow broadcasting semantics instead of `tf.pack` and `for` loops.

 * Input data is expected from a `Tensor` rather than a `tf.placeholder`, allowing for use with faster and more complex input data pipelines.

 * Training uses the batch algorithm rather than the online one, providing a major speed boost if you have the GPU RAM.

 As a result of switching to the batch algorithm, I also added:

 * Multi-GPU support (single machines with multiple GPUs only; multi-node training is not supported).

 * Some summary operations for TensorBoard visualization.
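
To illustrate what "broadcasting instead of `for` loops" means in the first bullet, here is a minimal NumPy sketch of finding the best-matching unit (BMU) for a whole batch at once. It is not this package's TensorFlow code; the function name `best_matching_units` and the toy arrays are made up for the example.

```python
import numpy as np

def best_matching_units(inputs, weights):
    """Return the index of the closest weight vector for every input.

    inputs:  (batch, dim) array of input vectors
    weights: (neurons, dim) array of SOM weight vectors
    """
    # (batch, 1, dim) - (1, neurons, dim) broadcasts to (batch, neurons, dim),
    # so no explicit loop over inputs or neurons is needed.
    diff = inputs[:, np.newaxis, :] - weights[np.newaxis, :, :]
    # Squared Euclidean distance between every input and every neuron
    sq_dist = np.sum(diff ** 2, axis=-1)
    # Closest neuron per input
    return np.argmin(sq_dist, axis=1)

# Hypothetical toy data: three neurons, three inputs, two dimensions
weights = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
inputs = np.array([[0.1, -0.1], [4.0, 6.0], [0.9, 1.2]])
bmus = best_matching_units(inputs, weights)
print(bmus)  # [0 2 1]
```

The intermediate `(batch, neurons, dim)` array is where the GPU RAM requirement of the batch approach comes from: it grows with both the batch size and the map size.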

 `example.py` contains a simple usage example that trains a SOM on a 3-cluster toy dataset. The resulting U-matrix should look something like this:

 ![Example U-Matrix](https://github.com/cgorman/tensorflow-som/blob/master/example_umatrix.png)

 

Note that the example requires scikit-learn to be installed.
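
For readers unfamiliar with the U-matrix, the sketch below shows one common way to compute it: each map node gets the average distance between its weight vector and those of its grid neighbours. This is a NumPy illustration under assumptions (a rectangular grid and a 4-neighbourhood), and `u_matrix` is a hypothetical helper, not necessarily how `example.py` produces its plot.

```python
import numpy as np

def u_matrix(weights):
    """Average distance from each node's weights to its grid neighbours.

    weights: (rows, cols, dim) array of SOM weights laid out on the map grid.
    Assumes a rectangular grid with at least two nodes and uses the
    4-neighbourhood (up, down, left, right).
    """
    rows, cols, _ = weights.shape
    total = np.zeros((rows, cols))
    count = np.zeros((rows, cols))

    # Distances between vertically adjacent nodes, credited to both nodes
    d_vert = np.linalg.norm(weights[1:] - weights[:-1], axis=-1)
    total[1:] += d_vert
    total[:-1] += d_vert
    count[1:] += 1
    count[:-1] += 1

    # Distances between horizontally adjacent nodes, credited to both nodes
    d_horiz = np.linalg.norm(weights[:, 1:] - weights[:, :-1], axis=-1)
    total[:, 1:] += d_horiz
    total[:, :-1] += d_horiz
    count[:, 1:] += 1
    count[:, :-1] += 1

    return total / count

# Hypothetical 2x4 map whose left half sits at 0 and right half at 1
toy_weights = np.tile(np.array([0.0, 0.0, 1.0, 1.0])[None, :, None], (2, 1, 1))
umat = u_matrix(toy_weights)
print(umat)
```

High values mark the boundary between clusters (here, the two middle columns), while low values mark regions where neighbouring nodes have similar weights.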

 I was going to write a blog post about this, but I ended up just repeating everything I wrote in the comments, so please read them if you'd like to understand the code. For reference, the batch formula for SOMs is

 

 ![SOM batch formula](https://github.com/cgorman/tensorflow-som/blob/master/batch_formula.gif)

 

 where theta is the neighborhood function and x is the input vector.
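
As a rough sketch of the usual batch rule, in which each weight vector is replaced by the neighbourhood-weighted mean of the inputs, the NumPy code below performs one update step. It assumes a Gaussian neighbourhood function of width `sigma` on the map grid; the function name `batch_som_update`, the Gaussian choice, and the toy data are assumptions for illustration and may differ from what this package does.

```python
import numpy as np

def batch_som_update(inputs, weights, grid, sigma):
    """One batch update: each weight becomes a theta-weighted mean of the inputs.

    inputs:  (batch, dim) input vectors x
    weights: (neurons, dim) current weight vectors
    grid:    (neurons, 2) map coordinates of each neuron
    sigma:   width of the (assumed) Gaussian neighbourhood function
    """
    # Best-matching unit for each input
    sq_dist = np.sum((inputs[:, None, :] - weights[None, :, :]) ** 2, axis=-1)
    bmu = np.argmin(sq_dist, axis=1)                          # (batch,)

    # theta[j, i]: neighbourhood between the BMU of input j and neuron i
    grid_sq = np.sum((grid[bmu][:, None, :] - grid[None, :, :]) ** 2, axis=-1)
    theta = np.exp(-grid_sq / (2.0 * sigma ** 2))             # (batch, neurons)

    numerator = theta.T @ inputs                              # (neurons, dim)
    denominator = theta.sum(axis=0)[:, None]                  # (neurons, 1)
    return numerator / denominator

# Hypothetical toy map with two neurons side by side
grid = np.array([[0.0, 0.0], [0.0, 1.0]])
weights = np.array([[0.0, 0.0], [10.0, 10.0]])
inputs = np.array([[1.0, 1.0], [-1.0, -1.0], [9.0, 11.0]])
new_weights = batch_som_update(inputs, weights, grid, sigma=0.1)
print(new_weights)
```

With a very narrow neighbourhood, as in this toy call, each neuron simply moves to the mean of the inputs it won. Wider values of `sigma` let neighbouring neurons share inputs.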

 

 The activity function turns the distance between each of the weights and an input vector into a value between 0 and 1, i.e. similar weights elicit a higher activity. The activity function is parameterized with the `output_sensitivity` variable. When this value is close to zero, the range of distances that elicit high activity is wider, and vice versa.

 Here is an example of a few different values of the output sensitivity (`-c` here):

 

 ![Effect of Output Sensitivity Parameter](https://github.com/cgorman/tensorflow-som/blob/master/output_sens.png)
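
To make the effect of the sensitivity concrete, here is a small NumPy sketch that assumes a Gaussian-style activity, `exp(-c * distance**2)`. The exact expression used by the package is not stated above, so treat this form, the function name `activity`, and the parameter `c` as assumptions chosen to match the described behaviour.

```python
import numpy as np

def activity(inputs, weights, c):
    """Map input-to-weight distances to values in (0, 1].

    inputs:  (batch, dim) input vectors
    weights: (neurons, dim) SOM weight vectors
    c:       positive sensitivity; assumed form is exp(-c * squared distance)
    """
    sq_dist = np.sum((inputs[:, None, :] - weights[None, :, :]) ** 2, axis=-1)
    return np.exp(-c * sq_dist)

# Hypothetical data: one input that matches the first neuron exactly
weights = np.array([[0.0, 0.0], [1.0, 0.0]])
inputs = np.array([[0.0, 0.0]])

narrow = activity(inputs, weights, c=1.0)   # larger c: activity falls off quickly
wide = activity(inputs, weights, c=0.1)     # c near zero: distant weights stay active
print(narrow, wide)
```

In both cases a perfect match gives an activity of 1. Only the rate at which activity decays with distance changes.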

 

## Note about the learning rate

There was a really dumb bug in commits before `2a0ee25` where the learning rate (alpha) was computed incorrectly. Instead of shrinking from n to 0, it grew from n to 1.

If you had bad luck with this module previously, this may fix it.

Sorry for the goof!
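
For illustration, a schedule that shrinks from n to 0 could look like the linear decay below. This is only a sketch: the function name `learning_rate` and the linear form are assumptions, and the package's actual schedule may differ.

```python
def learning_rate(initial_rate, epoch, max_epochs):
    """Linearly decay from initial_rate at epoch 0 to 0 at max_epochs (assumed form)."""
    return initial_rate * (1.0 - epoch / max_epochs)

# Hypothetical run: n = 0.5 over 10 epochs
schedule = [learning_rate(0.5, epoch, 10) for epoch in range(11)]
print(schedule[0], schedule[5], schedule[10])  # 0.5 0.25 0.0
```

A quick sanity check like this, confirming that the rate starts at n, never increases, and ends at 0, would have caught the bug described above.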

 ¹http://ieeexplore.ieee.org/document/58325/