https://github.com/stephantul/somber

Recursive Self-Organizing Map/Neural Gas.

cython kohonen machine-learning neural-gas ng plsom recsom recurrent-neural-networks som unsupervised


SOMBER
======

**somber** (Somber Organizes Maps By Enabling Recurrence) is a collection of NumPy/Python implementations of various kinds of *Self-Organizing Maps* (SOMs), with a focus on SOMs for sequence data.

To the best of my knowledge, the sequential SOM algorithms implemented in this package haven't been open-sourced yet. If you do find examples, please let me know, so I can compare and link to them.

The package currently contains implementations of:

* Regular SOM (SOM) (Kohonen, various publications)
* Recursive SOM (RecSOM) (Voegtlin, 2002)
* Neural Gas (NG) (Martinetz & Schulten, 1991)
* Recursive Neural Gas (Voegtlin, 2002)
* Parameterless SOM (PLSOM) (Berglund & Sitte, 2007)

Because these sequential SOMs rely on internal dynamics for convergence, i.e. they are not trained against an external label the way a regular Recurrent Neural Network is, processing in a sequential SOM is currently strictly online: every example is processed separately, and the weights are updated after every example. Research into batching and/or multi-threading is currently underway.
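
To make "strictly online" concrete, here is a minimal NumPy sketch of a plain (non-recurrent) SOM pass in which the weights change after every single example. It is a generic textbook-style update under stated assumptions, not SOMBER's actual implementation: the function name ``online_som_epoch``, the Gaussian neighborhood, and the fixed ``sigma`` are all illustrative choices.

```python
import numpy as np


def online_som_epoch(weights, grid, X, learning_rate=0.3, sigma=1.0):
    """Run one online pass over X, updating weights after every example.

    weights: (num_neurons, N) float array, updated in place.
    grid:    (num_neurons, 2) array with each neuron's map coordinates.
    """
    for x in X:
        # Best matching unit: the neuron whose weight vector is closest to x.
        bmu = np.argmin(np.sum((weights - x) ** 2, axis=1))
        # Gaussian neighborhood around the BMU, measured on the map grid.
        grid_dist = np.sum((grid - grid[bmu]) ** 2, axis=1)
        influence = np.exp(-grid_dist / (2.0 * sigma ** 2))
        # Move every neuron toward x, scaled by its neighborhood influence.
        weights += learning_rate * influence[:, None] * (x - weights)
    return weights


rng = np.random.default_rng(0)
height, width, num_features = 5, 5, 3
grid = np.array([(i, j) for i in range(height) for j in range(width)],
                dtype=float)
weights = rng.random((height * width, num_features))
X = rng.random((100, num_features))

for _ in range(10):
    online_som_epoch(weights, grid, X)
```

Because each update depends on the weights left behind by the previous example, the loop over ``X`` cannot simply be replaced by one vectorized batch step without changing the algorithm.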

If you need a fast regular SOM, check out SOMPY, which is a direct port of the MATLAB SOM Toolbox.

Usage
-----

Care has been taken to make SOMBER easy to use, and to make it behave like a drop-in replacement in sklearn-like systems.
The non-recurrent SOMs take as input ``[M * N]`` arrays, where M is the number of samples and N is the number of features.
The recurrent SOMs take as input ``[M * S * N]`` arrays, where M is the number of sequences, S is the number of items per sequence, and N is the number of features.
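
The two input layouts can be illustrated with plain NumPy. This sketch only builds arrays of the described shapes; the variable names and the choice of cutting one long sequence into equal-length chunks are assumptions made for illustration, not something SOMBER requires.

```python
import numpy as np

rng = np.random.default_rng(0)

# Non-recurrent input: M samples, each with N features.
M, N = 15, 3
X_flat = rng.random((M, N))

# Recurrent input: M_seq sequences, each with S items of N features.
# Here one long sequence of M_seq * S items is cut into equal chunks.
M_seq, S = 4, 25
long_sequence = rng.random((M_seq * S, N))
X_seq = long_sequence.reshape(M_seq, S, N)

print(X_flat.shape)  # (15, 3)
print(X_seq.shape)   # (4, 25, 3)
```

With this layout, ``X_seq[i, t]`` is the feature vector of item ``t`` in sequence ``i``.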

Examples
--------

Colors
------

Color clustering is a kind of ``Hello, World`` for SOMs, because it nicely demonstrates how SOMs create a continuous mapping.
The color dataset is taken from a blog post.

.. code-block:: python

    import matplotlib.pyplot as plt
    import numpy as np

    from somber import Som

    # Each row is one RGB color with values in [0, 1].
    X = np.array([[0., 0., 0.],
                  [0., 0., 1.],
                  [0., 0., 0.5],
                  [0.125, 0.529, 1.0],
                  [0.33, 0.4, 0.67],
                  [0.6, 0.5, 1.0],
                  [0., 1., 0.],
                  [1., 0., 0.],
                  [0., 1., 1.],
                  [1., 0., 1.],
                  [1., 1., 0.],
                  [1., 1., 1.],
                  [.33, .33, .33],
                  [.5, .5, .5],
                  [.66, .66, .66]])

    color_names = ['black', 'blue', 'darkblue', 'skyblue',
                   'greyblue', 'lilac', 'green', 'red',
                   'cyan', 'violet', 'yellow', 'white',
                   'darkgrey', 'mediumgrey', 'lightgrey']

    # initialize
    s = Som((10, 10), learning_rate=0.3)

    # train
    # 10 epochs with 10 updates per epoch = 100 updates to the parameters.
    s.fit(X, num_epochs=10, updates_epoch=10)

    # predict: get the index of each best matching unit.
    predictions = s.predict(X)
    # quantization error: how well do the best matching units fit?
    quantization_error = s.quantization_error(X)
    # inversion: associate each node with the exemplar that fits best.
    inverted = s.invert_projection(X, color_names)
    # mapping: get weights, mapped to the grid points of the SOM.
    mapped = s.map_weights()

    # show the map as an image; each grid point is colored by its weights.
    plt.imshow(mapped)
    plt.show()
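
As a rough illustration of what "best matching unit" and "quantization error" mean in the example above, the following NumPy sketch computes both for a tiny hand-made weight matrix. It assumes Euclidean distance and a per-example error; the helper name ``best_matching_units`` is hypothetical, and SOMBER's own ``predict`` and ``quantization_error`` may differ in detail.

```python
import numpy as np


def best_matching_units(weights, X):
    """Return the index of, and distance to, the closest neuron per example.

    weights: (num_neurons, N) array of neuron weight vectors.
    X:       (M, N) array of examples.
    """
    # Pairwise Euclidean distances, shape (M, num_neurons).
    dists = np.linalg.norm(X[:, None, :] - weights[None, :, :], axis=2)
    return dists.argmin(axis=1), dists.min(axis=1)


# Three neurons: black, white and red.
weights = np.array([[0., 0., 0.],
                    [1., 1., 1.],
                    [1., 0., 0.]])
# Three colors: near-black, near-white and exactly red.
X = np.array([[0.1, 0., 0.],
              [0.9, 1., 1.],
              [1., 0., 0.]])

bmus, errors = best_matching_units(weights, X)
print(bmus)    # [0 1 2]
print(errors)  # approximately [0.1, 0.1, 0.0]
```

A low error for every example means each input has a neuron close to it, which is what a well-trained map should achieve.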

Sequences
---------

In this example, we will show that the RecursiveSOM is able to memorize short sequences generated by a Markov chain.
We will also demonstrate that the RecursiveSOM can generate sequences that are consistent with the sequences on which it was trained.

.. code-block:: python

    import numpy as np

    from somber import RecursiveSom
    from string import ascii_lowercase

    # Dumb sequence generator.
    def seq_gen(num_to_gen, probas):
        """Sample a one-hot encoded Markov chain, starting from state 0."""
        symbols = ascii_lowercase[:probas.shape[0]]
        identities = np.eye(probas.shape[0])
        seq = []
        ids = []
        r = 0
        choices = np.arange(probas.shape[0])
        for _ in range(num_to_gen):
            # pick the next state using the row of the current state.
            r = np.random.choice(choices, p=probas[r])
            ids.append(symbols[r])
            seq.append(identities[r])

        return np.array(seq), ids

    # Transition probabilities.
    # after an A, we have a 50% chance of B or C
    # after B, we have a 100% chance of A
    # after C, we have a 50% chance of B or C
    # therefore, we will never expect sequential A or B, but we do expect
    # sequential C.
    probas = np.array(((0.0, 0.5, 0.5),
                       (1.0, 0.0, 0.0),
                       (0.0, 0.5, 0.5)))

    X, ids = seq_gen(10000, probas)

    # initialize
    # alpha = contribution of non-recurrent part to the activation.
    # beta = contribution of recurrent part to activation.
    # a higher alpha to beta ratio puts more weight on the current input.
    s = RecursiveSom((10, 10),
                     learning_rate=0.3,
                     alpha=1.2,
                     beta=.9)

    # train
    # show a progressbar.
    s.fit(X, num_epochs=100, updates_epoch=10, show_progressbar=True)

    # predict: get the index of each best matching unit.
    predictions = s.predict(X)
    # quantization error: how well do the best matching units fit?
    quantization_error = s.quantization_error(X)

    # inversion: associate each node with the exemplar that fits best.
    inverted = s.invert_projection(X, ids)

    # find which sequences are mapped to which neuron.
    receptive_field = s.receptive_field(X, ids)

    # generate some data by starting from some position.
    # the position can be anything, but must have a dimensionality
    # equal to the number of neurons.
    starting_pos = np.ones(s.num_neurons)
    generated_indices = s.generate(50, starting_pos)

    # turn the generated indices into a sequence of symbols.
    generated_seq = inverted[generated_indices]
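
One way to check whether a generated sequence is "consistent" with the training data is to count symbol pairs that the Markov chain can never produce. The sketch below does this with plain NumPy, independently of SOMBER; the helper name ``count_forbidden_transitions`` is hypothetical, and it assumes symbols are the lowercase letters used in the example above.

```python
import numpy as np
from string import ascii_lowercase


def count_forbidden_transitions(symbols, probas):
    """Count adjacent symbol pairs that have zero probability under probas."""
    index = {s: i for i, s in enumerate(ascii_lowercase[:probas.shape[0]])}
    forbidden = 0
    for prev, curr in zip(symbols, symbols[1:]):
        if probas[index[prev], index[curr]] == 0.0:
            forbidden += 1
    return forbidden


probas = np.array(((0.0, 0.5, 0.5),
                   (1.0, 0.0, 0.0),
                   (0.0, 0.5, 0.5)))

# Only uses transitions the chain allows.
print(count_forbidden_transitions(list("abacccba"), probas))  # 0
# Contains "aa", "bb" and "bc", none of which the chain can produce.
print(count_forbidden_transitions(list("aabbc"), probas))     # 3
```

Applied to ``generated_seq`` from the example, a count of zero would indicate that the map only reproduces transitions it has actually seen.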

TODO
----

See issues for TODOs/enhancements. If you use SOMBER, feel free to send me suggestions!

Contributors
------------

* Stéphan Tulkens

LICENSE
-------

MIT