https://github.com/stephantul/somber

Recursive Self-Organizing Map/Neural Gas.

cython kohonen machine-learning neural-gas ng plsom recsom recurrent-neural-networks som unsupervised

Host: GitHub
URL: https://github.com/stephantul/somber
Owner: stephantul
License: mit
Created: 2016-11-07T09:29:20.000Z (over 9 years ago)
Default Branch: master
Last Pushed: 2022-12-27T15:43:55.000Z (over 3 years ago)
Last Synced: 2025-06-28T11:43:40.798Z (about 1 year ago)
Topics: cython, kohonen, machine-learning, neural-gas, ng, plsom, recsom, recurrent-neural-networks, som, unsupervised
Language: Python
Homepage:
Size: 497 KB
Stars: 52
Watchers: 6
Forks: 14
Open Issues: 6
Metadata Files:
- Readme: README.rst
- License: LICENSE

SOMBER
======

**somber** (Somber Organizes Maps By Enabling Recurrence) is a collection of NumPy/Python implementations of various kinds of *Self-Organizing Maps* (SOMs), with a focus on SOMs for sequence data.

To the best of my knowledge, the sequential SOM algorithms implemented in this package haven't been open-sourced yet. If you do find examples, please let me know, so I can compare and link to them.

The package currently contains implementations of:

  * Regular SOM (Kohonen, various publications)

  * Recursive SOM (RecSOM) (Voegtlin, 2002)

  * Neural Gas (NG) (Martinetz & Schulten, 1991)

  * Recursive Neural Gas (Voegtlin, 2002)

  * Parameterless SOM (Berglund & Sitte, 2007)

Because these sequential SOMs rely on internal dynamics for convergence (unlike a regular recurrent neural network, they are not trained toward an external label), processing in a sequential SOM is currently strictly online. This means that every example is processed separately, and the weights are updated after every example. Research into batching and/or multi-threading is currently underway.
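To make "strictly online" concrete, here is a minimal sketch of a single-example update for a plain (non-recurrent) SOM in NumPy. It is not somber's implementation: the function name ``online_som_step``, the Gaussian neighborhood, and the fixed ``sigma`` are assumptions chosen for illustration.

```python
import numpy as np


def online_som_step(weights, grid, x, learning_rate, sigma):
    """Update all SOM weights in place for a single example x (illustrative only)."""
    # Best matching unit: the neuron whose weight vector is closest to x.
    bmu = int(np.argmin(np.linalg.norm(weights - x, axis=1)))
    # Gaussian neighborhood on the map grid, centered on the BMU.
    grid_dist = np.linalg.norm(grid - grid[bmu], axis=1)
    influence = np.exp(-(grid_dist ** 2) / (2 * sigma ** 2))
    # Move every neuron toward x, scaled by its neighborhood influence.
    weights += learning_rate * influence[:, None] * (x - weights)
    return bmu


rng = np.random.default_rng(0)
height, width, dim = 5, 5, 3
# Grid coordinates of each neuron, one row per neuron.
grid = np.array([(i, j) for i in range(height) for j in range(width)], dtype=float)
weights = rng.random((height * width, dim))
X = rng.random((20, dim))

# Strictly online: one weight update per example, no batching.
for x in X:
    online_som_step(weights, grid, x, learning_rate=0.3, sigma=1.0)
```

Each pass through the loop changes the weights before the next example is seen, which is why the examples cannot simply be processed in parallel.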

If you need a fast regular SOM, check out SOMPY, which is a direct port of the MATLAB SOM Toolbox.

Usage
-----

Care has been taken to make SOMBER easy to use, and function like a drop-in replacement for sklearn-like systems.

The non-recurrent SOMs take as input ``[M * N]`` arrays, where M is the number of samples and N is the number of features.

The recurrent SOMs take as input ``[M * S * N]`` arrays, where M is the number of sequences, S is the number of items per sequence, and N is the number of features.
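As a small illustration of these two layouts, the sketch below builds arrays of each shape with NumPy. The variable names and the windowing of one long sequence into fixed-length pieces are assumptions for illustration, not part of somber.

```python
import numpy as np

rng = np.random.default_rng(0)
M, S, N = 100, 8, 3

# Non-recurrent input: M samples with N features each.
X_flat = rng.random((M, N))

# Recurrent input: M sequences, each with S items of N features.
X_seq = rng.random((M, S, N))

# One long sequence of M * S items can be cut into M
# non-overlapping windows of S consecutive items.
long_seq = rng.random((M * S, N))
X_windows = long_seq.reshape(M, S, N)

print(X_flat.shape, X_seq.shape, X_windows.shape)
```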

Examples
--------

Colors
------

Color clustering is a kind of ``Hello, World`` for SOMs, because it nicely demonstrates how SOMs create a continuous mapping.

The color dataset comes from a blog post.

.. code-block:: python

  import matplotlib.pyplot as plt
  import numpy as np

  from somber import Som

  X = np.array([[0., 0., 0.],
                [0., 0., 1.],
                [0., 0., 0.5],
                [0.125, 0.529, 1.0],
                [0.33, 0.4, 0.67],
                [0.6, 0.5, 1.0],
                [0., 1., 0.],
                [1., 0., 0.],
                [0., 1., 1.],
                [1., 0., 1.],
                [1., 1., 0.],
                [1., 1., 1.],
                [.33, .33, .33],
                [.5, .5, .5],
                [.66, .66, .66]])

  color_names = ['black', 'blue', 'darkblue', 'skyblue',
                 'greyblue', 'lilac', 'green', 'red',
                 'cyan', 'violet', 'yellow', 'white',
                 'darkgrey', 'mediumgrey', 'lightgrey']

  # initialize
  s = Som((10, 10), learning_rate=0.3)

  # train
  # 10 updates with 10 epochs = 100 updates to the parameters.
  s.fit(X, num_epochs=10, updates_epoch=10)

  # predict: get the index of each best matching unit.
  predictions = s.predict(X)

  # quantization error: how well do the best matching units fit?
  quantization_error = s.quantization_error(X)

  # inversion: associate each node with the exemplar that fits best.
  inverted = s.invert_projection(X, color_names)

  # Mapping: get weights, mapped to the grid points of the SOM
  mapped = s.map_weights()

  plt.imshow(mapped)
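For intuition about what a best matching unit and a quantization error are, here is a minimal NumPy sketch that does not use somber. The helper ``best_matching_units`` and the choice of Euclidean distance are assumptions; somber's own ``predict`` and ``quantization_error`` may be defined differently (for example, they may aggregate the per-sample errors).

```python
import numpy as np


def best_matching_units(X, weights):
    """Return the index of, and distance to, the closest weight vector per sample."""
    # Pairwise Euclidean distances, shape (num_samples, num_neurons).
    dists = np.linalg.norm(X[:, None, :] - weights[None, :, :], axis=2)
    return dists.argmin(axis=1), dists.min(axis=1)


# Two hypothetical neurons: one near black, one near white.
weights = np.array([[0., 0., 0.],
                    [1., 1., 1.]])
X = np.array([[0., 0., 0.1],
              [1., 1., 1.],
              [0.9, 1., 1.]])

bmus, errors = best_matching_units(X, weights)
print(bmus)           # index of the closest neuron for each sample
print(errors.mean())  # one possible scalar summary of the fit
```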

Sequences
---------

In this example, we will show that the RecursiveSOM is able to memorize short sequences which are generated by a Markov chain.

We will also demonstrate that the RecursiveSOM can generate sequences which are consistent with the sequences on which it has been trained.
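One way to make "consistent" concrete is to check that every adjacent pair of symbols in a sequence has a nonzero transition probability under the chain. The sketch below uses the same three-state transition matrix as this example; the helper ``is_consistent`` is hypothetical and not part of somber.

```python
import numpy as np

# Transition matrix over the symbols a, b, c (rows: current, columns: next).
probas = np.array(((0.0, 0.5, 0.5),
                   (1.0, 0.0, 0.0),
                   (0.0, 0.5, 0.5)))


def is_consistent(symbols, probas, alphabet="abc"):
    """True if every adjacent symbol pair is a possible transition."""
    idx = [alphabet.index(s) for s in symbols]
    return all(probas[a, b] > 0 for a, b in zip(idx, idx[1:]))


# Sample a sequence directly from the chain, starting in state a.
rng = np.random.default_rng(0)
state, sampled = 0, []
for _ in range(1000):
    state = rng.choice(3, p=probas[state])
    sampled.append("abc"[state])

print(is_consistent(sampled, probas))  # sampled from the chain itself
print(is_consistent("acca", probas))   # contains c -> a, which has probability 0
```

The same check could be applied to a sequence of symbols produced by a trained model.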

.. code-block:: python

  import numpy as np

  from somber import RecursiveSom
  from string import ascii_lowercase

  # Dumb sequence generator.
  def seq_gen(num_to_gen, probas):
      symbols = ascii_lowercase[:probas.shape[0]]
      identities = np.eye(probas.shape[0])
      seq = []
      ids = []
      r = 0
      choices = np.arange(probas.shape[0])
      for _ in range(num_to_gen):
          # sample the next state given the current state r.
          r = np.random.choice(choices, p=probas[r])
          ids.append(symbols[r])
          seq.append(identities[r])

      return np.array(seq), ids

  # Transition probabilities.
  # after an A, we have a 50% chance of B or C
  # after B, we have a 100% chance of A
  # after C, we have a 50% chance of B or C
  # therefore, we will never expect sequential A or B, but we do expect
  # sequential C.
  probas = np.array(((0.0, 0.5, 0.5),
                     (1.0, 0.0, 0.0),
                     (0.0, 0.5, 0.5)))

  X, ids = seq_gen(10000, probas)

  # initialize
  # alpha = contribution of non-recurrent part to the activation.
  # beta = contribution of recurrent part to activation.
  # higher alpha to beta ratio
  s = RecursiveSom((10, 10),
                   learning_rate=0.3,
                   alpha=1.2,
                   beta=.9)

  # train
  # show a progressbar.
  s.fit(X, num_epochs=100, updates_epoch=10, show_progressbar=True)

  # predict: get the index of each best matching unit.
  predictions = s.predict(X)

  # quantization error: how well do the best matching units fit?
  quantization_error = s.quantization_error(X)

  # inversion: associate each node with the exemplar that fits best.
  inverted = s.invert_projection(X, ids)

  # find which sequences are mapped to which neuron.
  receptive_field = s.receptive_field(X, ids)

  # generate some data by starting from some position.
  # the position can be anything, but must have a dimensionality
  # equal to the number of weights.
  starting_pos = np.ones(s.num_neurons)
  generated_indices = s.generate(50, starting_pos)

  # turn the generated indices into a sequence of symbols.
  generated_seq = inverted[generated_indices]
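To illustrate the roles of ``alpha`` and ``beta``, the following sketch computes a RecSOM-style activation in the form described by Voegtlin (2002): each neuron has an input weight vector and a context weight vector, and its activation decays with a weighted sum of the squared input error and the squared error against the previous activation. The function ``recsom_activation`` and all variable names are assumptions for illustration; somber's internals may differ.

```python
import numpy as np


def recsom_activation(x, prev_activation, weights, context_weights, alpha, beta):
    """Activation of every neuron for input x, given the previous activation."""
    # Non-recurrent part: squared distance between x and each input weight.
    input_err = np.sum((x - weights) ** 2, axis=1)
    # Recurrent part: squared distance between the previous activation
    # of the whole map and each neuron's context weight.
    context_err = np.sum((prev_activation - context_weights) ** 2, axis=1)
    return np.exp(-(alpha * input_err + beta * context_err))


rng = np.random.default_rng(0)
num_neurons, dim = 100, 3
weights = rng.random((num_neurons, dim))
context_weights = rng.random((num_neurons, num_neurons))

# Feed a short one-hot sequence (a, b, a, c, c) through the map.
activation = np.zeros(num_neurons)
for x in np.eye(dim)[[0, 1, 0, 2, 2]]:
    activation = recsom_activation(x, activation, weights, context_weights,
                                   alpha=1.2, beta=0.9)
    bmu = int(np.argmax(activation))  # the most active neuron wins
```

In this formulation, a larger ``alpha`` relative to ``beta`` makes the current input count for more than the context when choosing the winning neuron.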

TODO
----

See issues for TODOs/enhancements. If you use SOMBER, feel free to send me suggestions!

Contributors
------------

* Stéphan Tulkens

LICENSE
-------

MIT
