https://github.com/gexarcha/dsc

Software implementation of the algorithm and experiments in the Discrete Sparse Coding paper
discrete-sparse-coding machine-learning sparse-representations

Host: GitHub
URL: https://github.com/gexarcha/dsc
Owner: gexarcha
License: afl-3.0
Created: 2018-07-11T11:22:24.000Z (almost 8 years ago)
Default Branch: master
Last Pushed: 2019-06-19T09:36:08.000Z (about 7 years ago)
Last Synced: 2025-02-08T13:43:06.800Z (over 1 year ago)
Topics: discrete-sparse-coding, machine-learning, sparse-representations
Language: Python
Homepage:
Size: 90.8 KB
Stars: 2
Watchers: 2
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE.txt

# Introduction

This package contains all the source code to reproduce the numerical experiments described in the paper [Discrete Sparse Coding](https://gexarcha.github.io/files/papers/NECO-09-16-2696R2-PDF.pdf).

## Software dependencies

 * Python (>= 2.6)
 * NumPy (reasonably recent)
 * SciPy (reasonably recent)
 * pytables (reasonably recent)
 * mpi4py (>= 1.3)
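
Before running the experiments, it can help to confirm that the listed packages are importable. The following is a minimal sketch, not part of this package: the helper name `check_dependencies` is hypothetical, and it assumes a Python 3 interpreter (it relies on `importlib.util.find_spec`), whereas the list above allows older Python versions.

```python
import importlib.util


def check_dependencies(modules=("numpy", "scipy", "tables", "mpi4py")):
    """Map each top-level module name to True if it can be located for import.

    The default names are the import names of the packages listed above
    (pytables is imported as `tables`).
    """
    return {name: importlib.util.find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    # Print one line per dependency, e.g. "numpy: found" or "numpy: MISSING".
    for name, found in check_dependencies().items():
        print("%s: %s" % (name, "found" if found else "MISSING"))
```

The check only reports whether a module can be located; it does not verify version numbers.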

## Overview 

* pulp/       - Python library/framework for MPI parallelized
              EM-based algorithms. The models' implementations
              can be found in pulp/em/camodels/.
* examples/   - Small examples for initializing and running the models

## Running

To run the barstest experiment:

```bash
  $ cd examples/barstest
  $ python dsc_run.py
```

To run the natural images experiment:

```bash
  $ cd ../natims
  $ python dsc_run.py
```

To run the spikes experiment:

```bash
  $ cd ../spikes
  $ python dsc_on_hc1_run.py
```

To run the audio experiment:

```bash
  $ cd ../audio
  $ python dsc_run_audio.py
```

Some of these experiments are too large to run on a single workstation and should be executed on a cluster. How to run them on a cluster depends largely on its configuration. Example batch files for our Slurm-based cluster configuration (GOLD cluster, Uni Oldenburg) are given in examples//batchscript.sh

## Results/Output 

The results produced by the code are stored in a 'results.h5' file under "./output/.../". The file stores the model parameters (e.g., W, pi) for each EM iteration performed. To read the results file, you can use the openFile function of the PyTables package (`tables`) in Python; in PyTables 3.0 and later this function is named `open_file`. The results files are in HDF5 format, so they can also be read by other software with HDF5 support, such as Matlab.
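
As a starting point for inspecting a results file, the sketch below lists every array stored in it together with its shape. It is an illustration under stated assumptions, not code from this package: the helper name `list_arrays` is hypothetical, it assumes PyTables 3.x (where the function is called `open_file`), and it makes no assumption about how the parameters are named or grouped inside the file.

```python
def list_arrays(path):
    """Return {node path: shape} for every array stored in an HDF5 file."""
    # PyTables is imported here so that this module still loads without it.
    import tables

    shapes = {}
    with tables.open_file(path, mode="r") as h5:
        # walk_nodes visits the whole hierarchy; classname="Array" keeps
        # only array-like leaves and skips groups.
        for node in h5.walk_nodes("/", classname="Array"):
            shapes[node._v_pathname] = node.shape
    return shapes
```

Calling `list_arrays` with the path to a 'results.h5' file returns a dictionary that can be printed to see which parameters were stored and how large they are.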

## Running on a parallel architecture

The code uses MPI-based parallelization. If you have parallel resources (i.e., a multi-core system or a compute cluster), the provided code can make use of them by evenly distributing the training data among multiple cores.
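
To illustrate what an even distribution of data points can look like, here is a small pure-Python sketch. It is not the partitioning code used in pulp/; the function name `local_slice` and the rule of giving leftover items to the lowest ranks are assumptions made for this example.

```python
def local_slice(n_items, n_ranks, rank):
    """Return the (start, stop) index range of the items assigned to one rank."""
    base, extra = divmod(n_items, n_ranks)
    # Every rank gets `base` items; the first `extra` ranks get one more.
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


if __name__ == "__main__":
    # Show how 10 data points would be split among 4 processes.
    for rank in range(4):
        print(rank, local_slice(10, 4, rank))
```

With this rule the chunk sizes differ by at most one item, and the ranges cover every data point exactly once.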

To run the same scripts as above in parallel, for example:

a) On a multi-core machine with 32 cores:

```bash
 $ mpirun -np 32 python dsc_run.py
```

b) On a cluster:

```bash
 $ mpirun --hostfile machines python dsc_run.py
```

where 'machines' is a file containing a list of suitable machines.

See your MPI documentation for the details on how to start MPI parallelized programs.
