{"id":18364489,"url":"https://github.com/yoch/sparse-som","last_synced_at":"2025-04-06T16:30:46.714Z","repository":{"id":57469689,"uuid":"98416076","full_name":"yoch/sparse-som","owner":"yoch","description":"Efficient Self-Organizing Map for Sparse Data","archived":false,"fork":false,"pushed_at":"2020-11-27T13:17:20.000Z","size":4988,"stargazers_count":19,"open_issues_count":1,"forks_count":6,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-03-22T03:11:14.883Z","etag":null,"topics":["algorithm","neural-nets","openmp","python","self-organizing-map","som","sparse-data"],"latest_commit_sha":null,"homepage":"","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/yoch.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2017-07-26T11:36:41.000Z","updated_at":"2024-10-29T22:32:23.000Z","dependencies_parsed_at":"2022-09-19T14:51:36.222Z","dependency_job_id":null,"html_url":"https://github.com/yoch/sparse-som","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yoch%2Fsparse-som","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yoch%2Fsparse-som/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yoch%2Fsparse-som/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yoch%2Fsparse-som/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/yoch","download_url":"https://codeload.github.com/yoch/sparse-som/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.
com","kind":"github","repositories_count":247512381,"owners_count":20950848,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["algorithm","neural-nets","openmp","python","self-organizing-map","som","sparse-data"],"created_at":"2024-11-05T23:10:35.997Z","updated_at":"2025-04-06T16:30:45.432Z","avatar_url":"https://github.com/yoch.png","language":"C++","readme":"# sparse-som\n\nEfficient Implementation of Self-Organizing Map for Sparse Input Data.\n\nThis program uses an algorithm designed specifically for sparse data, \nwhich is much faster than the classical one on very sparse datasets\n(its time complexity depends only on the non-zero values).\n\n#### Main features\n\n- Highly optimized for sparse data (LIBSVM format).\n- Supports both online and batch SOM algorithms.\n- Parallel batch implementation (OpenMP).\n- OS independent.\n- [Python](https://pypi.python.org/pypi?:action=display\u0026name=sparse-som) support.\n\n## Build\n\nThe simplest way to build the CLI tools from the main directory is: `cd src \u0026\u0026 make all`.\nAfter the compilation terminates, the resulting executables can be found in the `build` directory.\n\nGCC is recommended, but you can use another compiler if you want. 
C++11 support is required.\nOpenMP support is required to take advantage of parallelism (sparse-bsom).\n\n\n## Install\n\n####\n\nNo install required.\n\n#### Python\n\nTo install the python version, simply run `pip install sparse-som`.\n\n\n## Usage\n\n### CLI\n\n#### sparse-som\n\nTo use the *online* version :\n\n```\nUsage: sparse-som\n        -i infile        input file at libsvm sparse format\n        -y nrows         number of rows in the codebook\n        -x ncols         number of columns in the codebook\n        [ -d dim ]       force the dimension of codebook's vectors\n        [ -u ]           one based column indices (default is zero based)\n        [ -N ]           normalize the input vectors\n        [ -l cb ]        load codebook from binary file\n        [ -o|O cb ]      output codebook to filename (o:binary, O:text)\n        [ -c|C cl ]      output classification (c:without counts, C:with counts)\n        [ -n neig ]      neighborhood topology: 4=circ, 6=hexa, 8=rect (default 8)\n        [ -t n | -T e ]  number of training iterations or epochs (epoch = nrows)\n        [ -r r0 -R rN ]  radius at start and end (default r=(x+y)/2, R=0.5)\n        [ -a a0 -A aN ]  learning rate at start and end (default a=0.5, A=1.e-37)\n        [ -H rCool ]     radius cooling: 0=linear, 1=exponential (default 0)\n        [ -h aCool ]     alpha cooling: 0=linear, 1=exponential (default 0)\n        [ -s stdCf ]     sigma = radius * stdCf (default 0.3)\n        [ -v ]           increase verbosity level (default 0, max 2)\n```\n\n#### sparse-bsom\n\nTo use the *batch* version :\n\n```\nUsage: sparse-bsom\n        -i infile        input file at libsvm sparse format\n        -y nrows         number of rows in the codebook\n        -x ncols         number of columns in the codebook\n        [ -d dim ]       force the dimension of codebook's vectors\n        [ -u ]           one based column indices (default is zero based)\n        [ -N ]           normalize the input 
vectors\n        [ -l cb ]        load codebook from binary file\n        [ -o|O cb ]      output codebook to filename (o:binary, O:text)\n        [ -c|C cl ]      output classification (c:without counts, C:with counts)\n        [ -n neig ]      neighborhood topology: 4=circ, 6=hexa, 8=rect (default 8)\n        [ -T epoc ]      number of epochs (default 10)\n        [ -r r0 -R rN ]  radius at start and end (default r=(x+y)/2, R=0.5)\n        [ -H rCool ]     radius cooling: 0=linear, 1=exponential (default 0)\n        [ -s stdCf ]     sigma = radius * stdCf (default 0.3)\n        [ -v ]           increase verbosity level (default 0, max 2)\n```\n\nTo control the number of threads used by OpenMP, set the `OMP_NUM_THREADS` variable to the desired value, for example:\n\n```\nOMP_NUM_THREADS=4 sparse-bsom ...\n```\n\nIf the variable is undefined, one thread per CPU is used.\n\n### Python\n\n\n```python\nimport numpy as np\nfrom scipy.sparse import csr_matrix\nfrom sklearn.datasets import load_digits\nfrom sklearn.metrics import classification_report\nfrom sparse_som import *\n\n# Load some dataset\ndataset = load_digits()\n\n# convert to sparse CSR format\nX = csr_matrix(dataset.data, dtype=np.float32)\n\n# setup SOM dimensions\nH, W = 12, 15   # Network height and width\n_, N = X.shape  # Nb. 
features (vectors dimension)\n\n################ Simple usage ################\n\n# setup SOM network\nsom = Som(H, W, N, topology.HEXA) # , verbose=True\nprint(som.nrows, som.ncols, som.dim)\n\n# reinit the codebook (not needed)\nsom.codebook = np.random.rand(H, W, N).\\\n                    astype(som.codebook.dtype, copy=False)\n\n# train the SOM\nsom.train(X)\n\n# get bmus for the data\nbmus = som.bmus(X)\n\n################ Use classifier ################\n\n# setup SOM classifier (using batch SOM)\ncls = SomClassifier(BSom, H, W, N)\n\n# train SOM, do calibration and predict labels\ny = cls.fit_predict(X, labels=dataset.target)\n\nprint('Quantization Error: %2.4f' % cls.quant_error)\nprint('Topographic  Error: %2.4f' % cls.topog_error)\nprint('='*50)\nprint(classification_report(dataset.target, y))\n```\n\nOther examples are available in the `python/examples` directory.\n\n\n## Documentation\n\n### CLI\n\n#### File Format\n\nInput files must be in LIBSVM format.\n\n```\n\u003clabel\u003e \u003cindex1\u003e:\u003cvalue1\u003e \u003cindex2\u003e:\u003cvalue2\u003e ...\n.\n.\n.\n```\n\nEach line contains one instance and ends with a '\\n' character. The pair `\u003cindex\u003e:\u003cvalue\u003e` gives a feature (attribute) value: `\u003cindex\u003e` is an integer starting from 0 (or from 1 when the `-u` option is given) and `\u003cvalue\u003e` is a real number. Indices must be in ASCENDING order. Labels in the file are only used for network calibration. 
If they are unknown, just fill the first column with any numbers.\n\n\n### Python documentation\n\nThe python documentation can be found at: http://sparse-som.readthedocs.io/en/latest/\n\n\n### API\n\nThe C++ API is not public yet, because things still may change.\n\n\n## How to cite this work\n\n```\n@InProceedings{melka-mariage:ijcci17,\n  author={Melka, Josu{\\'e} and Mariage, Jean-Jacques},\n  title={Efficient Implementation of Self-Organizing Map for Sparse Input Data},\n  booktitle={Proceedings of the 9th International Joint Conference on Computational Intelligence: IJCCI},\n  volume={1},\n  month={November},\n  year={2017},\n  address={Funchal, Madeira, Portugal},\n  pages={54-63},\n  publisher={SciTePress},\n  organization={INSTICC},\n  doi={10.5220/0006499500540063},\n  isbn={978-989-758-274-5},\n  url={http://www.ai.univ-paris8.fr/~jmelka/IJCCI_2017_20.pdf}\n}\n```\n\n```\n@Inbook{Melka2019,\n    author    = \"Melka, Josu{\\'e} and Mariage, Jean-Jacques\",\n    editor    = \"Sabourin, Christophe and Merelo, Juan Julian and Madani, Kurosh and Warwick, Kevin\",\n    title     = \"Adapting Self-Organizing Map Algorithm to Sparse Data\",\n    bookTitle = \"Computational Intelligence: 9th International Joint Conference, IJCCI 2017 Funchal-Madeira, Portugal, November 1-3, 2017 Revised Selected Papers\",\n    year      = \"2019\",\n    publisher = \"Springer International Publishing\",\n    address   = \"Cham\",\n    pages     = \"139--161\",\n    isbn      = \"978-3-030-16469-0\",\n    doi       = \"10.1007/978-3-030-16469-0_8\",\n    url       = \"https://doi.org/10.1007/978-3-030-16469-0_8\"\n}\n```\n\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fyoch%2Fsparse-som","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fyoch%2Fsparse-som","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fyoch%2Fsparse-som/lists"}