{"id":13578324,"url":"https://github.com/MartinThoma/clana","last_synced_at":"2025-04-05T16:32:15.810Z","repository":{"id":24969999,"uuid":"102892750","full_name":"MartinThoma/clana","owner":"MartinThoma","description":"CLANA is a toolkit for classifier analysis.","archived":false,"fork":false,"pushed_at":"2023-02-08T02:38:56.000Z","size":962,"stargazers_count":30,"open_issues_count":13,"forks_count":4,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-03-23T19:38:38.189Z","etag":null,"topics":["analysis","classification","machine-learning","mit-license","python","python-3","python-3-5"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/MartinThoma.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2017-09-08T18:32:53.000Z","updated_at":"2025-03-11T09:07:06.000Z","dependencies_parsed_at":"2024-01-12T18:28:22.036Z","dependency_job_id":"52c237f3-be54-4300-81f1-bdf7f4b6041f","html_url":"https://github.com/MartinThoma/clana","commit_stats":{"total_commits":120,"total_committers":4,"mean_commits":30.0,"dds":0.06666666666666665,"last_synced_commit":"1e87c500875d59b5d33c58fda7b2d5abc1c87eff"},"previous_names":[],"tags_count":4,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MartinThoma%2Fclana","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MartinThoma%2Fclana/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MartinThoma%2Fclana/releases","manifests_url":"https://repos.ecosyste.ms/api/
v1/hosts/GitHub/repositories/MartinThoma%2Fclana/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/MartinThoma","download_url":"https://codeload.github.com/MartinThoma/clana/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247366581,"owners_count":20927543,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["analysis","classification","machine-learning","mit-license","python","python-3","python-3-5"],"created_at":"2024-08-01T15:01:29.458Z","updated_at":"2025-04-05T16:32:12.656Z","avatar_url":"https://github.com/MartinThoma.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"[![DOI](https://zenodo.org/badge/102892750.svg)](https://zenodo.org/badge/latestdoi/102892750)\n[![PyPI version](https://badge.fury.io/py/clana.svg)](https://badge.fury.io/py/clana)\n[![Python Support](https://img.shields.io/pypi/pyversions/clana.svg)](https://pypi.org/project/clana/)\n[![Documentation Status](https://readthedocs.org/projects/clana/badge/?version=latest)](http://clana.readthedocs.io/en/latest/?badge=latest)\n[![Build Status](https://travis-ci.org/MartinThoma/clana.svg?branch=master)](https://travis-ci.org/MartinThoma/clana)\n[![Coverage Status](https://coveralls.io/repos/github/MartinThoma/clana/badge.svg?branch=master)](https://coveralls.io/github/MartinThoma/clana?branch=master)\n[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)\n![GitHub last 
commit](https://img.shields.io/github/last-commit/MartinThoma/clana)\n![GitHub commits since latest release (by SemVer)](https://img.shields.io/github/commits-since/MartinThoma/clana/0.4.0)\n[![CodeFactor](https://www.codefactor.io/repository/github/martinthoma/clana/badge/master)](https://www.codefactor.io/repository/github/martinthoma/clana/overview/master)\n\n# clana\n\n`clana` is a library and command line application to visualize confusion matrices of\nclassifiers with lots of classes. The two key contributions of clana are\nConfusion Matrix Ordering (CMO) as explained in chapter 5 of [Analysis and Optimization of Convolutional Neural Network Architectures](https://arxiv.org/abs/1707.09725) and an optimization\nalgorithm to achieve it. The CMO technique can be applied to any multi-class\nclassifier and helps to understand which groups of classes are most similar.\n\n\n## Installation\n\nThe recommended way to install clana is:\n\n```\n$ pip install clana --user --upgrade\n```\n\nIf you want the latest version:\n\n```\n$ git clone https://github.com/MartinThoma/clana.git; cd clana\n$ pip install -e . 
--user\n```\n\n## Usage\n\n```\n$ clana --help\nUsage: clana [OPTIONS] COMMAND [ARGS]...\n\n  Clana is a toolkit for classifier analysis.\n\n  See https://arxiv.org/abs/1707.09725, Chapter 4.\n\nOptions:\n  --version  Show the version and exit.\n  --help     Show this message and exit.\n\nCommands:\n  distribution   Get the distribution of classes in a dataset.\n  get-cm         Generate a confusion matrix from predictions and ground...\n  get-cm-simple  Generate a confusion matrix.\n  visualize      Optimize and visualize a confusion matrix.\n\n```\n\nThe visualize command gives you images like this:\n\n![Confusion Matrix after Confusion Matrix Ordering of the WiLI-2018 dataset](https://raw.githubusercontent.com/MartinThoma/clana/master/docs/cm-wili-2018.png)\n\n### MNIST example\n\n```\n$ cd docs/\n$ python mnist_example.py  # creates `train-pred.csv` and `test-pred.csv`\n$ clana get-cm --gt gt-train.csv  --predictions train-pred.csv --n 10\n2019-09-14 09:47:30,655 - root - INFO - cm was written to 'cm.json'\n$ clana visualize --cm cm.json --zero_diagonal\nScore: 13475\n2019-09-14 09:49:41,593 - root - INFO - n=10\n2019-09-14 09:49:41,593 - root - INFO - ## Starting Score: 13475.00\n2019-09-14 09:49:41,594 - root - INFO - Current: 13060.00 (best: 13060.00, hot_prob_thresh=100.0000%, step=0, swap=False)\n[...]\n2019-09-14 09:49:41,606 - root - INFO - Current: 9339.00 (best: 9339.00, hot_prob_thresh=100.0000%, step=238, swap=False)\nScore: 9339\nPerm: [0, 6, 5, 8, 3, 2, 1, 7, 9, 4]\n2019-09-14 09:49:41,639 - root - INFO - Classes: [0, 6, 5, 8, 3, 2, 1, 7, 9, 4]\nAccuracy: 93.99%\n2019-09-14 09:49:41,725 - root - INFO - Save figure at '/home/moose/confusion_matrix.tmp.pdf'\n2019-09-14 09:49:41,876 - root - INFO - Found threshold for local connection: 398\n2019-09-14 09:49:41,876 - root - INFO - Found 9 clusters\n2019-09-14 09:49:41,877 - root - INFO - silhouette_score=-0.012313948323292875\n    1: [0]\n    1: [6]\n    1: [5]\n    1: [8]\n    1: [3]\n    1: [2]\n    
1: [1]\n    2: [7, 9]\n    1: [4]\n```\n\nThis gives\n\n![](https://raw.githubusercontent.com/MartinThoma/clana/master/docs/mnist_confusion_matrix.png)\n\n#### Label Manipulation\n\nPrepare a `labels.csv` which **has to have a header row**:\n\n```\n$ clana visualize --cm cm.json --zero_diagonal --labels mnist/labels.csv\n```\n\n![](https://raw.githubusercontent.com/MartinThoma/clana/master/docs/mnist_confusion_matrix_labels.png)\n\n\n### Data distribution\n\n```\n$ clana distribution --gt gt.csv --labels labels.csv [--out out/] [--long]\n```\n\nprints one line per label, e.g.\n\n```\n60% cat (56789 elements)\n20% dog (12345 elements)\n 5% mouse (1337 elements)\n 1% tux (314 elements)\n```\n\nIf `--out` is specified, it creates a horizontal bar chart. The first bar is\nthe most common class, the second bar is the second most common class, ...\n\nIt uses the short labels unless `--long` is added to the command.\n\n\n### Visualizations\n\nSee [visualizations](docs/visualizations.md)\n\n## Usage as a library\n\n```\n\u003e\u003e\u003e import numpy as np\n\u003e\u003e\u003e arr = np.array([[9, 4, 7, 3, 8, 5, 2, 8, 7, 6],\n                    [4, 9, 2, 8, 5, 8, 7, 3, 6, 7],\n                    [7, 2, 9, 1, 6, 3, 0, 8, 5, 4],\n                    [3, 8, 1, 9, 4, 7, 8, 2, 5, 6],\n                    [8, 5, 6, 4, 9, 6, 3, 7, 8, 7],\n                    [5, 8, 3, 7, 6, 9, 6, 4, 7, 8],\n                    [2, 7, 0, 8, 3, 6, 9, 1, 4, 5],\n                    [8, 3, 8, 2, 7, 4, 1, 9, 6, 5],\n                    [7, 6, 5, 5, 8, 7, 4, 6, 9, 8],\n                    [6, 7, 4, 6, 7, 8, 5, 5, 8, 9]])\n\u003e\u003e\u003e from clana.optimize import simulated_annealing\n\u003e\u003e\u003e result = simulated_annealing(arr)\n\u003e\u003e\u003e result.cm\narray([[9, 8, 7, 6, 5, 4, 3, 2, 1, 0],\n       [8, 9, 8, 7, 6, 5, 4, 3, 2, 1],\n       [7, 8, 9, 8, 7, 6, 5, 4, 3, 2],\n       [6, 7, 8, 9, 8, 7, 6, 5, 4, 3],\n       [5, 6, 7, 8, 9, 8, 7, 6, 5, 4],\n       [4, 5, 6, 7, 8, 9, 8, 7, 6, 
5],\n       [3, 4, 5, 6, 7, 8, 9, 8, 7, 6],\n       [2, 3, 4, 5, 6, 7, 8, 9, 8, 7],\n       [1, 2, 3, 4, 5, 6, 7, 8, 9, 8],\n       [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]])\n\u003e\u003e\u003e result.perm\narray([2, 7, 0, 4, 8, 9, 5, 1, 3, 6])\n```\n\nYou can visualize the `result.cm` and use the `result.perm` to get your labels\nin the same order:\n\n```\n# Just some example labels\n# ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '10']\n\u003e\u003e\u003e labels = [str(el) for el in range(11)]\n\u003e\u003e\u003e np.array(labels)[result.perm]\narray(['2', '7', '0', '4', '8', '9', '5', '1', '3', '6'], dtype='\u003cU2')\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FMartinThoma%2Fclana","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FMartinThoma%2Fclana","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FMartinThoma%2Fclana/lists"}