{"id":22582988,"url":"https://github.com/andrewwango/femda","last_synced_at":"2025-03-28T16:45:19.390Z","repository":{"id":70268811,"uuid":"350287395","full_name":"Andrewwango/femda","owner":"Andrewwango","description":"FEMDA: Robust classification with Flexible Discriminant Analysis in heterogeneous data. Flexible EM-Inspired Discriminant Analysis is a robust supervised classification algorithm that performs well in noisy and contaminated datasets.","archived":false,"fork":false,"pushed_at":"2022-09-06T13:17:59.000Z","size":17432,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-02-02T17:30:13.874Z","etag":null,"topics":["20newsgroup","classification","discriminant-analysis","em-algorithm","fashion-mnist","linear-discriminant-analysis","machine-learning","quadratic-discriminant-analysis","robust-estimation","robust-statistics"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Andrewwango.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-03-22T09:43:04.000Z","updated_at":"2024-03-13T05:36:20.000Z","dependencies_parsed_at":"2023-05-11T20:00:33.934Z","dependency_job_id":null,"html_url":"https://github.com/Andrewwango/femda","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Andrewwango%2Ffemda","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Andrewwango
%2Ffemda/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Andrewwango%2Ffemda/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Andrewwango%2Ffemda/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Andrewwango","download_url":"https://codeload.github.com/Andrewwango/femda/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246068269,"owners_count":20718501,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["20newsgroup","classification","discriminant-analysis","em-algorithm","fashion-mnist","linear-discriminant-analysis","machine-learning","quadratic-discriminant-analysis","robust-estimation","robust-statistics"],"created_at":"2024-12-08T06:13:09.670Z","updated_at":"2025-03-28T16:45:19.001Z","avatar_url":"https://github.com/Andrewwango.png","language":"Python","readme":"# FEMDA: Robust classification with Flexible Discriminant Analysis in heterogeneous data\nFlexible EM-Inspired Discriminant Analysis is a robust supervised classification algorithm that performs well in noisy and contaminated datasets.\nCode for the paper on [IEEE](https://ieeexplore.ieee.org/document/9747576) and [arXiv](https://arxiv.org/abs/2201.02967).\n\n### Authors\nAndrew Wang, University of Cambridge, Cambridge, UK\nPierre Houdouin, CentraleSupélec, Paris, France\n\n## Installation\n`pip install -i https://test.pypi.org/simple/ femda`\n\n## Get started\n```python\n\u003e\u003e\u003e from sklearn.datasets import load_iris\n\u003e\u003e\u003e from femda 
import FEMDA\n\u003e\u003e\u003e X, y = load_iris(return_X_y=True)\n\u003e\u003e\u003e clf = FEMDA()\n\u003e\u003e\u003e clf.fit(X, y)\nFEMDA()\n\u003e\u003e\u003e clf.score(X, y)\n0.9666666666666667\n```\n\nUsing a specific dataset...\n```python\n\u003e\u003e\u003e import femda.experiments.preprocessing as pre\n\u003e\u003e\u003e X_train, y_train, X_test, y_test = pre.statlog(r\"root\\datasets\\\\\")\n\u003e\u003e\u003e FEMDA().fit(X_train, y_train).score(X_test, y_test)\n...\n```\n\nUsing a `sklearn.pipeline.Pipeline`...\n\n```python\n\u003e\u003e\u003e from sklearn.datasets import load_digits\n\u003e\u003e\u003e from sklearn.pipeline import make_pipeline\n\u003e\u003e\u003e from sklearn.decomposition import PCA\n\u003e\u003e\u003e X, y = load_digits(return_X_y=True)\n\u003e\u003e\u003e pipe = make_pipeline(PCA(n_components=5), FEMDA()).fit(X, y)\n\u003e\u003e\u003e pipe.predict(X)\n...\n```\n\n## Run all experiments presented in the paper\n```python\n\u003e\u003e\u003e from femda.experiments import run_experiments\n\u003e\u003e\u003e run_experiments()\n...\n```\n\nSee [demo.ipynb](demo.ipynb) for more.\n\n## Abstract\nLinear and Quadratic Discriminant Analysis are well-known classical methods but suffer heavily from non-Gaussian class distributions and are very non-robust in contaminated datasets. In this paper, we present a new discriminant analysis style classification algorithm that directly models noise and diverse shapes which can deal with a wide range of datasets. \n\nEach data point is modelled by its own arbitrary Elliptically Symmetrical (ES) distribution and its own arbitrary scale parameter, modelling directly very heterogeneous, non-i.i.d. datasets. We show that maximum-likelihood parameter estimation and classification are simple and fast under this model.\n\nWe highlight the flexibility of the model to a wide range of Elliptically Symmetrical distribution shapes and varying levels of contamination in synthetic datasets. 
Then, we show that our algorithm outperforms other robust methods on contaminated datasets from Computer Vision and NLP.\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fandrewwango%2Ffemda","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fandrewwango%2Ffemda","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fandrewwango%2Ffemda/lists"}