{"id":35310871,"url":"https://github.com/lv416e/archetypax","last_synced_at":"2026-04-17T10:32:38.999Z","repository":{"id":281703606,"uuid":"946101457","full_name":"lv416e/archetypax","owner":"lv416e","description":"ArchetypAX: Hardware-accelerated Archetypal Analysis implementation using JAX","archived":false,"fork":false,"pushed_at":"2025-05-02T03:59:16.000Z","size":532,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-12-26T13:29:30.968Z","etag":null,"topics":["archetypal-analysis","clustering-algorithm","convex-optimization","data-science","jax","machine-learning","matrix-factorization","representation-learning","unsupervised-learning"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/lv416e.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"docs/contributing.rst","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":".github/CODEOWNERS","security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-03-10T15:59:28.000Z","updated_at":"2025-11-04T09:47:30.000Z","dependencies_parsed_at":"2025-04-14T08:42:50.712Z","dependency_job_id":null,"html_url":"https://github.com/lv416e/archetypax","commit_stats":null,"previous_names":["lv416e/archetypax"],"tags_count":5,"template":false,"template_full_name":null,"purl":"pkg:github/lv416e/archetypax","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lv416e%2Farchetypax","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lv416e%2Farchetypax/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lv416e%2Farchetypax/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lv416e%2Farchetypax/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/lv416e","download_url":"https://codeload.github.com/lv416e/archetypax/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lv416e%2Farchetypax/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31925413,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-17T10:19:20.377Z","status":"ssl_error","status_checked_at":"2026-04-17T10:19:18.682Z","response_time":62,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["archetypal-analysis","clustering-algorithm","convex-optimization","data-science","jax","machine-learning","matrix-factorization","representation-learning","unsupervised-learning"],"created_at":"2025-12-30T17:43:13.775Z","updated_at":"2026-04-17T10:32:38.991Z","avatar_url":"https://github.com/lv416e.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# ArchetypAX – Hardware-accelerated Archetypal Analysis with JAX\n\n\u003c!--\nRepository topics for better discoverability:\narchetypal-analysis, jax, machine-learning, dimensionality-reduction, convex-hull-optimization\n--\u003e\n\n\u003e Discover extreme patterns in your data with GPU/TPU-accelerated Archetypal Analysis, high-performance convex hull optimization, and interpretable matrix factorization.\n\n[![License](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)\n[![PyPI](https://img.shields.io/pypi/v/archetypax.svg?cache=no)](https://pypi.org/project/archetypax/)\n[![Tests](https://github.com/lv416e/archetypax/actions/workflows/tests.yml/badge.svg)](https://github.com/lv416e/archetypax/actions/workflows/tests.yml)\n[![Lint](https://github.com/lv416e/archetypax/actions/workflows/lint.yml/badge.svg)](https://github.com/lv416e/archetypax/actions/workflows/lint.yml)\n[![Docs](https://github.com/lv416e/archetypax/actions/workflows/docs.yml/badge.svg)](https://github.com/lv416e/archetypax/actions/workflows/docs.yml)\n[![Release](https://github.com/lv416e/archetypax/actions/workflows/release.yml/badge.svg)](https://github.com/lv416e/archetypax/actions/workflows/release.yml)\n\n## Table of Contents\n- [Overview](#overview)\n- [Features](#features)\n- [Installation](#installation)\n- [Quick Start](#quick-start)\n- [Import Patterns](#import-patterns)\n- [Documentation](#documentation)\n- [Examples](#examples)\n- [How It Works](#how-it-works)\n- [Changelog](#changelog)\n- [Citation](#citation)\n- [License](#license)\n- [Contributing](#contributing)\n- [Community](#community)\n\n## Overview\n\n`archetypax` is a high-performance implementation of Archetypal Analysis (AA) leveraging JAX for GPU acceleration.\u003cbr\u003e\n\nArchetypal Analysis is a powerful matrix factorization technique representing data points \u003cbr\u003e\nas convex combinations of extreme points (archetypes) found within the data's convex hull.\u003cbr\u003e\n\nUnlike traditional dimensionality reduction techniques like PCA, which finds abstract orthogonal components,\u003cbr\u003e\nAA discovers interpretable extremal points often corresponding to meaningful prototypes.\u003cbr\u003e\n\nThis makes it valuable for applications requiring both dimensionality reduction and **human-interpretable insights**,\u003cbr\u003e\nsuch as market segmentation, document analysis, and anomaly detection.\u003cbr\u003e\n\n## Features\n\n**Performance \u0026 Stability**\n- 🚀 GPU/TPU acceleration using JAX\n- 🧠 Smart initialization (k-means++, directional)\n- 🛠️ Numerical stability \u0026 convergence techniques\n\n**Usability \u0026 Compatibility**\n- 📊 scikit-learn compatible API (fit/transform)\n- 📋 Thorough documentation\n\n**Interpretability \u0026 Visualization**\n- 🔍 Meaningful interpretable archetypes\n- 📈 Advanced tracking \u0026 optimization trajectory monitoring\n- 🎯 Comprehensive evaluation \u0026 visualization tooling\n\n### Related Projects and Techniques\n\nArchetypAX can be used alongside or compared with these related approaches:\n\n- **PCA**: Principal Component Analysis finds orthogonal directions of maximum variance\n- **NMF**: Non-negative Matrix Factorization decomposes data into non-negative components\n- **k-means**: Clustering technique that partitions data into k clusters\n- **JAX Ecosystem**: Compatible with JAX-based machine learning frameworks like Flax\n- **scikit-learn**: Follows similar API conventions, allowing easy integration\n\n## Installation\n\nInstall with pip, uv, or poetry:\n\n```bash\n# pip\npip install archetypax\npip install git+https://github.com/lv416e/archetypax.git\n\n# uv\nuv pip install archetypax\nuv pip install git+https://github.com/lv416e/archetypax.git\n\n# poetry\npoetry add archetypax\npoetry add git+https://github.com/lv416e/archetypax.git\n```\n\nInstall optional dependencies:\n\n```bash\npip install archetypax[dev]       # Development dependencies\npip install archetypax[examples]  # Example dependencies\npip install archetypax[docs]      # Documentation dependencies\n```\n\n### Requirements\n\n| Type | Dependency | Version | Description |\n|------|------------|---------|-------------|\n| **Core** | Python | \u003e=3.10 | Required for modern language features and compatibility with JAX |\n| **Core** | JAX | \u003e=0.4.0 | Powers the hardware acceleration and automatic differentiation |\n| **Core** | NumPy | \u003e=1.20.0 | Handles core numerical operations and array manipulations |\n| **Core** | optax | \u003e=0.1.0 | JAX-based optimization framework for gradient-based updates |\n| **Core** | pandas | \u003e=1.3.0 | Data manipulation and analysis library |\n| **Core** | scikit-learn | \u003e=1.0.0 | Provides machine learning utilities and compatible interfaces |\n| **Examples** | jupyter | \u003e=1.0.0 | Interactive computing environment for notebooks |\n| **Examples** | matplotlib | \u003e=3.7.5 | Required for visualization functionality |\n| **Examples** | seaborn | \u003e=0.13.2 | Statistical data visualization |\n| **Dev** | black | ==23.7.0 | Code formatter |\n| **Dev** | mypy | \u003e=1.8.0 | Static type checker |\n| **Dev** | pytest | \u003e=7.0.0 | Testing framework |\n| **Dev** | ruff | \u003e=0.9.0 | Fast Python linter and formatter |\n\n## Quick Start\n\n```python\nimport numpy as np\nfrom archetypax import ImprovedArchetypalAnalysis as ArchetypalAnalysis\n\n# Generate sample data\nnp.random.seed(42)\nX = np.random.rand(1000, 10)\n\n# Initialize and fit the model\nmodel = ArchetypalAnalysis(n_archetypes=5)\nweights = model.fit_transform(X)\n\n# Get the archetypes\narchetypes = model.archetypes\n\n# Reconstruct the data\nX_reconstructed = model.reconstruct()\n\n# Calculate reconstruction error\nmse = np.mean((X - X_reconstructed) ** 2)\nprint(f\"Reconstruction MSE: {mse:.6f}\")\n```\n\n## Import Patterns\n\nArchetypAX supports multiple import patterns for flexibility:\n\n### Direct Class Imports (Recommended)\n\n```python\nfrom archetypax import ArchetypalAnalysis, ImprovedArchetypalAnalysis, BiarchetypalAnalysis, ArchetypeTracker\n```\n\n### Explicit Module Imports\n\n```python\nfrom archetypax.models.base import ArchetypalAnalysis\nfrom archetypax.models.biarchetypes import BiarchetypalAnalysis\nfrom archetypax.tools.evaluation import ArchetypalAnalysisEvaluator\nfrom archetypax.tools.tracker import ArchetypeTracker\n```\n\n### Module-Level Imports\n\n```python\nfrom archetypax.models import ArchetypalAnalysis\nfrom archetypax.tools import ArchetypalAnalysisVisualizer, ArchetypeTracker\n```\n\n## Changelog\n\nFor a detailed list of changes and version history, please see the [CHANGELOG.md](CHANGELOG.md) file.\n\n## Documentation\n\n### Parameters\n\n#### ArchetypalAnalysis / ImprovedArchetypalAnalysis\n\n| Parameter | Type | Default | Description |\n|-----------|------|---------|-------------|\n| `n_archetypes` | int | - | Number of archetypes to find |\n| `max_iter` | int | 500 | Maximum number of iterations |\n| `tol` | float | 1e-6 | Convergence tolerance |\n| `random_seed` | int | 42 | Random seed for initialization |\n| `learning_rate` | float | 0.001 | Learning rate for optimizer |\n| `lambda_reg` | float | 0.01 | Regularization strength for weight distribution |\n| `normalize` | bool | False | Whether to normalize features before fitting |\n| `projection_method` | str | \"cbap\" | Method for projecting archetypes (\"cbap\", \"convex_hull\", \"knn\") |\n| `projection_alpha` | float | 0.1 | Blending coefficient for boundary projection |\n| `archetype_init_method` | str | \"directional\" | Initialization strategy (\"directional\", \"kmeans++\", \"qhull\") |\n\n#### BiarchetypalAnalysis\n\n| Parameter | Type | Default | Description |\n|-----------|------|---------|-------------|\n| `n_row_archetypes` | int | - | Number of archetypes in observation space |\n| `n_col_archetypes` | int | - | Number of archetypes in feature space |\n| `max_iter` | int | 500 | Maximum number of iterations |\n| `tol` | float | 1e-6 | Convergence tolerance |\n| `random_seed` | int | 42 | Random seed for initialization |\n| `learning_rate` | float | 0.001 | Learning rate for optimizer |\n| `projection_method` | str | \"default\" | Method for projecting archetypes |\n| `lambda_reg` | float | 0.01 | Regularization strength for entropy terms |\n\n### Methods\n\n| Method | Returns | Description |\n|--------|---------|-------------|\n| `fit(X)` | model | Fit the model to the data |\n| `transform(X)` | array | Transform new data to archetype weights |\n| `fit_transform(X)` | array | Fit the model and transform the data |\n| `reconstruct(X)` | array | Reconstruct data from archetype weights |\n| `get_loss_history()` | array | Get the loss history from training |\n| `get_all_archetypes()` | tuple | Get both sets of archetypes (BiarchetypalAnalysis only) |\n| `get_all_weights()` | tuple | Get both sets of weights (BiarchetypalAnalysis only) |\n\n## Examples\n\n### Visualizing Archetypes in 2D Data\n\n```python\nimport numpy as np\nimport matplotlib.pyplot as plt\nfrom archetypax import ImprovedArchetypalAnalysis\nfrom archetypax.tools.visualization import ArchetypalAnalysisVisualizer\n\n# Generate some interesting 2D data (a triangle with points inside)\nn_samples = 500\nvertices = np.array([[0, 0], [1, 0], [0.5, 0.866]])\nweights = np.random.dirichlet(np.ones(3), size=n_samples)\nX = weights @ vertices\n\n# Fit archetypal analysis with 3 archetypes\nmodel = ImprovedArchetypalAnalysis(n_archetypes=3, archetype_init_method=\"directional\")\nmodel.fit(X)\n\n# Plot original data and archetypes\nplt.figure(figsize=(10, 8))\nArchetypalAnalysisVisualizer.plot_archetypes_2d(model, X)\nplt.title(\"Archetypal Analysis of 2D Data\")\nplt.show()\n```\n\n### Using Biarchetypal Analysis\n\n```python\nimport numpy as np\nimport matplotlib.pyplot as plt\nfrom archetypax import BiarchetypalAnalysis\nfrom archetypax.tools.visualization import ArchetypalAnalysisVisualizer\n\n# Generate synthetic data\nnp.random.seed(42)\nX = np.random.rand(500, 5)\n\n# Initialize and fit the model with row and column archetypes\nmodel = BiarchetypalAnalysis(\n    n_row_archetypes=2,   # Number of archetypes in observation space\n    n_col_archetypes=2,   # Number of archetypes in feature space\n    max_iter=500,\n    random_seed=42\n)\nmodel.fit(X)\n\n# Get both sets of archetypes\nrow_archetypes, col_archetypes = model.get_all_archetypes()\nprint(\"Row archetypes shape:\", row_archetypes.shape)\nprint(\"Column archetypes shape:\", col_archetypes.shape)\n\n# Get both sets of weights\nrow_weights, col_weights = model.get_all_weights()\nprint(\"Row weights shape:\", row_weights.shape)\nprint(\"Column weights shape:\", col_weights.shape)\n\n# Reconstruct data using biarchetypes\nX_reconstructed = model.reconstruct()\nmse = np.mean((X - X_reconstructed) ** 2)\nprint(f\"Reconstruction MSE: {mse:.6f}\")\n```\n\n### Tracking Archetype Evolution\n\n```python\nimport numpy as np\nimport matplotlib.pyplot as plt\nfrom archetypax import ArchetypeTracker\n\n# Generate sample data\nnp.random.seed(42)\nX = np.random.rand(1000, 10)\n\n# Initialize the tracker\ntracker = ArchetypeTracker(\n    n_archetypes=3,\n    max_iter=300,\n    random_seed=42\n)\n\n# Fit the model while tracking archetype movement\ntracker.fit(X)\n\n# Visualize the archetype movement trajectory\ntracker.visualize_movement()\n\n# Visualize boundary proximity over iterations\ntracker.visualize_boundary_proximity()\n```\n\n## How It Works\n\nArchetypal Analysis solves the following optimization problem:\n\nGiven a data matrix $\\mathbf{X} \\in \\mathbb{R}^{n \\times d}$ with n samples and d features, find k archetypes $\\mathbf{A} \\in \\mathbb{R}^{k \\times d}$ and weights $\\mathbf{W} \\in \\mathbb{R}^{n \\times k}$ such that:\n\n$$\n\\text{minimize} \\ \\| \\mathbf{X} - \\mathbf{W} \\cdot \\mathbf{A} \\|^2_{\\text{F}}\n$$\n\nsubject to:\n\n- $\\mathbf{W}$ is non-negative\n- Each row of $\\mathbf{W}$ sums to 1 (simplex constraint)\n- $\\mathbf{A}$ lies within the convex hull of $\\mathbf{X}$\n\nThe biarchetypal extension solves a more complex factorization:\n\n$$\n\\mathbf{X} \\approx \\mathbf{\\alpha} \\cdot \\mathbf{\\beta} \\cdot \\mathbf{X} \\cdot \\mathbf{\\theta} \\cdot \\mathbf{\\gamma}\n$$\n\nThis implementation uses JAX's automatic differentiation and optimization tools to efficiently solve these problems on GPUs. It also incorporates several advanced enhancements:\n\n1. **Strategic initialization methods** including directional initialization, k-means++ style, and convex hull approximation\n2. **Intelligent regularization techniques** to promote interpretable weight distributions\n3. **Advanced projection methods** including adaptive convex boundary approximation (CBAP)\n4. **Sophisticated numerical stability safeguards** throughout the optimization process\n5. **Comprehensive trajectory tracking** for monitoring convergence dynamics\n\n## Contributing\n\nContributions are welcome and highly encouraged! Before submitting a pull request, please review the following resources:\n\n- [Code of Conduct](CODE_OF_CONDUCT.md): Guidelines for community participation\n- [Security Policy](SECURITY.md): Vulnerability reporting and handling procedures\n\nTo contribute to the project:\n\n1. Fork the repository\n2. Create a feature branch (`git checkout -b feature/amazing-feature`)\n3. Commit your changes (`git commit -m 'Add some amazing feature'`)\n4. Push to the branch (`git push origin feature/amazing-feature`)\n5. Open a Pull Request\n\n## Community\n\n- 🐞 [Issues](https://github.com/lv416e/archetypax/issues): Report bugs and request features\n- 💬 [Discussions](https://github.com/lv416e/archetypax/discussions): Questions and general community interactions\n\n## Citation\n\nIf you use this package in your research, please cite:\n\n```\n@software{archetypax2025,\n  author = {mary},\n  title = {archetypax: GPU-accelerated Archetypal Analysis using JAX},\n  year = {2025},\n  url = {https://github.com/lv416e/archetypax}\n}\n```\n\n## License\n\nThis project is licensed under the Apache License 2.0 - see the LICENSE file for details.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flv416e%2Farchetypax","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flv416e%2Farchetypax","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flv416e%2Farchetypax/lists"}