{"id":27602624,"url":"https://github.com/plexe-ai/plexe","last_synced_at":"2026-03-06T13:08:46.310Z","repository":{"id":271270889,"uuid":"912499477","full_name":"plexe-ai/plexe","owner":"plexe-ai","description":"✨ Build a machine learning model from a prompt","archived":false,"fork":false,"pushed_at":"2026-03-03T01:13:01.000Z","size":8965,"stargazers_count":2545,"open_issues_count":13,"forks_count":255,"subscribers_count":31,"default_branch":"main","last_synced_at":"2026-03-03T01:28:05.258Z","etag":null,"topics":["agentic-ai","agents","ai","machine-learning","ml","mlengineering","mlops","multiagent"],"latest_commit_sha":null,"homepage":"https://plexe.ai","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/plexe-ai.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":"AGENTS.md","dco":null,"cla":null}},"created_at":"2025-01-05T18:34:25.000Z","updated_at":"2026-03-03T01:05:45.000Z","dependencies_parsed_at":"2026-02-09T13:01:21.328Z","dependency_job_id":null,"html_url":"https://github.com/plexe-ai/plexe","commit_stats":null,"previous_names":["plexe-ai/smolmodels","plexe-ai/plexe"],"tags_count":69,"template":false,"template_full_name":null,"purl":"pkg:github/plexe-ai/plexe","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/plexe-ai%2Fplexe","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/plexe-ai%2Fplexe/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repo
sitories/plexe-ai%2Fplexe/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/plexe-ai%2Fplexe/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/plexe-ai","download_url":"https://codeload.github.com/plexe-ai/plexe/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/plexe-ai%2Fplexe/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":30178303,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-06T12:39:21.703Z","status":"ssl_error","status_checked_at":"2026-03-06T12:36:09.819Z","response_time":250,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["agentic-ai","agents","ai","machine-learning","ml","mlengineering","mlops","multiagent"],"created_at":"2025-04-22T18:02:09.804Z","updated_at":"2026-03-06T13:08:46.284Z","avatar_url":"https://github.com/plexe-ai.png","language":"Python","readme":"\u003cdiv align=\"center\"\u003e\n\n# plexe ✨\n\n[![PyPI version](https://img.shields.io/pypi/v/plexe.svg)](https://pypi.org/project/plexe/)\n[![Discord](https://img.shields.io/discord/1300920499886358529?logo=discord\u0026logoColor=white)](https://discord.gg/SefZDepGMv)\n\n\u003cimg src=\"resources/backed-by-yc.png\" alt=\"backed-by-yc\" width=\"20%\"\u003e\n\n\nBuild machine learning models using 
natural language.\n\n[Quickstart](#1-quickstart) |\n[Features](#2-features) |\n[Installation](#3-installation) |\n[Documentation](#4-documentation)\n\n\u003cbr\u003e\n\n**plexe** lets you create machine learning models by describing them in plain language. Simply explain what you want,\nprovide a dataset, and the AI-powered system builds a fully functional model through an automated agentic approach.\nAlso available as a [managed cloud service](https://plexe.ai).\n\n\u003cbr\u003e\n\nWatch the demo on YouTube:\n[![Building an ML model with Plexe](resources/demo-thumbnail.png)](https://www.youtube.com/watch?v=bUwCSglhcXY)\n\u003c/div\u003e\n\n## 1. Quickstart\n\n### Installation\n```bash\npip install plexe\nexport OPENAI_API_KEY=\u003cyour-key\u003e\nexport ANTHROPIC_API_KEY=\u003cyour-key\u003e\n```\n\n### Using plexe\n\nProvide a tabular dataset (Parquet, CSV, ORC, or Avro) and a natural language intent:\n\n```bash\npython -m plexe.main \\\n    --train-dataset-uri data.parquet \\\n    --intent \"predict whether a passenger was transported\" \\\n    --max-iterations 5\n```\n\n```python\nfrom plexe.main import main\nfrom pathlib import Path\n\nbest_solution, metrics, report = main(\n    intent=\"predict whether a passenger was transported\",\n    data_refs=[\"train.parquet\"],\n    max_iterations=5,\n    work_dir=Path(\"./workdir\"),\n)\nprint(f\"Performance: {best_solution.performance:.4f}\")\n```\n\n## 2. Features\n\n### 2.1. 🤖 Multi-Agent Architecture\nThe system uses 14 specialized AI agents across a 6-phase workflow to:\n- Analyze your data and identify the ML task\n- Select the right evaluation metric\n- Search for the best model through hypothesis-driven iteration\n- Evaluate model performance and robustness\n- Package the model for deployment\n\n### 2.2. 🎯 Automated Model Building\nBuild complete models with a single call. 
Plexe supports **XGBoost**, **CatBoost**, **LightGBM**, **Keras**, and **PyTorch** for tabular data:\n\n```python\nbest_solution, metrics, report = main(\n    intent=\"predict house prices based on property features\",\n    data_refs=[\"housing.parquet\"],\n    max_iterations=10,                    # Search iterations\n    allowed_model_types=[\"xgboost\"],      # Or let plexe choose\n    enable_final_evaluation=True,         # Evaluate on held-out test set\n)\n```\n\nRun `python -m plexe.main --help` for all CLI options.\n\nThe output is a self-contained model package at `work_dir/model/` (also archived as `model.tar.gz`).\nThe package has no dependency on `plexe` — build the model with plexe, deploy it anywhere:\n\n```\nmodel/\n├── artifacts/          # Trained model + feature pipeline (pickle)\n├── src/                # Inference predictor, pipeline code, training template\n├── schemas/            # Input/output JSON schemas\n├── config/             # Hyperparameters\n├── evaluation/         # Metrics and detailed analysis reports\n├── model.yaml          # Model metadata\n└── README.md           # Usage instructions with example code\n```\n\n### 2.3. 🐳 Batteries-Included Docker Images\nRun plexe with everything pre-configured — PySpark, Java, and all dependencies included.\nA `Makefile` is provided for common workflows:\n\n```bash\nmake build          # Build the Docker image\nmake test-quick     # Fast sanity check (~1 iteration)\nmake run-titanic    # Run on Spaceship Titanic dataset\n```\n\nOr run directly:\n\n```bash\ndocker run --rm \\\n    -e OPENAI_API_KEY=$OPENAI_API_KEY \\\n    -e ANTHROPIC_API_KEY=$ANTHROPIC_API_KEY \\\n    -v $(pwd)/data:/data -v $(pwd)/workdir:/workdir \\\n    plexe:py3.12 python -m plexe.main \\\n        --train-dataset-uri /data/dataset.parquet \\\n        --intent \"predict customer churn\" \\\n        --work-dir /workdir \\\n        --spark-mode local\n```\n\nA `config.yaml` in the project root is automatically mounted. 
A Databricks Connect image\nis also available: `docker build --target databricks .`\n\n### 2.4. ⚙️ YAML Configuration\nCustomize LLM routing, search parameters, Spark settings, and more via a config file:\n\n```yaml\n# config.yaml\nmax_search_iterations: 5\nallowed_model_types: [xgboost, catboost]\nspark_driver_memory: \"4g\"\nhypothesiser_llm: \"openai/gpt-5-mini\"\nfeature_processor_llm: \"anthropic/claude-sonnet-4-5-20250929\"\n```\n\n```bash\nCONFIG_FILE=config.yaml python -m plexe.main ...\n```\n\nSee [`config.yaml.template`](config.yaml.template) for all available options.\n\n### 2.5. 🌐 Multi-Provider LLM Support\nPlexe uses LLMs via [LiteLLM](https://docs.litellm.ai/docs/providers), so you can use any supported provider:\n\n```yaml\n# Route different agents to different providers\nhypothesiser_llm: \"openai/gpt-5-mini\"\nfeature_processor_llm: \"anthropic/claude-sonnet-4-5-20250929\"\nmodel_definer_llm: \"ollama/llama3\"\n```\n\n\u003e [!NOTE]\n\u003e Plexe *should* work with most LiteLLM providers, but we actively test only with `openai/*` and `anthropic/*`\n\u003e models. If you encounter issues with other providers, please let us know.\n\n### 2.6. 📊 Experiment Dashboard\nVisualize experiment results, search trees, and evaluation reports with the built-in Streamlit dashboard:\n\n```bash\npython -m plexe.viz --work-dir ./workdir\n```\n\n### 2.7. 🔌 Extensibility\nConnect plexe to custom storage, tracking, and deployment infrastructure via the `WorkflowIntegration` interface:\n\n```python\nmain(intent=\"...\", data_refs=[...], integration=MyCustomIntegration())\n```\n\nSee [`plexe/integrations/base.py`](plexe/integrations/base.py) for the full interface.\n\n## 3. Installation\n\n### 3.1. 
Installation Options\n```bash\npip install plexe                    # Core (XGBoost, Keras, scikit-learn)\n```\n\nYou can add optional dependencies either by framework or by task grouping:\n- Framework extras: `catboost`, `lightgbm`, `pytorch`\n- Task extras: `tabular` (CatBoost + LightGBM), `vision` (PyTorch)\n- Platform extras: `pyspark`, `aws`\n\nExamples:\n```bash\npip install \"plexe[tabular,pyspark]\"   # tabular stack + local PySpark\npip install \"plexe[pytorch,aws]\"       # explicit framework + S3 support\n```\n\nRequires Python \u003e= 3.10, \u003c 3.13.\n\n### 3.2. API Keys\n```bash\nexport OPENAI_API_KEY=\u003cyour-key\u003e\nexport ANTHROPIC_API_KEY=\u003cyour-key\u003e\n```\nSee [LiteLLM providers](https://docs.litellm.ai/docs/providers) for all supported providers.\n\n## 4. Documentation\nFor full documentation, visit [docs.plexe.ai](https://docs.plexe.ai).\n\n## 5. Contributing\nSee [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines. Join our [Discord](https://discord.gg/SefZDepGMv) to connect with the team.\n\n## 6. License\n[Apache-2.0 License](LICENSE)\n\n## 7. Citation\nIf you use Plexe in your research, please cite it as follows:\n\n```bibtex\n@software{plexe2025,\n  author = {De Bernardi, Marcello and Dubey, Vaibhav},\n  title = {Plexe: Build machine learning models using natural language.},\n  year = {2025},\n  publisher = {GitHub},\n  howpublished = {\\url{https://github.com/plexe-ai/plexe}},\n}\n```\n","funding_links":[],"categories":["Python","AI Agent Frameworks \u0026 SDKs"],"sub_categories":["Multi-Agent Collaboration Systems"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fplexe-ai%2Fplexe","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fplexe-ai%2Fplexe","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fplexe-ai%2Fplexe/lists"}