https://github.com/plexe-ai/plexe
✨ Build a machine learning model from a prompt
- Host: GitHub
- URL: https://github.com/plexe-ai/plexe
- Owner: plexe-ai
- License: apache-2.0
- Created: 2025-01-05T18:34:25.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2026-02-05T15:47:53.000Z (7 days ago)
- Last Synced: 2026-02-06T01:50:16.481Z (6 days ago)
- Topics: agentic-ai, agents, ai, machine-learning, ml, mlengineering, mlops, multiagent
- Language: Python
- Homepage: https://plexe.ai
- Size: 7.39 MB
- Stars: 2,471
- Watchers: 30
- Forks: 240
- Open Issues: 20
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
- Security: SECURITY.md
Awesome Lists containing this project
- awesome-ai-agents - plexe-ai/plexe - Plexe is a machine learning framework that enables users to build and train models using natural language prompts and an AI-powered multi-agent system. (AI Agent Frameworks & SDKs / Multi-Agent Collaboration Systems)
README
# plexe ✨
[PyPI](https://pypi.org/project/plexe/) |
[Discord](https://discord.gg/SefZDepGMv)

Build machine learning models using natural language.
[Quickstart](#1-quickstart) |
[Features](#2-features) |
[Installation](#3-installation) |
[Documentation](#4-documentation)
**plexe** lets you create machine learning models by describing them in plain language. Simply explain what you want,
provide a dataset, and the AI-powered system builds a fully functional model through an automated agentic approach.
Also available as a [managed cloud service](https://plexe.ai).
[Watch the demo on YouTube](https://www.youtube.com/watch?v=bUwCSglhcXY)
## 1. Quickstart
### Installation
```bash
pip install plexe
export OPENAI_API_KEY="<your-openai-api-key>"
export ANTHROPIC_API_KEY="<your-anthropic-api-key>"
```
### Using plexe
Provide a tabular dataset (Parquet, CSV, ORC, or Avro) and a natural language intent:
```bash
python -m plexe.main \
    --train-dataset-uri data.parquet \
    --intent "predict whether a passenger was transported" \
    --max-iterations 5
```
```python
from pathlib import Path

from plexe.main import main

best_solution, metrics, report = main(
    intent="predict whether a passenger was transported",
    data_refs=["train.parquet"],
    max_iterations=5,
    work_dir=Path("./workdir"),
)
print(f"Performance: {best_solution.performance:.4f}")
```
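Before starting a run, it can help to confirm that the dataset is in one of the formats listed above. The following is a minimal sketch, not part of plexe: the helper name `is_supported_dataset` and the exact extension spellings are assumptions made for illustration.

```python
from pathlib import Path

# Formats named in the Quickstart text; the extension spellings are an assumption.
SUPPORTED_EXTENSIONS = {".parquet", ".csv", ".orc", ".avro"}


def is_supported_dataset(path: str) -> bool:
    """Return True if the file extension matches one of the listed formats."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


if __name__ == "__main__":
    for candidate in ["data.parquet", "train.CSV", "notes.txt"]:
        print(candidate, is_supported_dataset(candidate))
```

This check only looks at the file name. A directory of Parquet part files, for instance, has no extension and would be reported as unsupported even if plexe can read it.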
## 2. Features
### 2.1. 🤖 Multi-Agent Architecture
The system uses 14 specialized AI agents across a 6-phase workflow to:
- Analyze your data and identify the ML task
- Select the right evaluation metric
- Search for the best model through hypothesis-driven iteration
- Evaluate model performance and robustness
- Package the model for deployment
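To make the idea of iterative search concrete, here is a toy sketch of a bounded best-of-N loop. It is an illustration only and does not reflect plexe's actual agents or search strategy: `search_best`, the hypothesis dictionaries, and the scoring function are all hypothetical.

```python
from typing import Any, Callable, Iterable, Optional, Tuple


def search_best(
    hypotheses: Iterable[Any],
    evaluate: Callable[[Any], float],
    max_iterations: int,
) -> Tuple[Optional[Any], float]:
    """Evaluate at most `max_iterations` hypotheses and keep the best-scoring one."""
    best, best_score = None, float("-inf")
    for iteration, hypothesis in enumerate(hypotheses):
        if iteration >= max_iterations:
            break  # iteration budget exhausted
        score = evaluate(hypothesis)
        if score > best_score:
            best, best_score = hypothesis, score
    return best, best_score


if __name__ == "__main__":
    # Hypothetical candidates: each one is just a hyperparameter setting.
    candidates = [{"depth": 2}, {"depth": 4}, {"depth": 8}]
    # Hypothetical metric: higher is better, with depth 4 as the optimum.
    print(search_best(candidates, lambda c: -abs(c["depth"] - 4), max_iterations=2))
```

In a real system each hypothesis would presumably be informed by earlier results rather than drawn from a fixed list; the sketch only shows how an iteration budget bounds the search.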
### 2.2. 🎯 Automated Model Building
Build complete models with a single call. Plexe supports **XGBoost**, **CatBoost**, and **Keras** for tabular data:
```python
from plexe.main import main

best_solution, metrics, report = main(
    intent="predict house prices based on property features",
    data_refs=["housing.parquet"],
    max_iterations=10,                # Search iterations
    allowed_model_types=["xgboost"],  # Or let plexe choose
    enable_final_evaluation=True,     # Evaluate on held-out test set
)
```
Run `python -m plexe.main --help` for all CLI options.
The output is a self-contained model package at `work_dir/model/` (also archived as `model.tar.gz`).
The package has no dependency on `plexe` — build the model with plexe, deploy it anywhere:
```
model/
├── artifacts/ # Trained model + feature pipeline (pickle)
├── src/ # Inference predictor, pipeline code, training template
├── schemas/ # Input/output JSON schemas
├── config/ # Hyperparameters
├── evaluation/ # Metrics and detailed analysis reports
├── model.yaml # Model metadata
└── README.md # Usage instructions with example code
```
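Because the package is a plain archive, it can be inspected with the standard library alone. The sketch below lists the top-level entries of `model.tar.gz`. It assumes the archive contains a single root folder (such as `model/`) that mirrors the tree above; the helper name `list_package_top_level` is hypothetical.

```python
import tarfile
from pathlib import Path


def list_package_top_level(archive_path: str) -> list[str]:
    """Return the sorted direct children of the archive's root folder."""
    entries = set()
    with tarfile.open(archive_path, "r:gz") as tar:
        for member in tar.getnames():
            parts = Path(member).parts
            # Skip the root folder itself (e.g. "model"); keep its direct children.
            if len(parts) >= 2:
                entries.add(parts[1])
    return sorted(entries)
```

For example, `list_package_top_level("workdir/model.tar.gz")` would be expected to include names such as `artifacts`, `src`, and `model.yaml` if the archive follows the layout shown. The exact archive location under your work directory may differ.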
### 2.3. 🐳 Batteries-Included Docker Images
Run plexe with everything pre-configured — PySpark, Java, and all dependencies included.
A `Makefile` is provided for common workflows:
```bash
make build # Build the Docker image
make test-quick # Fast sanity check (~1 iteration)
make run-titanic # Run on Spaceship Titanic dataset
```
Or run directly:
```bash
docker run --rm \
    -e OPENAI_API_KEY="$OPENAI_API_KEY" \
    -e ANTHROPIC_API_KEY="$ANTHROPIC_API_KEY" \
    -v "$(pwd)/data:/data" -v "$(pwd)/workdir:/workdir" \
    plexe:py3.12 python -m plexe.main \
        --train-dataset-uri /data/dataset.parquet \
        --intent "predict customer churn" \
        --work-dir /workdir \
        --spark-mode local
```
A `config.yaml` in the project root is automatically mounted. A Databricks Connect image
is also available: `docker build --target databricks .`
### 2.4. ⚙️ YAML Configuration
Customize LLM routing, search parameters, Spark settings, and more via a config file:
```yaml
# config.yaml
max_search_iterations: 5
allowed_model_types: [xgboost, catboost]
spark_driver_memory: "4g"
hypothesiser_llm: "openai/gpt-5-mini"
feature_processor_llm: "anthropic/claude-sonnet-4-5-20250929"
```
```bash
CONFIG_FILE=config.yaml python -m plexe.main ...
```
See [`config.yaml.template`](config.yaml.template) for all available options.
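When launching plexe from another Python process, the `CONFIG_FILE` variable can be passed through the child environment instead of the shell. This is a minimal sketch that only builds the command and environment; the helper name `build_plexe_invocation` is hypothetical, and the flags are the ones shown in the Quickstart.

```python
import os
import sys


def build_plexe_invocation(config_file: str, dataset_uri: str, intent: str):
    """Return (command, environment) for launching the CLI with a config file."""
    env = dict(os.environ)
    env["CONFIG_FILE"] = config_file  # same variable as in the shell example
    command = [
        sys.executable, "-m", "plexe.main",
        "--train-dataset-uri", dataset_uri,
        "--intent", intent,
    ]
    return command, env
```

The result can be handed to `subprocess.run(command, env=env, check=True)`. Building the argument list explicitly avoids shell quoting problems when the intent contains spaces or quotes.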
### 2.5. 🌐 Multi-Provider LLM Support
Plexe uses LLMs via [LiteLLM](https://docs.litellm.ai/docs/providers), so you can use any supported provider:
```yaml
# Route different agents to different providers
hypothesiser_llm: "openai/gpt-5-mini"
feature_processor_llm: "anthropic/claude-sonnet-4-5-20250929"
model_definer_llm: "ollama/llama3"
```
> [!NOTE]
> Plexe *should* work with most LiteLLM providers, but we actively test only with `openai/*` and `anthropic/*`
> models. If you encounter issues with other providers, please let us know.
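
Given that only `openai/*` and `anthropic/*` models are actively tested, a small check can flag agent routes that point elsewhere. This sketch is not part of plexe: it assumes model identifiers follow the `provider/model` form used in the examples, and the helper names are hypothetical.

```python
# Providers the note above describes as actively tested.
ACTIVELY_TESTED_PROVIDERS = {"openai", "anthropic"}


def split_model_id(model_id: str) -> tuple[str, str]:
    """Split 'provider/model' into its two parts, rejecting malformed values."""
    provider, separator, model = model_id.partition("/")
    if not separator or not provider or not model:
        raise ValueError(f"expected 'provider/model', got {model_id!r}")
    return provider, model


def untested_routes(routing: dict[str, str]) -> dict[str, str]:
    """Return the routes whose provider is outside the actively tested set."""
    return {
        agent: model_id
        for agent, model_id in routing.items()
        if split_model_id(model_id)[0] not in ACTIVELY_TESTED_PROVIDERS
    }
```

With the routing shown above, only the `model_definer_llm` entry (`ollama/llama3`) would be flagged.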
### 2.6. 📊 Experiment Dashboard
Visualize experiment results, search trees, and evaluation reports with the built-in Streamlit dashboard:
```bash
python -m plexe.viz --work-dir ./workdir
```
### 2.7. 🔌 Extensibility
Connect plexe to custom storage, tracking, and deployment infrastructure via the `WorkflowIntegration` interface:
```python
# MyCustomIntegration is your own implementation of the WorkflowIntegration interface
main(intent="...", data_refs=[...], integration=MyCustomIntegration())
```
See [`plexe/integrations/base.py`](plexe/integrations/base.py) for the full interface.
## 3. Installation
### 3.1. Installation Options
```bash
pip install plexe               # Core (XGBoost, CatBoost, Keras, scikit-learn)
pip install "plexe[pyspark]"    # + Local PySpark execution
pip install "plexe[aws]"        # + S3 storage support (boto3)
```
Requires Python >= 3.10, < 3.13.
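The version constraint can be checked before installing. This is a small sketch using only the standard library; the function name `python_is_supported` is hypothetical.

```python
import sys

MIN_VERSION = (3, 10)  # inclusive lower bound
MAX_VERSION = (3, 13)  # exclusive upper bound


def python_is_supported(version_info=sys.version_info) -> bool:
    """Return True if the interpreter version satisfies >= 3.10, < 3.13."""
    return MIN_VERSION <= tuple(version_info[:2]) < MAX_VERSION


if __name__ == "__main__":
    print("supported" if python_is_supported() else "unsupported")
```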
### 3.2. API Keys
```bash
export OPENAI_API_KEY="<your-openai-api-key>"
export ANTHROPIC_API_KEY="<your-anthropic-api-key>"
```
See [LiteLLM providers](https://docs.litellm.ai/docs/providers) for all supported providers.
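A quick pre-flight check can confirm that the keys are set before a run starts. This is a sketch with a hypothetical helper, `missing_api_keys`; which keys are actually needed depends on the providers your configuration routes to, so treating both as required is an assumption.

```python
import os


def missing_api_keys(
    required=("OPENAI_API_KEY", "ANTHROPIC_API_KEY"),
    environ=None,
) -> list[str]:
    """Return the names of required variables that are unset or blank."""
    environ = os.environ if environ is None else environ
    return [name for name in required if not environ.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_api_keys()
    if missing:
        print("Missing API keys:", ", ".join(missing))
```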
## 4. Documentation
For full documentation, visit [docs.plexe.ai](https://docs.plexe.ai).
## 5. Contributing
See [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines. Join our [Discord](https://discord.gg/SefZDepGMv) to connect with the team.
## 6. License
[Apache-2.0 License](LICENSE)
## 7. Citation
If you use Plexe in your research, please cite it as follows:
```bibtex
@software{plexe2025,
  author = {De Bernardi, Marcello and Dubey, Vaibhav},
  title = {Plexe: Build machine learning models using natural language.},
  year = {2025},
  publisher = {GitHub},
  howpublished = {\url{https://github.com/plexe-ai/plexe}},
}
```