https://github.com/finite-sample/pareto-deployment

Pareto Deployments of ML Models
https://github.com/finite-sample/pareto-deployment

Last synced: 11 months ago
JSON representation

Pareto Deployments of ML Models

Host: GitHub
URL: https://github.com/finite-sample/pareto-deployment
Owner: finite-sample
Created: 2025-06-29T02:58:32.000Z (12 months ago)
Default Branch: main
Last Pushed: 2025-06-29T03:25:49.000Z (12 months ago)
Last Synced: 2025-06-29T04:18:25.213Z (12 months ago)
Language: Jupyter Notebook
Size: 10.7 KB
Stars: 1
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

Awesome Lists containing this project

README

## Pareto Deployments in Machine Learning

A common problem in machine learning deployment is regression: updating a model improves some cases but introduces new errors on cases previously handled correctly. The usual practice—replacing the old model with a new one if it improves average accuracy—accepts these trade-offs as a fact of life. But engineering, at its best, is about making progress without backsliding.

**Can we do better?** In this experiment, we compare several strategies for deploying a new model while explicitly seeking Pareto improvements: gains without regressions.

---

### Methods

We evaluated four approaches on the Adult income dataset:

1. **Model A only:** The original logistic regression model, applied to all cases.
2. **Model B only:** A new neural network, applied universally.
3. **k-Nearest Neighbors (kNN) Routing:** For each example, choose A or B based on which model performs better among similar cases in feature space.
4. **Meta-classifier Routing:** Use a classifier trained on the validation set to predict, from features, whether A or B is more likely to be correct for each example.

For each deployment, we report overall accuracy and the number of regressions compared to Model A—that is, the count of test cases where Model A was correct but the new strategy is wrong.

---

### Results

| Strategy | Accuracy | Regressions vs. A |
|:---------|:---------|:------------------|
| Model A | 0.8459 | 0 |
| Model B | 0.8396 | 368 |
| kNN | 0.8501 | 15 |
| Meta | 0.8802 | 0 |

- **Model B** improves on average in some regions but regresses on 368 examples that Model A got right.
- **kNN routing** achieves a modest accuracy gain while introducing only 15 regressions.
- **Meta-classifier routing** attains the highest accuracy (0.8802) and makes no new errors compared to Model A—a true Pareto improvement.

---

### Discussion

The results demonstrate that simple routing methods can achieve global improvement without backsliding. The kNN approach provides a modest improvement with very limited regression. The meta-classifier, by learning from the feature space where each model excels, realizes substantial gains while preserving all of A’s strengths.

This pattern generalizes: Pareto deployment is not just possible in theory, but practical in real data. It reflects the principle that progress in engineering should be cumulative—building on what works, not discarding it.

However, these results also carry a note of caution. Achieving strict Pareto improvements depends on having rich enough data to reliably learn the routing function, and the margin for improvement shrinks as models become more similar or as noise increases. The method is most effective when the models have complementary strengths and enough examples are available to learn when to trust each.

---

### Conclusion

Averages can mislead: a model that’s “better on average” may still introduce costly new errors. By making use of routing—whether by local neighborhood or meta-classifier—we can deploy models that genuinely move forward, not sideways. This is not just an aspiration for safety-critical applications, but a practical tool for any deployment where consistency matters.

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/finite-sample/pareto-deployment

Awesome Lists containing this project

README