https://github.com/namkoong-lab/dro

A package of distributionally robust optimization (DRO) methods. Implemented via cvxpy and PyTorch

[![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg?color=g&style=plastic)](https://opensource.org/licenses/MIT)
[![Downloads](https://static.pepy.tech/personalized-badge/dro?period=total&units=abbreviation&left_color=grey&right_color=blue&left_text=Downloads)](https://pepy.tech/project/dro)
[![PyPI: v](https://img.shields.io/pypi/v/dro.svg)](https://pypi.python.org/pypi/dro/)
[![codecov](https://codecov.io/github/namkoong-lab/dro/graph/badge.svg?token=QFYLOVYZXF)](https://codecov.io/github/namkoong-lab/dro)

# `dro`: A Python Package for Distributionally Robust Optimization in Machine Learning
> Jiashuo Liu, Tianyu Wang, Henry Lam, Hongseok Namkoong, Jose Blanchet
> † equal contributions (α-β order)

`dro` is a Python package that implements typical DRO methods on linear losses (SVM, logistic regression, and linear regression) for supervised learning tasks. It is built on the convex optimization library `cvxpy`. The package supports different distance metrics $d(\cdot,\cdot)$ as well as different base models (e.g., linear regression, logistic regression, SVM, neural networks). It also integrates synthetic data generating mechanisms from recent research papers.

Unless otherwise specified, our DRO model solves the following optimization problem:
$$\min_{\theta} \max_{P: P \in U} E_{(X,Y) \sim P}[\ell(\theta;(X, Y))],$$
where $U$ is the so-called ambiguity set, typically of the form $U = \{P: d(P, \hat P) \leq \epsilon\}$, and $\hat P := \frac{1}{n}\sum_{i = 1}^n \delta_{(X_i, Y_i)}$ is the empirical distribution of the training samples $\{(X_i, Y_i)\}_{i = 1}^n$. The radius $\epsilon$ is a hyperparameter.
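To make the inner maximization concrete, the following sketch evaluates the worst-case expected loss for a fixed $\theta$ when $d$ is the KL divergence. It uses the standard dual form $\min_{\lambda > 0} \lambda\epsilon + \lambda \log E_{\hat P}[\exp(\ell/\lambda)]$. This is a plain-Python illustration, not the package's implementation; the function name `kl_worst_case_loss`, the search bounds, and the sample losses are assumptions made for the example.

```python
import math


def kl_worst_case_loss(losses, eps, lam_lo=1e-3, lam_hi=1e3, iters=200):
    """Approximate sup of E_P[loss] over {P : KL(P || P_hat) <= eps}.

    Uses the dual  min_{lam > 0}  lam * eps + lam * log(mean(exp(loss / lam))),
    minimized by ternary search over log(lam) within [lam_lo, lam_hi]
    (the bounds are illustrative assumptions).
    """
    n = len(losses)
    top = max(losses)

    def dual(lam):
        # Shift by the largest loss so that exp() cannot overflow.
        lse = math.log(sum(math.exp((l - top) / lam) for l in losses) / n)
        return lam * eps + top + lam * lse

    # The dual is convex in lam, hence unimodal in log(lam).
    lo, hi = math.log(lam_lo), math.log(lam_hi)
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if dual(math.exp(m1)) < dual(math.exp(m2)):
            hi = m2
        else:
            lo = m1
    return dual(math.exp((lo + hi) / 2))


# Hypothetical per-sample losses for some fixed theta.
losses = [0.1, 0.5, 2.0, 0.3]
print("empirical mean :", sum(losses) / len(losses))
print("worst case     :", kl_worst_case_loss(losses, eps=0.1))
```

The worst-case value lies between the empirical mean and the largest single loss, and it grows with $\epsilon$, which is why $\epsilon$ controls how conservative the fitted model is.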

Please refer to our paper for more details.
```bibtex
@misc{liu2025dropythonlibrary,
title={DRO: A Python Library for Distributionally Robust Optimization in Machine Learning},
author={Jiashuo Liu and Tianyu Wang and Henry Lam and Hongseok Namkoong and Jose Blanchet},
year={2025},
eprint={2505.23565},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/2505.23565},
}
```

## Installation

### (1) Install the `dro` package
To install the `dro` package, run:
```
pip install dro
```
This also installs all required dependencies.

### (2) Optional: Prepare `MOSEK` license
For models that are solved exactly, the package relies on `cvxpy` and calls `MOSEK` by default, which handles all the optimization problems arising from the reformulations efficiently. `MOSEK` requires a license file. The steps are as follows:

* Request a license from the official MOSEK website; the license file `mosek.lic` will be emailed to you.
* Put the license in a `mosek` folder in your home directory:

```
cd
mkdir mosek
mv /path_to_license/mosek.lic mosek/
```

Alternatively, you can choose one of the open-source solvers available in `cvxpy`, such as `ECOS` or `SCS` (see the `cvxpy` documentation for more details). For any DRO model, this can be done during initialization:

```
model = XXDRO(..., solver = 'ECOS')
```

or by updating the `solver` attribute after initialization:

```
model.solver = 'ECOS'
```

These open-source solvers can also solve all the optimization problems implemented in the package.
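If the same script has to run on machines with and without a `MOSEK` license, the choice between the two options above can be automated. The sketch below looks for a license file in the location used in the steps above (`~/mosek/mosek.lic`) and in the path given by MOSEK's `MOSEKLM_LICENSE_FILE` environment variable, then suggests a solver name. The helper names `find_mosek_license` and `pick_solver` are hypothetical and not part of `dro`.

```python
import os
from pathlib import Path


def find_mosek_license(env=None, home=None):
    """Return the path of the first MOSEK license file found, or None."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)

    candidates = []
    # An explicit license path set through MOSEK's environment variable.
    if env.get("MOSEKLM_LICENSE_FILE"):
        candidates.append(Path(env["MOSEKLM_LICENSE_FILE"]))
    # The home-directory location used in the installation steps.
    candidates.append(home / "mosek" / "mosek.lic")

    for path in candidates:
        if path.is_file():
            return path
    return None


def pick_solver(env=None, home=None):
    """Suggest 'MOSEK' when a license file is found, otherwise 'ECOS'."""
    return "MOSEK" if find_mosek_license(env, home) else "ECOS"


print(pick_solver())
```

The returned name can then be passed as `solver = pick_solver()` when constructing a model, or assigned to `model.solver` afterwards.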

## Quick Start
A simple usage example is as follows:
```python
from dro.data.dataloader_classification import classification_basic
from dro.data.draw_utils import draw_classification
from dro.linear_model.chi2_dro import Chi2DRO

# Data generating
X, y = classification_basic(d = 2, num_samples = 100, radius = 2, visualize = True)

# Chi-square DRO
clf_model = Chi2DRO(input_dim=2, model_type = 'logistic')
clf_model.update({'eps': 0.1})
print(clf_model.fit(X, y))
```

For more examples, please refer to our examples.

## Documentation & APIs

As of the latest version, `v0.3.1`, `dro` supports:

### (1) Synthetic data generation


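| Python Module | Function Name | Description |
| --- | --- | --- |
| `dro.data.dataloader_classification` | `classification_basic` | Basic classification task |
| `dro.data.dataloader_classification` | `classification_DN21` | Following Section 3.1.1 of [1] |
| `dro.data.dataloader_classification` | `classification_SNVD20` | Following Section 5.1 of [2] |
| `dro.data.dataloader_classification` | `classification_LWLC` | Following Section 4.1 (Classification) of [3] |
| `dro.data.dataloader_regression` | `regression_basic` | Basic regression task |
| `dro.data.dataloader_regression` | `regression_DN20_1` | Following Section 3.1.2 of [1] |
| `dro.data.dataloader_regression` | `regression_DN20_2` | Following Section 3.1.3 of [1] |
| `dro.data.dataloader_regression` | `regression_DN20_3` | Following Section 3.3 of [1] |
| `dro.data.dataloader_regression` | `regression_LWLC` | Following Section 4.1 (Regression) of [3] |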

### (2) Linear DRO models
The models listed below are solved by exact solvers from ``cvxpy``.


| Python Module | Class Name | Description |
| --- | --- | --- |
| `dro.linear_dro.base` | `BaseLinearDRO` | Base class for linear DRO methods |
| `dro.linear_dro.chi2_dro` | `Chi2DRO` | Linear chi-square divergence-based DRO |
| `dro.linear_dro.kl_dro` | `KLDRO` | Kullback-Leibler divergence-based DRO |
| `dro.linear_dro.cvar_dro` | `CVaRDRO` | CVaR DRO |
| `dro.linear_dro.tv_dro` | `TVDRO` | Total Variation DRO |
| `dro.linear_dro.marginal_dro` | `MarginalCVaRDRO` | Marginal-X CVaR DRO |
| `dro.linear_dro.mmd_dro` | `MMD_DRO` | Maximum Mean Discrepancy DRO |
| `dro.linear_dro.conditional_dro` | `ConditionalCVaRDRO` | Y\|X (ConditionalShiftBased) CVaR DRO |
| `dro.linear_dro.hr_dro` | `HR_DRO_LR` | Holistic Robust DRO on linear models |
| `dro.linear_dro.wasserstein_dro` | `WassersteinDRO` | Wasserstein DRO |
| `dro.linear_dro.wasserstein_dro` | `WassersteinDROsatisficing` | Robust satisficing version of Wasserstein DRO |
| `dro.linear_dro.sinkhorn_dro` | `SinkhornLinearDRO` | Sinkhorn DRO on linear models |
| `dro.linear_dro.mot_dro` | `MOTDRO` | Optimal Transport DRO with Conditional Moment Constraints |
| `dro.linear_dro.or_wasserstein_dro` | `ORWDRO` | Outlier-Robust Wasserstein DRO |

### (3) NN DRO models
The models listed below are solved by gradient descent (``PyTorch``).


| Python Module | Class Name | Description |
| --- | --- | --- |
| `dro.neural_model.base_nn` | `BaseNNDRO` | Base model for neural-network-based DRO |
| `dro.neural_model.fdro_nn` | `Chi2NNDRO` | Chi-square divergence-based neural DRO model |
| `dro.neural_model.wdro_nn` | `WNNDRO` | Wasserstein neural DRO with adversarial robustness |
| `dro.neural_model.hrdro_nn` | `HRNNDRO` | Holistic Robust NN DRO |

### (4) Tree-based Ensemble DRO models
The models listed below are solved by function approximation (``xgboost``, ``lightgbm``).


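| Python Module | Class Name | Description |
| --- | --- | --- |
| `dro.tree_model.lgbm` | `KLDRO_LGBM` | KL divergence-based robust LightGBM |
| `dro.tree_model.lgbm` | `CVaRDRO_LGBM` | CVaR robust LightGBM |
| `dro.tree_model.lgbm` | `Chi2DRO_LGBM` | Chi2 divergence-based robust LightGBM |
| `dro.tree_model.xgb` | `KLDRO_XGB` | KL divergence-based robust XGBoost |
| `dro.tree_model.xgb` | `Chi2DRO_XGB` | Chi2 divergence-based robust XGBoost |
| `dro.tree_model.xgb` | `CVaRDRO_XGB` | CVaR robust XGBoost |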

### (5) Model-based Diagnostics
In linear DRO models, we provide additional interfaces for understanding the worst-case model performance and evaluating the true model performance.


| Function Name | Description |
| --- | --- |
| `.worst_distribution` | The worst-case distribution of the DRO model |
| `.evaluate` | True out-of-sample model performance of the DRO model |
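As a rough illustration of what a worst-case distribution looks like, the sketch below computes one by hand for a CVaR-type ambiguity set, $\{P: dP/d\hat P \leq 1/\alpha\}$, given fixed per-sample losses. The adversary places the maximum allowed mass on the largest losses. This is a standalone example under these assumptions, not the package's `.worst_distribution` implementation; the function name `cvar_worst_case_weights` and the sample losses are hypothetical.

```python
def cvar_worst_case_weights(losses, alpha):
    """Weights q maximizing sum(q_i * loss_i) subject to
    0 <= q_i <= 1 / (alpha * n) and sum(q) == 1."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must lie in (0, 1]")
    n = len(losses)
    cap = 1.0 / (alpha * n)  # largest mass allowed on a single sample
    weights = [0.0] * n
    remaining = 1.0
    # Fill the samples with the largest losses first.
    for i in sorted(range(n), key=lambda i: losses[i], reverse=True):
        weights[i] = min(cap, remaining)
        remaining -= weights[i]
        if remaining <= 0:
            break
    return weights


# Hypothetical per-sample losses of a fitted model.
losses = [0.1, 0.5, 2.0, 0.3]
q = cvar_worst_case_weights(losses, alpha=0.5)
print(q)                                         # [0.0, 0.5, 0.5, 0.0]
print(sum(w * l for w, l in zip(q, losses)))     # 1.25
```

With `alpha = 1` the cap equals `1 / n`, so the weights reduce to the empirical distribution and the worst-case loss equals the empirical mean.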

For more details, please refer to https://python-dro.org.

P.S. Our logo was generated via GPT :)

### Other References
[1] Learning Models with Uniform Performance via Distributionally Robust Optimization. Annals of Statistics. 2021.

[2] Certifying Some Distributional Robustness with Principled Adversarial Training. ICLR 2018.

[3] Distributionally Robust Optimization with Data Geometry. NeurIPS 2022.