https://github.com/namkoong-lab/dro
A package of distributionally robust optimization (DRO) methods. Implemented via cvxpy and PyTorch
distributionally-robust-optimization machine-learning operations-research trustworthy-ai
- Host: GitHub
- URL: https://github.com/namkoong-lab/dro
- Owner: namkoong-lab
- License: mit
- Created: 2023-12-07T06:44:19.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2025-05-31T07:25:22.000Z (about 1 year ago)
- Last Synced: 2025-09-19T15:22:06.638Z (9 months ago)
- Topics: distributionally-robust-optimization, machine-learning, operations-research, trustworthy-ai
- Language: Jupyter Notebook
- Homepage: https://python-dro.org/
- Size: 23.1 MB
- Stars: 136
- Watchers: 4
- Forks: 7
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
# `dro`: A Python Package for Distributionally Robust Optimization in Machine Learning
> Jiashuo Liu†, Tianyu Wang†, Henry Lam, Hongseok Namkoong, Jose Blanchet
> † equal contributions (α-β order)

`dro` is a Python package that implements typical DRO methods for supervised learning tasks with linear losses (SVM, logistic regression, and linear regression). It is built on the convex optimization modeling library `cvxpy`. The package supports different distance metrics $d(\cdot,\cdot)$ and different base models (e.g., linear regression, logistic regression, SVM, neural networks). It also integrates synthetic data generating mechanisms from recent research papers.
Unless otherwise specified, our DRO model solves the following optimization problem:
$$\min_{\theta} \max_{P: P \in U} E_{(X,Y) \sim P}[\ell(\theta;(X, Y))],$$
where $U$ is the so-called ambiguity set, typically of the form $U = \\{P: d(P, \hat P) \leq \epsilon\\}$, and $\hat P := \frac{1}{n}\sum_{i = 1}^n \delta_{(X_i, Y_i)}$ is the empirical distribution of the training samples $\{(X_i, Y_i)\}_{i = 1}^n$. Here $\epsilon$ is a hyperparameter that sets the size of the ambiguity set.
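
To make the inner maximization concrete, the sketch below evaluates the worst-case expected loss for one particular choice of $d$, the KL divergence, on a fixed vector of per-sample losses. It relies on the standard dual form $\max_{P: KL(P \| \hat P) \leq \epsilon} E_P[\ell] = \min_{\lambda > 0} \lambda\epsilon + \lambda \log E_{\hat P}[e^{\ell/\lambda}]$ and a simple one-dimensional search. The function name `kl_worst_case_loss` and its search bounds are illustrative assumptions and are not part of the `dro` API.

```python
import math

def kl_worst_case_loss(losses, eps, lam_min=1e-3, lam_max=1e3, iters=200):
    """Approximate max_{P: KL(P || P_hat) <= eps} E_P[loss] via its dual.

    Illustrative helper (not part of `dro`): minimizes
    lam * eps + lam * log(mean(exp(loss / lam))) over lam in [lam_min, lam_max].
    """
    n = len(losses)
    top = max(losses)

    def dual(lam):
        # Shift by the largest loss so the exponentials cannot overflow.
        mean_exp = sum(math.exp((l - top) / lam) for l in losses) / n
        return lam * eps + top + lam * math.log(mean_exp)

    # The dual is convex in lam, hence unimodal in log(lam),
    # so a ternary search over log(lam) is enough.
    lo, hi = math.log(lam_min), math.log(lam_max)
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if dual(math.exp(m1)) < dual(math.exp(m2)):
            hi = m2
        else:
            lo = m1
    return dual(math.exp((lo + hi) / 2))

losses = [0.1, 0.5, 2.0]
print(sum(losses) / len(losses))        # empirical (non-robust) mean loss
print(kl_worst_case_loss(losses, 0.1))  # robust value, between the mean and the max
```

In a full DRO model the losses depend on $\theta$, and the outer minimization over $\theta$ is carried out jointly with the dual variable.
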
Please refer to our paper for more details.
```bibtex
@misc{liu2025dropythonlibrary,
title={DRO: A Python Library for Distributionally Robust Optimization in Machine Learning},
author={Jiashuo Liu and Tianyu Wang and Henry Lam and Hongseok Namkoong and Jose Blanchet},
year={2025},
eprint={2505.23565},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/2505.23565},
}
```
## Installation
### (1) Install `dro` package
To install the `dro` package, run:
```
pip install dro
```
This also installs all required dependencies.
### (2) Optional: Prepare `MOSEK` license
For models that are solved exactly, the package builds on `cvxpy` and calls `MOSEK` by default, which is efficient for all optimization problems that arise in the reformulations. `MOSEK` requires a license file. The steps are as follows:
* Request a license at the official MOSEK website; the license file `mosek.lic` will then be emailed to you.
* Put the license in a `mosek` folder in your home directory as follows:
```
cd
mkdir mosek
mv /path_to_license/mosek.lic mosek/
```
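
A quick way to confirm that the license file ended up in the expected place is sketched below. It only checks the default location `~/mosek/mosek.lic` used in the steps above; the helper name `find_mosek_license` is hypothetical and is not part of `dro` or `MOSEK`.

```python
from pathlib import Path

def find_mosek_license(home=None):
    """Return the path to mosek.lic under <home>/mosek, or None if it is missing."""
    home = Path(home) if home is not None else Path.home()
    candidate = home / "mosek" / "mosek.lic"
    return candidate if candidate.is_file() else None

path = find_mosek_license()
if path:
    print(f"MOSEK license found at {path}")
else:
    print("No MOSEK license found in the default location.")
```
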
Alternatively, you can choose an open-source solver supported by `cvxpy`, such as `ECOS` or `SCS` (see the `cvxpy` documentation for more details). In any DRO model, this can be done during initialization:
```
model = XXDRO(..., solver = 'ECOS')
```
or by updating the attribute after initialization:
```
model.solver = 'ECOS'
```
These solvers can solve all the optimization problems implemented in the package as well.
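
If the same script has to run on machines with and without `MOSEK`, the solver name can be picked at runtime. The sketch below uses `cvxpy.installed_solvers()`, which lists the solvers `cvxpy` can currently call; the helper `pick_solver` and its preference order are assumptions for illustration and are not part of `dro`.

```python
def pick_solver(installed, preferred=("MOSEK", "ECOS", "SCS")):
    """Return the first preferred solver name that appears in `installed`."""
    for name in preferred:
        if name in installed:
            return name
    raise RuntimeError(f"None of {preferred} is installed")

try:
    import cvxpy as cp
    installed = cp.installed_solvers()  # list of solver names cvxpy can call
except ImportError:
    installed = []  # cvxpy is missing, so there is nothing to choose from

if installed:
    print("cvxpy can call:", installed)

# With a DRO model object, the choice would be applied as described above:
# model.solver = pick_solver(installed)
```

An installed `MOSEK` may still be listed when no valid license is present, so this check complements the license setup and does not replace it.
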
## Quick Start
A simple usage example is as follows:
```python
from dro.data.dataloader_classification import classification_basic
from dro.data.draw_utils import draw_classification
from dro.linear_model.chi2_dro import Chi2DRO
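# Generate a 2-D synthetic classification dataset
X, y = classification_basic(d = 2, num_samples = 100, radius = 2, visualize = True)
# Chi-square DRO with a logistic regression base model
clf_model = Chi2DRO(input_dim=2, model_type = 'logistic')
# Set the size of the ambiguity set
clf_model.update({'eps': 0.1})
# Fit the model and print the result returned by `fit`
print(clf_model.fit(X, y))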
```
For more examples, please refer to our examples.
## Documentation & APIs
As of the latest version, `v0.3.1`, `dro` supports:
### (1) Synthetic data generation
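
| Python Module | Function Name | Description |
| --- | --- | --- |
| `dro.data.dataloader_classification` | `classification_basic` | Basic classification task |
| `dro.data.dataloader_classification` | `classification_DN21` | Following Section 3.1.1 of [1] |
| `dro.data.dataloader_classification` | `classification_SNVD20` | Following Section 5.1 of [2] |
| `dro.data.dataloader_classification` | `classification_LWLC` | Following Section 4.1 (Classification) of [3] |
| `dro.data.dataloader_regression` | `regression_basic` | Basic regression task |
| `dro.data.dataloader_regression` | `regression_DN20_1` | Following Section 3.1.2 of [1] |
| `dro.data.dataloader_regression` | `regression_DN20_2` | Following Section 3.1.3 of [1] |
| `dro.data.dataloader_regression` | `regression_DN20_3` | Following Section 3.3 of [1] |
| `dro.data.dataloader_regression` | `regression_LWLC` | Following Section 4.1 (Regression) of [3] |
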
### (2) Linear DRO models
The models listed below are solved by exact solvers from ``cvxpy``.

| Python Module | Class Name | Description |
| --- | --- | --- |
| `dro.linear_dro.base` | `BaseLinearDRO` | Base class for linear DRO methods |
| `dro.linear_dro.chi2_dro` | `Chi2DRO` | Linear chi-square divergence-based DRO |
| `dro.linear_dro.kl_dro` | `KLDRO` | Kullback-Leibler divergence-based DRO |
| `dro.linear_dro.cvar_dro` | `CVaRDRO` | CVaR DRO |
| `dro.linear_dro.tv_dro` | `TVDRO` | Total Variation DRO |
| `dro.linear_dro.marginal_dro` | `MarginalCVaRDRO` | Marginal-X CVaR DRO |
| `dro.linear_dro.mmd_dro` | `MMD_DRO` | Maximum Mean Discrepancy DRO |
| `dro.linear_dro.conditional_dro` | `ConditionalCVaRDRO` | Y\|X (ConditionalShiftBased) CVaR DRO |
| `dro.linear_dro.hr_dro` | `HR_DRO_LR` | Holistic Robust DRO on linear models |
| `dro.linear_dro.wasserstein_dro` | `WassersteinDRO` | Wasserstein DRO |
| `dro.linear_dro.wasserstein_dro` | `WassersteinDROsatisficing` | Robust satisficing version of Wasserstein DRO |
| `dro.linear_dro.sinkhorn_dro` | `SinkhornLinearDRO` | Sinkhorn DRO on linear models |
| `dro.linear_dro.mot_dro` | `MOTDRO` | Optimal Transport DRO with Conditional Moment Constraints |
| `dro.linear_dro.or_wasserstein_dro` | `ORWDRO` | Outlier-Robust Wasserstein DRO |

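
As one illustration of the kind of objective these classes optimize, the CVaR risk of a loss vector can be computed directly from the Rockafellar-Uryasev formula $\mathrm{CVaR}_\alpha(\ell) = \min_\eta \\{\eta + \frac{1}{\alpha n}\sum_{i = 1}^n (\ell_i - \eta)_+\\}$. The sketch below is plain Python on fixed losses; the function name `cvar` and the parameter name `alpha` are assumptions and do not mirror the `CVaRDRO` interface.

```python
def cvar(losses, alpha):
    """CVaR at level alpha: the mean of the worst alpha-fraction of the losses.

    Uses min over eta of eta + mean(max(loss - eta, 0)) / alpha. The objective is
    piecewise linear and convex in eta, so a minimizer lies at one of the losses.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    n = len(losses)

    def objective(eta):
        return eta + sum(max(l - eta, 0.0) for l in losses) / (alpha * n)

    return min(objective(eta) for eta in losses)

losses = [1.0, 2.0, 3.0, 4.0]
print(cvar(losses, 1.0))  # alpha = 1 recovers the plain average, 2.5
print(cvar(losses, 0.5))  # average of the two largest losses, 3.5
```

In a DRO model the losses depend on $\theta$, and the convex solver minimizes this quantity jointly over $\theta$ and $\eta$.
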
### (3) NN DRO models
The models listed below are solved by gradient descent (``PyTorch``).

| Python Module | Class Name | Description |
| --- | --- | --- |
| `dro.neural_model.base_nn` | `BaseNNDRO` | Base model for neural-network-based DRO |
| `dro.neural_model.fdro_nn` | `Chi2NNDRO` | Chi-square divergence-based neural DRO model |
| `dro.neural_model.wdro_nn` | `WNNDRO` | Wasserstein neural DRO with adversarial robustness |
| `dro.neural_model.hrdro_nn` | `HRNNDRO` | Holistic Robust NN DRO |

### (4) Tree-based Ensemble DRO models
The models listed below are solved by function approximation (``xgboost``, ``lightgbm``).
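
| Python Module | Class Name | Description |
| --- | --- | --- |
| `dro.tree_model.lgbm` | `KLDRO_LGBM` | KL divergence-based robust LightGBM |
| `dro.tree_model.lgbm` | `CVaRDRO_LGBM` | CVaR robust LightGBM |
| `dro.tree_model.lgbm` | `Chi2DRO_LGBM` | Chi2 divergence-based robust LightGBM |
| `dro.tree_model.xgb` | `KLDRO_XGB` | KL divergence-based robust XGBoost |
| `dro.tree_model.xgb` | `Chi2DRO_XGB` | Chi2 divergence-based robust XGBoost |
| `dro.tree_model.xgb` | `CVaRDRO_XGB` | CVaR robust XGBoost |
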
### (5) Model-based Diagnostics
In linear DRO models, we provide additional interfaces for understanding the worst-case model performance and evaluating the true model performance.

| Function Name | Description |
| --- | --- |
| `.worst_distribution` | The worst-case distribution of the DRO model |
| `.evaluate` | True out-of-sample performance of the DRO model |

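
To give a sense of what a worst-case distribution looks like, the sketch below builds one by hand for the KL case, where the worst-case weights on the training samples are an exponential tilting of the empirical distribution, $w_i \propto e^{\ell_i/\lambda}$. The function names and the temperature `lam` are illustrative assumptions; this does not reproduce the interface or output format of `.worst_distribution`.

```python
import math

def kl_tilted_weights(losses, lam):
    """Weights w_i proportional to exp(loss_i / lam), normalized to sum to 1."""
    top = max(losses)
    raw = [math.exp((l - top) / lam) for l in losses]  # shift for numerical stability
    total = sum(raw)
    return [r / total for r in raw]

def kl_from_uniform(weights):
    """KL divergence of `weights` from the uniform (empirical) distribution."""
    n = len(weights)
    return sum(w * math.log(w * n) for w in weights if w > 0)

losses = [0.1, 0.5, 2.0]
weights = kl_tilted_weights(losses, lam=1.0)
print(weights)                   # the largest loss receives the largest weight
print(kl_from_uniform(weights))  # how far the tilted weights are from uniform
```

A smaller `lam` concentrates more weight on the hardest samples and moves the weights further from the empirical distribution.
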
For more details, please refer to https://python-dro.org!
P.S. Our logo was generated via GPT :)
### Other References
[1] Learning Models with Uniform Performance via Distributionally Robust Optimization. Annals of Statistics. 2021.
[2] Certifying Some Distributional Robustness with Principled Adversarial Training. ICLR 2018.
[3] Distributionally Robust Optimization with Data Geometry. NeurIPS 2022.