https://github.com/namkoong-lab/dro

A package of distributionally robust optimization (DRO) methods. Implemented via cvxpy and PyTorch

Host: GitHub
URL: https://github.com/namkoong-lab/dro
Owner: namkoong-lab
License: mit
Created: 2023-12-07T06:44:19.000Z (over 2 years ago)
Default Branch: main
Last Pushed: 2025-05-31T07:25:22.000Z (about 1 year ago)
Last Synced: 2025-09-19T15:22:06.638Z (10 months ago)
Topics: distributionally-robust-optimization, machine-learning, operations-research, trustworthy-ai
Language: Jupyter Notebook
Homepage: https://python-dro.org/
Size: 23.1 MB
Stars: 136
Watchers: 4
Forks: 7
Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE


[![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg?color=g&style=plastic)](https://opensource.org/licenses/MIT)

[![Downloads](https://static.pepy.tech/personalized-badge/dro?period=total&units=abbreviation&left_color=grey&right_color=blue&left_text=Downloads)](https://pepy.tech/project/dro)

[![pypi: v](https://img.shields.io/pypi/v/dro.svg)](https://pypi.python.org/pypi/dro/)

[![codecov](https://codecov.io/github/namkoong-lab/dro/graph/badge.svg?token=QFYLOVYZXF)](https://codecov.io/github/namkoong-lab/dro)

# `dro`: A Python Package for Distributionally Robust Optimization in Machine Learning

> Jiashuo Liu†, Tianyu Wang†, Henry Lam, Hongseok Namkoong, Jose Blanchet
>
> † Equal contributions (α-β order)



`dro` is a Python package that implements typical DRO methods for supervised learning tasks. Its exact solvers for linear models (SVM, logistic regression, and linear regression) are built on the convex optimization library `cvxpy`. The package supports a variety of distance metrics $d(\cdot,\cdot)$ and base models (e.g., linear regression, logistic regression, SVM, and neural networks). It also integrates synthetic data-generating mechanisms from recent research papers.

Unless otherwise specified, each DRO model solves the following optimization problem:

$$\min_{\theta} \max_{P: P \in U} E_{(X,Y) \sim P}[\ell(\theta;(X, Y))],$$

where $U$ is the ambiguity set, typically of the form $U = \{P: d(P, \hat P) \leq \epsilon\}$, $\hat P := \frac{1}{n}\sum_{i = 1}^n \delta_{(X_i, Y_i)}$ is the empirical distribution of the training samples $\{(X_i, Y_i)\}_{i = 1}^n$, and $\epsilon$ is a hyperparameter that sets the size of the ambiguity set.
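As a concrete illustration of the inner maximization, the minimal sketch below evaluates the worst-case expected loss for a fixed $\theta$ when $d$ is the KL divergence, using the standard dual form $\min_{\lambda > 0} \lambda\epsilon + \lambda \log E_{\hat P}[\exp(\ell/\lambda)]$. It does not use the `dro` API: the function name `kl_worst_case_loss`, the search bounds, and the loss values are assumptions made for this example only.

```python
import math

def kl_worst_case_loss(losses, eps, lam_lo=1e-3, lam_hi=1e3, iters=200):
    """Worst-case mean of `losses` over {P : KL(P || P_hat) <= eps}.

    Uses the dual  min_{lam > 0}  lam * eps + lam * log mean(exp(loss / lam)),
    minimized by ternary search over log(lam) (the dual is convex in lam).
    """
    n = len(losses)
    top = max(losses)

    def dual(lam):
        # Shift by the largest loss so every exponent is <= 0 (no overflow).
        mean_exp = sum(math.exp((l - top) / lam) for l in losses) / n
        return lam * eps + top + lam * math.log(mean_exp)

    lo, hi = math.log(lam_lo), math.log(lam_hi)
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if dual(math.exp(m1)) < dual(math.exp(m2)):
            hi = m2
        else:
            lo = m1
    return dual(math.exp((lo + hi) / 2))

# Hypothetical per-sample losses of a fixed model on four training points.
losses = [0.1, 0.2, 0.4, 2.0]
empirical = sum(losses) / len(losses)
robust = kl_worst_case_loss(losses, eps=0.1)
print(f"empirical mean = {empirical:.3f}, KL worst case = {robust:.3f}")
```

The worst-case value always lies between the empirical mean and the largest single loss, and it grows with $\epsilon$. A DRO model then minimizes this quantity over $\theta$ rather than the empirical mean.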

Please refer to our paper for more details. 

```bibtex
@misc{liu2025dropythonlibrary,
      title={DRO: A Python Library for Distributionally Robust Optimization in Machine Learning},
      author={Jiashuo Liu and Tianyu Wang and Henry Lam and Hongseok Namkoong and Jose Blanchet},
      year={2025},
      eprint={2505.23565},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2505.23565},
}
```

## Installation

### (1) Install the `dro` package

To install the `dro` package, simply run:

```
pip install dro
```

This also installs all required packages.

### (2) Optional: Prepare `MOSEK` license

The exact solvers in our package are built on `cvxpy` and call `MOSEK` by default, which efficiently handles all the optimization problems that arise in the reformulations. `MOSEK` requires a license file. The steps are as follows:

* Request a license from the official MOSEK website; the license file `mosek.lic` will then be emailed to you.

* Put your license in your home directory as follows:

    ```
    cd
    mkdir mosek
    mv /path_to_license/mosek.lic  mosek/
    ```
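To confirm that the license ended up where it is expected, a small check like the one below can help. This is a sketch rather than part of `dro`: the helper name `find_mosek_license` is hypothetical, and it only looks at the `MOSEKLM_LICENSE_FILE` environment variable (which MOSEK also honors) and the `~/mosek/mosek.lic` location used in the steps above.

```python
import os
from pathlib import Path

def find_mosek_license(home=None):
    """Return the path to a MOSEK license file if one is found, else None."""
    # An explicit path in MOSEKLM_LICENSE_FILE takes priority.
    env_path = os.environ.get("MOSEKLM_LICENSE_FILE")
    if env_path and Path(env_path).is_file():
        return Path(env_path)
    # Otherwise fall back to <home>/mosek/mosek.lic.
    default = Path(home or Path.home()) / "mosek" / "mosek.lic"
    return default if default.is_file() else None

print(find_mosek_license() or "No MOSEK license file found")
```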

Alternatively, you can choose one of the open-source solvers supported by `cvxpy`, such as `ECOS` or `SCS` (see the `cvxpy` documentation for more details). In any given DRO model, this can be done during initialization:

```
model = XXDRO(..., solver = 'ECOS')
```

or by updating the attribute after initialization:

```
model.solver = 'ECOS'
```

These solvers can solve all the optimization problems implemented in the package as well.
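If you are unsure which solvers are available on a machine, one option is to pick the first installed solver from a preference list and pass it to the model. The sketch below is illustrative only: `pick_solver` and the example `installed` list are hypothetical, while `cvxpy.installed_solvers()` (shown in the comment) is the `cvxpy` call that reports the solvers it can find.

```python
def pick_solver(installed, preferred=("MOSEK", "ECOS", "SCS")):
    """Return the first solver in `preferred` that appears in `installed`."""
    for name in preferred:
        if name in installed:
            return name
    raise RuntimeError(f"none of {preferred} is installed")

# With cvxpy available, the installed list would come from:
#   import cvxpy as cp
#   installed = cp.installed_solvers()
installed = ["CLARABEL", "OSQP", "SCS"]  # hypothetical example list
solver = pick_solver(installed)
print(solver)  # SCS
```

The returned name can then be passed as the `solver` argument shown above. Note that `MOSEK` appearing in the installed list does not guarantee that a valid license is present.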

## Quick Start

A simple usage example:

```python
from dro.data.dataloader_classification import classification_basic
from dro.data.draw_utils import draw_classification
from dro.linear_model.chi2_dro import Chi2DRO

# Generate a synthetic 2-D classification dataset
X, y = classification_basic(d=2, num_samples=100, radius=2, visualize=True)

# Fit a chi-square DRO logistic regression model
clf_model = Chi2DRO(input_dim=2, model_type='logistic')
clf_model.update({'eps': 0.1})  # set the robustness hyperparameter eps
print(clf_model.fit(X, y))
```

For more examples, please refer to our examples.

## Documentation & APIs

As of the latest version, `v0.3.1`, `dro` supports:

### (1) Synthetic data generation

  

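| Python Module | Function Name | Description |
| --- | --- | --- |
| `dro.data.dataloader_classification` | `classification_basic` | Basic classification task |
| `dro.data.dataloader_classification` | `classification_DN21` | Following Section 3.1.1 of [1] |
| `dro.data.dataloader_classification` | `classification_SNVD20` | Following Section 5.1 of [2] |
| `dro.data.dataloader_classification` | `classification_LWLC` | Following Section 4.1 (Classification) of [3] |
| `dro.data.dataloader_regression` | `regression_basic` | Basic regression task |
| `dro.data.dataloader_regression` | `regression_DN20_1` | Following Section 3.1.2 of [1] |
| `dro.data.dataloader_regression` | `regression_DN20_2` | Following Section 3.1.3 of [1] |
| `dro.data.dataloader_regression` | `regression_DN20_3` | Following Section 3.3 of [1] |
| `dro.data.dataloader_regression` | `regression_LWLC` | Following Section 4.1 (Regression) of [3] |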

  

### (2) Linear DRO models

The models listed below are solved by exact solvers from `cvxpy`.

| Python Module | Class Name | Description |
| --- | --- | --- |
| `dro.linear_dro.base` | `BaseLinearDRO` | Base class for linear DRO methods |
| `dro.linear_dro.chi2_dro` | `Chi2DRO` | Linear chi-square divergence-based DRO |
| `dro.linear_dro.kl_dro` | `KLDRO` | Kullback-Leibler divergence-based DRO |
| `dro.linear_dro.cvar_dro` | `CVaRDRO` | CVaR DRO |
| `dro.linear_dro.tv_dro` | `TVDRO` | Total Variation DRO |
| `dro.linear_dro.marginal_dro` | `MarginalCVaRDRO` | Marginal-X CVaR DRO |
| `dro.linear_dro.mmd_dro` | `MMD_DRO` | Maximum Mean Discrepancy DRO |
| `dro.linear_dro.conditional_dro` | `ConditionalCVaRDRO` | Y\|X (ConditionalShiftBased) CVaR DRO |
| `dro.linear_dro.hr_dro` | `HR_DRO_LR` | Holistic Robust DRO on linear models |
| `dro.linear_dro.wasserstein_dro` | `WassersteinDRO` | Wasserstein DRO |
| `dro.linear_dro.wasserstein_dro` | `WassersteinDROsatisficing` | Robust satisficing version of Wasserstein DRO |
| `dro.linear_dro.sinkhorn_dro` | `SinkhornLinearDRO` | Sinkhorn DRO on linear models |
| `dro.linear_dro.mot_dro` | `MOTDRO` | Optimal Transport DRO with Conditional Moment Constraints |
| `dro.linear_dro.or_wasserstein_dro` | `ORWDRO` | Outlier-Robust Wasserstein DRO |

  

### (3) NN DRO models

The models listed below are solved by gradient descent (`PyTorch`).

| Python Module | Class Name | Description |
| --- | --- | --- |
| `dro.neural_model.base_nn` | `BaseNNDRO` | Base model for neural-network-based DRO |
| `dro.neural_model.fdro_nn` | `Chi2NNDRO` | Chi-square divergence-based neural DRO model |
| `dro.neural_model.wdro_nn` | `WNNDRO` | Wasserstein neural DRO with adversarial robustness |
| `dro.neural_model.hrdro_nn` | `HRNNDRO` | Holistic Robust NN DRO |

  

### (4) Tree-based Ensemble DRO models

The models listed below are solved by function approximation (`xgboost`, `lightgbm`).

| Python Module | Class Name | Description |
| --- | --- | --- |
| `dro.tree_model.lgbm` | `KLDRO_LGBM` | KL Divergence-based Robust LightGBM |
| `dro.tree_model.lgbm` | `CVaRDRO_LGBM` | CVaR Robust LightGBM |
| `dro.tree_model.lgbm` | `Chi2DRO_LGBM` | Chi2 Divergence-based Robust LightGBM |
| `dro.tree_model.xgb` | `KLDRO_XGB` | KL Divergence-based Robust XGBoost |
| `dro.tree_model.xgb` | `Chi2DRO_XGB` | Chi2 Divergence-based Robust XGBoost |
| `dro.tree_model.xgb` | `CVaRDRO_XGB` | CVaR Robust XGBoost |

  

### (5) Model-based Diagnostics

In linear DRO models, we provide additional interfaces for understanding the worst-case model performance and evaluating the true model performance. 

  

| Python Module | Function Name | Description |
| --- | --- | --- |
| | `.worst_distribution` | The worst-case distribution of the DRO model |
| | `.evaluate` | True out-of-sample model performance of the DRO model |

  

For more details, please refer to https://python-dro.org.
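To make the notion of a worst-case distribution concrete, the sketch below computes one by hand for a CVaR-type ambiguity set, where each sample's weight is capped at $1/(\alpha n)$. It is independent of the `dro` API and does not reproduce what `.worst_distribution` returns: the function name `cvar_worst_case_weights`, the convention that `alpha` is the tail fraction, and the loss values are all assumptions for illustration.

```python
def cvar_worst_case_weights(losses, alpha):
    """Worst-case sample weights over {Q : Q_i <= 1 / (alpha * n), sum Q_i = 1}."""
    n = len(losses)
    cap = 1.0 / (alpha * n)
    weights = [0.0] * n
    remaining = 1.0
    # Greedily place as much mass as allowed on the largest losses first.
    for i in sorted(range(n), key=lambda i: losses[i], reverse=True):
        weights[i] = min(cap, remaining)
        remaining -= weights[i]
        if remaining <= 0:
            break
    return weights

# Hypothetical per-sample losses of a fitted model.
losses = [0.1, 0.2, 0.4, 2.0]
w = cvar_worst_case_weights(losses, alpha=0.5)
worst_case = sum(wi * li for wi, li in zip(w, losses))
print(w, worst_case)
```

With `alpha=0.5` and four samples, all the mass moves onto the two largest losses, so the worst-case expected loss is the average of those two values rather than the average over all four.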

ps: our logo is generated via GPT:)

### References

[1] Learning Models with Uniform Performance via Distributionally Robust Optimization. Annals of Statistics 2021.

[2] Certifying Some Distributional Robustness with Principled Adversarial Training. ICLR 2018.

[3] Distributionally Robust Optimization with Data Geometry. NeurIPS 2022.
