Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/jayanthkoushik/torch-gel

Group elastic net implementation in PyTorch.
https://github.com/jayanthkoushik/torch-gel

elasticnet lasso python pytorch

Last synced: 9 days ago
JSON representation

Group elastic net implementation in PyTorch.

Awesome Lists containing this project

README

        

# torch-gel

This package provides PyTorch implementations to solve the group elastic net
problem. Let _Aj_ (_j = 1 … p_) be feature matrices of sizes _m ×
nj_ (_m_ is the number of samples, and _nj_ is the number
of features in the _j_th group), and let _y_ be an _m × 1_ vector of
the responses. Group elastic net finds coefficients _βj_, and a bias
0_ that solve the optimization problem

> min _β0, …, βp_
> _½ ║y - β0 - ∑ Aj βj2_
> + _m ∑ √nj1║βj║_
> _+ λ2║βj2)._

Here _λ1_ and _λ2_ are scalar coefficients that control
the amount of 2-norm and squared 2-norm regularization. This 2-norm
regularization encourages sparsity at the group level; entire _βj_
might become 0. The squared 2-norm regularization is in similar spirit to
elastic net, and addresses some of the issues of lasso. Note that group elastic
net includes as special cases group lasso (_λ2 = 0_), ridge
regression (_λ1 = 0_), elastic net (each _nj = 1_), and
lasso (each _nj = 1_ and _λ2 = 0_). The optimization
problem is convex, and can be solved efficiently. This package provides two
implementations; one based on proximal gradient descent, and one based on
coordinate descent.

## Installation
Install with `pip`

```bash
pip install torchgel
```

`tqdm` (for progress bars), and numpy are pulled in as dependencies. PyTorch
(`v1.0+`) is also needed, and needs to be installed manually. Refer to the
[PyTorch website]() for instructions.

## Usage
[`examples/main.ipynb`](examples/main.ipynb) is a Jupyter notebook that walks
through using the package for a typical use-case. A more formal description of
the functions follows; and for details about the algorithms, refer to the
docstrings of files in the `gel` directory.

### Solving Single Instances
The modules `gel.gelfista` and `gel.gelcd` provide implementations based on
proximal gradient descent and coordinate descent respectively. Both have similar
interfaces, and expose two main public functions: `make_A` and `gel_solve`. The
feature matrices should be stored in a list (say `As`) as PyTorch tensor
matrices, and the responses should be stored in a PyTorch vector (say `y`).
Additionally, the sizes of the groups (_nj_) should be stored in a
vector (say `ns`). First use the `make_A` function to convert the feature
matrices into a suitable format:

```python
A = make_A(As, ns)
```

Then pass `A`, `y` and other required arguments to `gel_solve`. The general
interface is::

```python
b_0, B = gel_solve(A, y, l_1, l_2, ns, **kwargs)
```

`l_1` and `l_2` are floats representing _λ1_ and _λ2_
respectively. The method returns a float `b_0` representing the bias and a
PyTorch matrix `B` holding the other coefficients. `B` has size _p ×_
max_j_ _nj_ with suitable zero padding. The following
sections cover additional details for the specific implementations.

#### Proximal Gradient Descent (FISTA)
The `gel.gelfista` module contains a proximal gradient descent implementation.
It's usage is just as described in the template above. Refer to the docstring
for `gel.gelfista.gel_solve` for details about the other arguments.

#### Coordinate Descent
The `gel.gelcd` module contains a coordinate descent implementation. Its usage
is a bit more involved than the FISTA implementation. Coordinate descent
iteratively solves single blocks (each corresponding to a single
j_). There are multiple solvers provided to solve the individual
blocks. These are the `gel.gelcd.block_solve_*` functions. Refer to their
docstrings for details about their arguments. `gel.gelcd.gel_solve` requires
passing a block solve function and its arguments (as a dictionary). Refer to
its docstring for further details.

### Solution Paths
`gel.gelpaths` provides a wrapper function `gel_paths` to solve the group
elastic net problem for multiple values of the regularization coefficients. It
implements a two-stage process. For a given _λ1_ and _λ2_,
first the group elastic net problem is solved and the feature blocks with
non-zero coefficients is extracted (the support). Then ridge regression models
are learned for each of several provided regularization values. The final model
is summarized using an arbitrary provided summary function, and the summary for
each combination of the regularization values is returned as a dictionary. The
docstring contains more details. `gel.ridgepaths` contains another useful function,
`ridge_paths` which can efficiently solve ridge regression for multiple
regularization values.

## Citation
If you find this code useful in your research, please cite

```
@misc{koushik2017torchgel,
author = {Koushik, Jayanth},
title = {torch-gel},
year = {2017},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/jayanthkoushik/torch-gel}},
}
```