
# SigmaRidgeRegression.jl

[![Build Status](https://github.com/nignatiadis/SigmaRidgeRegression.jl/workflows/CI/badge.svg)](https://github.com/nignatiadis/SigmaRidgeRegression.jl/actions)
[![Coverage](https://codecov.io/gh/nignatiadis/SigmaRidgeRegression.jl/branch/master/graph/badge.svg)](https://codecov.io/gh/nignatiadis/SigmaRidgeRegression.jl)

Automatically and optimally tuned ridge regression when the features may be partitioned into groups.

See the manuscript below for a theoretical description of the method.
> Ignatiadis, Nikolaos, and Panagiotis Lolas. "σ-Ridge: group regularized ridge regression via empirical Bayes noise level cross-validation." [arXiv:2010.15817](https://arxiv.org/abs/2010.15817) (2020+)

The folder `reproduction_code` in this repository contains code to reproduce the results of the paper.
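In brief, with features partitioned into $K$ groups, group-regularized ridge regression solves (a sketch, in one common normalization; see the manuscript for the exact conventions)

```math
\hat{\beta}(\lambda_1, \dots, \lambda_K) = \underset{\beta}{\operatorname{argmin}} \; \frac{1}{n} \lVert Y - X\beta \rVert_2^2 + \sum_{k=1}^{K} \lambda_k \lVert \beta_k \rVert_2^2,
```

where $\beta_k$ collects the coefficients of the $k$-th group. Rather than cross-validating over all $K$ penalties jointly, σ-Ridge reparametrizes the $\lambda_k$ through a single noise-level parameter and tunes that one parameter by leave-one-out cross-validation.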

## Installation
The package is available in the Julia General registry (for Julia version 1.5) and can be installed as follows:

```julia
using Pkg
Pkg.add("SigmaRidgeRegression")
```

## Example usage

SigmaRidgeRegression.jl can be used alongside the [MLJ](https://github.com/alan-turing-institute/MLJ.jl) framework for machine learning in Julia.

```julia
using MLJ
using SigmaRidgeRegression
using Statistics  # for `mean`
using Random

# Suppose we have three groups of features, each with n observations
# and 25, 50 and 100 features respectively
n = 400
Random.seed!(1)
p1 = 25 ; X1 = randn(n, p1)
p2 = 50 ; X2 = randn(n, p2)
p3 = 100; X3 = randn(n, p3)

# The signal strength varies across the groups: the coefficients of each group
# are drawn with expected squared norm equal to the corresponding α_sq
α1_sq = 4.0 ; βs1 = randn(p1) .* sqrt(α1_sq / p1)
α2_sq = 8.0 ; βs2 = randn(p2) .* sqrt(α2_sq / p2)
α3_sq = 12.0; βs3 = randn(p3) .* sqrt(α3_sq / p3)

# Let us concatenate the feature matrices and coefficients and generate a response
X = [X1 X2 X3]
βs = [βs1; βs2; βs3]
σ = 4.0
Y = X*βs .+ σ .* randn(n)

# Let us make a `GroupedFeatures` object that describes the feature grouping
# !!NOTE!! Right now the features are expected to be ordered consecutively in groups
# i.e., the first p1 features belong to group 1 etc.
groups = GroupedFeatures([p1;p2;p3])

# Create MLJ machine and fit SigmaRidgeRegression:
sigma_model = LooSigmaRidgeRegressor(;groups=groups)
mach_sigma_model = machine(sigma_model, MLJ.table(X), Y)
fit!(mach_sigma_model)

# How well are we estimating the true X*βs in mean squared error?
mean(abs2, X*βs .- predict(mach_sigma_model)) # 4.612726430034071

# In this simulated example we may also compare to the Bayes risk
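# (Under the Gaussian simulation above, the Bayes-optimal penalty for group k
# is λ_k = σ^2 * p_k / (n * α_k^2), by the usual ridge/Bayes correspondence.)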
λs_opt = σ^2 ./ [α1_sq; α2_sq; α3_sq] .* groups.ps ./n
bayes = MultiGroupRidgeRegressor(;groups=groups, λs=λs_opt, center=false, scale=false)

mach_bayes = machine(bayes, MLJ.table(X), Y)
fit!(mach_bayes)
mean(abs2, X*βs .- predict(mach_bayes)) # 4.356913540118585
```
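
Predictions on new data follow the standard MLJ pattern, `predict(mach, Xnew)`. A minimal sketch, continuing the simulation above with a freshly drawn test set (the names `n_test` and `X_test` are illustrative):

```julia
# Draw a fresh test set from the same design as the training data
n_test = 200
X_test = randn(n_test, p1 + p2 + p3)

# Out-of-sample predictions via the generic MLJ interface
Ŷ_sigma = predict(mach_sigma_model, MLJ.table(X_test))
Ŷ_bayes = predict(mach_bayes, MLJ.table(X_test))

# Compare against the noiseless signal X_test * βs
mean(abs2, X_test * βs .- Ŷ_sigma)
mean(abs2, X_test * βs .- Ŷ_bayes)
```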

### TODOs

* Fully implement the MLJ interface.
* Wait for the following MLJBase issue to be fixed: https://github.com/alan-turing-institute/MLJBase.jl/issues/428#issuecomment-708141459. In the meantime, this package uses type piracy (as in the linked comment) to accommodate a large number of features.