# {gslnls}: GSL Multi-Start Nonlinear Least-Squares Fitting in R
[![CRAN
version](http://www.r-pkg.org/badges/version/gslnls)](https://cran.r-project.org/package=gslnls)
[![R-CMD-check](https://github.com/JorisChau/gslnls/workflows/R-CMD-check/badge.svg)](https://github.com/JorisChau/gslnls/actions)
[![codecov](https://codecov.io/gh/JorisChau/gslnls/branch/master/graph/badge.svg)](https://app.codecov.io/gh/JorisChau/gslnls)
[![Total
Downloads](https://cranlogs.r-pkg.org/badges/grand-total/gslnls)](https://cran.r-project.org/package=gslnls)

The {gslnls}-package provides R bindings to nonlinear least-squares
optimization with the [GNU Scientific Library
(GSL)](https://www.gnu.org/software/gsl/). The function `gsl_nls()`
solves small to moderate sized nonlinear least-squares problems with the
`gsl_multifit_nlinear` interface with built-in support for multi-start
optimization. For large problems, where factoring the full Jacobian
matrix becomes prohibitively expensive, the `gsl_nls_large()` function
can be used to solve the system with the `gsl_multilarge_nlinear`
interface. The `gsl_nls_large()` function is also appropriate for
systems with sparse structure in the Jacobian matrix.

The following trust region methods to solve nonlinear least-squares
problems are available in `gsl_nls()` (and `gsl_nls_large()`) via the
`algorithm` argument (see the sketch after this list):

- [Levenberg-Marquardt](https://www.gnu.org/software/gsl/doc/html/nls.html#levenberg-marquardt)
- [Levenberg-Marquardt with geodesic
acceleration](https://www.gnu.org/software/gsl/doc/html/nls.html#levenberg-marquardt-with-geodesic-acceleration)
- [Dogleg](https://www.gnu.org/software/gsl/doc/html/nls.html#dogleg)
- [Double
dogleg](https://www.gnu.org/software/gsl/doc/html/nls.html#double-dogleg)
- [Two Dimensional
Subspace](https://www.gnu.org/software/gsl/doc/html/nls.html#two-dimensional-subspace)
- [Steihaug-Toint Conjugate
Gradient](https://www.gnu.org/software/gsl/doc/html/nls.html#steihaug-toint-conjugate-gradient)
  (only available in `gsl_nls_large()`)
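
As a quick, self-contained illustration of switching methods, the sketch
below fits a simple exponential decay model with the double dogleg
method instead of the default Levenberg-Marquardt; the simulated data
are chosen purely for illustration:

``` r
library(gslnls)

## illustrative data from an exponential decay model
x <- seq(0, 3, length.out = 25)
y <- 5 * exp(-1.5 * x) + 1 + rnorm(25, sd = 0.25)

## select the trust region method via the `algorithm` argument
gsl_nls(
  fn = y ~ A * exp(-lam * x) + b,
  data = data.frame(x = x, y = y),
  start = c(A = 4, lam = 1, b = 1),
  algorithm = "ddogleg" ## or "lm", "lmaccel", "dogleg", "subspace2D"
)
```
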
The [Tunable
parameters](https://www.gnu.org/software/gsl/doc/html/nls.html#tunable-parameters)
available for the trust region algorithms can be modified from R in
order to help accelerate convergence for a specific problem at hand.

See the [Nonlinear Least-Squares
Fitting](https://www.gnu.org/software/gsl/doc/html/nls.html#nonlinear-least-squares-fitting)
chapter in the GSL reference manual for a comprehensive overview of the
`gsl_multifit_nlinear` and `gsl_multilarge_nlinear` interfaces and the
relevant mathematical background.

## Installation from source
### System requirements
When installing the R-package from source, verify that GSL (\>= 2.2) is
installed on the system, e.g. on Ubuntu/Debian Linux:

``` sh
gsl-config --version
```
If GSL (\>= 2.2) is not available on the system, install GSL from a
pre-compiled binary package (see the examples below) or install GSL from
source by downloading the latest stable release
and following the installation
instructions in the included README and INSTALL files.

#### GSL installation examples
##### Ubuntu, Debian
``` sh
sudo apt-get install libgsl-dev
```
##### macOS
``` sh
brew install gsl
```
##### Fedora, RedHat, CentOS
``` sh
yum install gsl-devel
```
##### Windows
A binary version of GSL (2.7) can be installed using the Rtools package
manager:

``` sh
pacman -S mingw-w64-{i686,x86_64}-gsl
```
On Windows, the environment variable `LIB_GSL` must be set to the parent
of the directory containing `libgsl.a`. Note that forward instead of
backward slashes should be used in the directory path
(e.g. `C:/rtools43/x86_64-w64-mingw32.static.posix`).
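
For instance, the variable can be set for the current R session before
installing the package; the path below is the example path from above
and should be adjusted to the local Rtools installation:

``` r
## set LIB_GSL for the current session (adjust to the local Rtools path)
Sys.setenv(LIB_GSL = "C:/rtools43/x86_64-w64-mingw32.static.posix")
```
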
### R-package installation

With GSL available, install the R-package from source with:
``` r
## Install latest CRAN release:
install.packages("gslnls", type = "source")
```

or install the latest development version from GitHub with:
``` r
## Install latest GitHub development version:
# install.packages("devtools")
devtools::install_github("JorisChau/gslnls")
```

## Installation from binary
On Windows and some macOS builds, the R-package can be installed from
CRAN as a binary package. In this case, GSL does not need to be
available on the system.

``` r
## Install latest CRAN release:
install.packages("gslnls")
```

## Example usage
### Example 1: Exponential model
#### Data
The code below simulates $n = 25$ noisy observations
$y_1,\ldots,y_n$ from an exponential model with additive (i.i.d.)
Gaussian noise according to:

$$
\left\{
\begin{aligned}
f_i & = A \cdot \exp(-\lambda \cdot x_i) + b, & i = 1,\ldots, n \\
y_i & = f_i + \epsilon_i, & \epsilon_i \overset{\text{iid}}{\sim} N(0,\sigma^2)
\end{aligned}
\right.
$$

The exponential model parameters are set to $A = 5$, $\lambda = 1.5$,
$b = 1$, with a noise standard deviation of $\sigma = 0.25$.

``` r
set.seed(1)
n <- 25
x <- (seq_len(n) - 1) * 3 / (n - 1)
f <- function(A, lam, b, x) A * exp(-lam * x) + b
y <- f(A = 5, lam = 1.5, b = 1, x) + rnorm(n, sd = 0.25)
```

#### Model fit
The exponential model is fitted to the data using the `gsl_nls()`
function by passing the nonlinear model as a two-sided `formula` and
providing starting values for the model parameters $A, \lambda, b$,
analogous to an `nls()` function call.

``` r
library(gslnls)

ex1_fit <- gsl_nls(
fn = y ~ A * exp(-lam * x) + b, ## model formula
data = data.frame(x = x, y = y), ## model fit data
start = c(A = 0, lam = 0, b = 0) ## starting values
)

ex1_fit
#> Nonlinear regression model
#> model: y ~ A * exp(-lam * x) + b
#> data: data.frame(x = x, y = y)
#> A lam b
#> 4.893 1.417 1.010
#> residual sum-of-squares: 1.316
#>
#> Algorithm: multifit/levenberg-marquardt, (scaling: more, solver: qr)
#>
#> Number of iterations to convergence: 9
#> Achieved convergence tolerance: 0
```

Here, the nonlinear least-squares problem has been solved with the
Levenberg-Marquardt algorithm (default) in the `gsl_multifit_nlinear`
interface using the following `control` parameters:

``` r
## default control parameters
gsl_nls_control() |> str()
#> List of 21
#> $ maxiter : int 100
#> $ scale : chr "more"
#> $ solver : chr "qr"
#> $ fdtype : chr "forward"
#> $ factor_up : num 2
#> $ factor_down : num 3
#> $ avmax : num 0.75
#> $ h_df : num 1.49e-08
#> $ h_fvv : num 0.02
#> $ xtol : num 1.49e-08
#> $ ftol : num 1.49e-08
#> $ gtol : num 1.49e-08
#> $ mstart_n : int 30
#> $ mstart_p : int 5
#> $ mstart_q : int 3
#> $ mstart_r : num 4
#> $ mstart_s : int 2
#> $ mstart_tol : num 0.25
#> $ mstart_maxiter : int 10
#> $ mstart_maxstart: int 250
#> $ mstart_minsp : int 1
```

Run `?gsl_nls_control` or check the [GSL reference
manual](https://www.gnu.org/software/gsl/doc/html/nls.html#tunable-parameters)
for further details on the available tuning parameters to control the
trust region algorithms.
#### Object methods

The fitted model object returned by `gsl_nls()` is of class `"gsl_nls"`,
which inherits from class `"nls"`. For this reason, generic functions
such as `anova`, `coef`, `confint`, `deviance`, `df.residual`, `fitted`,
`formula`, `logLik`, `predict`, `print`, `profile`, `residuals`,
`summary`, `vcov` and `weights` are also applicable for models fitted
with `gsl_nls()`.

``` r
## model summary
summary(ex1_fit)
#>
#> Formula: y ~ A * exp(-lam * x) + b
#>
#> Parameters:
#> Estimate Std. Error t value Pr(>|t|)
#> A 4.8930 0.1811 27.014 < 2e-16 ***
#> lam 1.4169 0.1304 10.865 2.61e-10 ***
#> b 1.0097 0.1092 9.246 4.92e-09 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 0.2446 on 22 degrees of freedom
#>
#> Number of iterations to convergence: 9
#> Achieved convergence tolerance: 0

## asymptotic confidence intervals
confint(ex1_fit)
#> 2.5 % 97.5 %
#> A 4.5173851 5.268653
#> lam 1.1464128 1.687314
#> b 0.7832683 1.236216
```

The `predict` method extends the existing `predict.nls` method by
allowing for calculation of asymptotic confidence and prediction
(tolerance) intervals in addition to prediction of the expected
response:

``` r
## asymptotic prediction intervals
predict(ex1_fit, interval = "prediction", level = 0.95)
#> fit lwr upr
#> [1,] 5.902761 5.2670162 6.538506
#> [2,] 5.108572 4.5388041 5.678340
#> [3,] 4.443289 3.8987833 4.987794
#> [4,] 3.885988 3.3479065 4.424069
#> [5,] 3.419142 2.8812430 3.957042
....
```
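
Analogous to the prediction intervals above, asymptotic confidence
intervals for the expected response can be obtained by setting
`interval = "confidence"`:

``` r
## asymptotic confidence intervals for the expected response
predict(ex1_fit, interval = "confidence", level = 0.95)
```
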
The new `confintd` method can be used to evaluate asymptotic confidence
intervals of derived (or transformed) parameters based on the delta
method, i.e. a first-order (Taylor) approximation of the function of the
parameters:

``` r
## delta method confidence intervals
confintd(ex1_fit, expr = c("b", "A + b", "log(lam)"), level = 0.95)
#> fit lwr upr
#> b 1.0097419 0.7832683 1.2362155
#> A + b 5.9027612 5.5194278 6.2860945
#> log(lam) 0.3484454 0.1575657 0.5393251
```

#### Jacobian calculation

If the `jac` argument in `gsl_nls()` is undefined, the Jacobian matrix
used to solve the [trust region
subproblem](https://www.gnu.org/software/gsl/doc/html/nls.html#solving-the-trust-region-subproblem-trs)
is approximated by forward (or centered) finite differences. Instead, an
analytic Jacobian can be passed to `jac` by defining a function that
returns the $(n \times p)$-dimensional Jacobian matrix of the nonlinear
model `fn`, where the first argument must be the vector of parameters of
length $p$.

In the exponential model example, the Jacobian matrix is a
$(25 \times 3)$-dimensional matrix $[\boldsymbol{J}_{ij}]_{ij}$ with
rows:

$$
\boldsymbol{J}_i \ = \ \left[ \frac{\partial f_i}{\partial A}, \frac{\partial f_i}{\partial \lambda}, \frac{\partial f_i}{\partial b} \right] \ = \ \left[ \exp(-\lambda \cdot x_i), \ -A \cdot \exp(-\lambda \cdot x_i) \cdot x_i, \ 1 \right]
$$

which is encoded in the following call to `gsl_nls()`:
``` r
## analytic Jacobian (1)
gsl_nls(
fn = y ~ A * exp(-lam * x) + b, ## model formula
data = data.frame(x = x, y = y), ## model fit data
start = c(A = 0, lam = 0, b = 0), ## starting values
jac = function(par, x) with(as.list(par), cbind(A = exp(-lam * x), lam = -A * x * exp(-lam * x), b = 1)),
x = x ## argument passed to jac
)
#> Nonlinear regression model
#> model: y ~ A * exp(-lam * x) + b
#> data: data.frame(x = x, y = y)
#> A lam b
#> 4.893 1.417 1.010
#> residual sum-of-squares: 1.316
#>
#> Algorithm: multifit/levenberg-marquardt, (scaling: more, solver: qr)
#>
#> Number of iterations to convergence: 9
#> Achieved convergence tolerance: 6.661e-16
```

If the model formula `fn` can be derived with `stats::deriv()`, then the
analytic Jacobian in `jac` can be computed automatically using symbolic
differentiation and no manual calculations are necessary. To evaluate
`jac` by means of symbolic differentiation, set `jac = TRUE`:

``` r
## analytic Jacobian (2)
gsl_nls(
fn = y ~ A * exp(-lam * x) + b, ## model formula
data = data.frame(x = x, y = y), ## model fit data
start = c(A = 0, lam = 0, b = 0), ## starting values
jac = TRUE ## symbolic derivation
)
#> Nonlinear regression model
#> model: y ~ A * exp(-lam * x) + b
#> data: data.frame(x = x, y = y)
#> A lam b
#> 4.893 1.417 1.010
#> residual sum-of-squares: 1.316
#>
#> Algorithm: multifit/levenberg-marquardt, (scaling: more, solver: qr)
#>
#> Number of iterations to convergence: 9
#> Achieved convergence tolerance: 6.661e-16
```

Alternatively, a self-starting nonlinear model (see `?selfStart`) can be
passed to `gsl_nls()`. In this case, the Jacobian matrix is evaluated
from the `"gradient"` attribute of the self-starting model object:

``` r
## self-starting model
ss_fit <- gsl_nls(
fn = y ~ SSasymp(x, Asym, R0, lrc), ## model formula
data = data.frame(x = x, y = y) ## model fit data
)

ss_fit
#> Nonlinear regression model
#> model: y ~ SSasymp(x, Asym, R0, lrc)
#> data: data.frame(x = x, y = y)
#> Asym R0 lrc
#> 1.0097 5.9028 0.3484
#> residual sum-of-squares: 1.316
#>
#> Algorithm: multifit/levenberg-marquardt, (scaling: more, solver: qr)
#>
#> Number of iterations to convergence: 1
#> Achieved convergence tolerance: 1.068e-13
```

The self-starting model `SSasymp()` uses a different model
parameterization (`A = R0 - Asym`, `lam = exp(lrc)`, `b = Asym`), but
the fitted models are equivalent. Also, when using a *self-starting*
model, no starting values need to be provided.

**Remark**: confidence intervals for the coefficients in the original
model parameterization can be evaluated with the `confintd` method:

``` r
## delta method confidence intervals
confintd(ss_fit, expr = c("R0 - Asym", "exp(lrc)", "Asym"), level = 0.95)
#> fit lwr upr
#> R0 - Asym 4.893019 4.5173851 5.268653
#> exp(lrc) 1.416863 1.1464128 1.687314
#> Asym 1.009742 0.7832683 1.236216
```

#### Multi-start optimization

In addition to single-start NLS optimization, the `gsl_nls()` function
has built-in support for multi-start optimization. For multi-start
optimization, instead of a list or vector of fixed starting parameters,
pass a `list` or `matrix` of starting parameter ranges to the `start`
argument:

``` r
gsl_nls(
fn = y ~ A * exp(-lam * x) + b, ## model formula
data = data.frame(x = x, y = y), ## model fit data
start = list(A = c(-100, 100), lam = c(-5, 5), b = c(-10, 10)) ## multi-start
)
#> Nonlinear regression model
#> model: y ~ A * exp(-lam * x) + b
#> data: data.frame(x = x, y = y)
#> A lam b
#> 4.893 1.417 1.010
#> residual sum-of-squares: 1.316
#>
#> Algorithm: multifit/levenberg-marquardt, (scaling: more, solver: qr)
#>
#> Number of iterations to convergence: 2
#> Achieved convergence tolerance: 6.106e-14
```

The multi-start procedure is a modified version of the algorithm
described in Hickernell and Yuan (1997); see `?gsl_nls` for additional
details. If `start` contains missing (or infinite) values, the
multi-start algorithm is executed without fixed parameter ranges for the
missing parameters. More precisely, the ranges are initialized to the
unit interval and dynamically increased or decreased in each major
iteration of the multi-start algorithm.

``` r
gsl_nls(
fn = y ~ A * exp(-lam * x) + b, ## model formula
data = data.frame(x = x, y = y), ## model fit data
start = list(A = NA, lam = NA, b = NA) ## dynamic multi-start
)
#> Nonlinear regression model
#> model: y ~ A * exp(-lam * x) + b
#> data: data.frame(x = x, y = y)
#> A lam b
#> 4.893 1.417 1.010
#> residual sum-of-squares: 1.316
#>
#> Algorithm: multifit/levenberg-marquardt, (scaling: more, solver: qr)
#>
#> Number of iterations to convergence: 3
#> Achieved convergence tolerance: 0
```

**Remark**: the dynamic multi-start procedure is not guaranteed to
return a global minimum of the NLS objective. Especially when the
objective function contains many local optima, the multi-start algorithm
may be unable to select parameter ranges that include the global
minimizing solution. The search effort can be widened through the
`mstart_*` control parameters, as sketched below.
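
For instance, a harder problem might warrant more random starting points
and more major iterations; the specific values below are illustrative
assumptions, not tuned recommendations:

``` r
## widen the dynamic multi-start search (illustrative values)
gsl_nls(
  fn = y ~ A * exp(-lam * x) + b,
  data = data.frame(x = x, y = y),
  start = list(A = NA, lam = NA, b = NA),
  control = gsl_nls_control(mstart_n = 100, mstart_maxstart = 1000)
)
```
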
### Example 2: Gaussian function

#### Data
The following code generates $n = 50$ noisy observations
$y_1,\ldots,y_n$ from a Gaussian function with multiplicative
independent Gaussian noise according to the model:

$$
\left\{
\begin{aligned}
f_i & = a \cdot \exp\left(-\frac{(x_i - b)^2}{2c^2}\right), & i = 1,\ldots, n \\
y_i & = f_i \cdot \epsilon_i, & \epsilon_i \overset{\text{iid}}{\sim} N(1,\sigma^2)
\end{aligned}
\right.
$$

The parameters of the Gaussian model function are set to $a = 5$,
$b = 0.4$, $c = 0.15$, with noise standard deviation $\sigma = 0.1$.

``` r
set.seed(1)
n <- 50
x <- seq_len(n) / n
f <- function(a, b, c, x) a * exp(-(x - b)^2 / (2 * c^2))
y <- f(a = 5, b = 0.4, c = 0.15, x) * rnorm(n, mean = 1, sd = 0.1)
```

#### Model fit

Using the default
[Levenberg-Marquardt](https://www.gnu.org/software/gsl/doc/html/nls.html#levenberg-marquardt)
algorithm (without geodesic acceleration), the nonlinear Gaussian model
can be fitted with a call to `gsl_nls()` analogous to the previous
example. Here, the `trace` argument is enabled in order to print the
sum of squared residuals (`ssr`) and parameter estimates (`par`) at each
iteration of the algorithm.

``` r
## Levenberg-Marquardt (default)
ex2a_fit <- gsl_nls(
fn = y ~ a * exp(-(x - b)^2 / (2 * c^2)), ## model formula
data = data.frame(x = x, y = y), ## model fit data
start = c(a = 1, b = 0, c = 1), ## starting values
trace = TRUE ## verbose output
)
#> iter 1: ssr = 174.476, par = (2.1711, 3.31169, -4.12254)
#> iter 2: ssr = 171.29, par = (1.85555, 2.84851, -5.66642)
#> iter 3: ssr = 168.703, par = (1.93316, 2.34745, -6.33662)
#> iter 4: ssr = 167.546, par = (1.84665, 1.24131, -7.399)
#> iter 5: ssr = 166.799, par = (1.87948, 0.380488, -7.61723)
#> iter 6: ssr = 165.623, par = (1.90658, -2.00515, -7.19306)
#> iter 7: ssr = 163.944, par = (2.18675, -3.19622, -5.38804)
#> iter 8: ssr = 161.679, par = (2.41824, -3.1487, -5.03149)
#> iter 9: ssr = 159.955, par = (2.67372, -3.18703, -4.20169)
#> iter 10: ssr = 157.391, par = (3.1083, -2.84066, -3.10574)
#> iter 11: ssr = 153.131, par = (3.54614, -2.38071, -2.53824)
#> iter 12: ssr = 150.129, par = (4.02799, -1.97822, -1.96376)
#> iter 13: ssr = 146.735, par = (4.49887, -1.35879, -1.39554)
#> iter 14: ssr = 142.928, par = (3.14455, -0.573017, -1.08117)
#> iter 15: ssr = 124.562, par = (2.14743, 0.465351, -0.443335)
#> iter 16: ssr = 102.183, par = (3.74451, 0.263346, 0.353849)
#> iter 17: ssr = 35.4869, par = (3.49962, 0.437339, 0.18613)
#> iter 18: ssr = 9.24985, par = (4.41273, 0.381446, 0.16023)
#> iter 19: ssr = 3.13669, par = (4.94382, 0.40041, 0.150317)
#> iter 20: ssr = 2.7623, par = (5.11741, 0.397815, 0.14729)
#> iter 21: ssr = 2.75831, par = (5.13786, 0.397886, 0.146865)
#> iter 22: ssr = 2.7583, par = (5.1389, 0.397884, 0.146831)
#> iter 23: ssr = 2.7583, par = (5.13894, 0.397884, 0.146829)
#> iter 24: ssr = 2.7583, par = (5.13894, 0.397884, 0.146829)
#> iter 25: ssr = 2.7583, par = (5.13894, 0.397884, 0.146829)
#> iter 26: ssr = 2.7583, par = (5.13894, 0.397884, 0.146829)
#> *******************
#> summary from method 'multifit/levenberg-marquardt'
#> number of iterations: 26
#> reason for stopping: input domain error
#> initial ssr = 210.146
#> final ssr = 2.7583
#> ssr/dof = 0.0586872
#> ssr achieved tolerance = 8.88178e-16
#> function evaluations: 125
#> jacobian evaluations: 0
#> fvv evaluations: 0
#> status = success
#> *******************

ex2a_fit
#> Nonlinear regression model
#> model: y ~ a * exp(-(x - b)^2/(2 * c^2))
#> data: data.frame(x = x, y = y)
#> a b c
#> 5.1389 0.3979 0.1468
#> residual sum-of-squares: 2.758
#>
#> Algorithm: multifit/levenberg-marquardt, (scaling: more, solver: qr)
#>
#> Number of iterations to convergence: 26
#> Achieved convergence tolerance: 8.882e-16
```

#### Geodesic acceleration

The nonlinear model can also be fitted using the Levenberg-Marquardt
algorithm with [geodesic
acceleration](https://www.gnu.org/software/gsl/doc/html/nls.html#levenberg-marquardt-with-geodesic-acceleration)
by changing the default `algorithm = "lm"` to `algorithm = "lmaccel"`.

``` r
## Levenberg-Marquardt w/ geodesic acceleration
ex2b_fit <- gsl_nls(
fn = y ~ a * exp(-(x - b)^2 / (2 * c^2)), ## model formula
data = data.frame(x = x, y = y), ## model fit data
start = c(a = 1, b = 0, c = 1), ## starting values
algorithm = "lmaccel", ## algorithm
trace = TRUE ## verbose output
)
#> iter 1: ssr = 158.039, par = (1.58476, 0.502555, 0.511498)
#> iter 2: ssr = 126.469, par = (1.8444, 0.366374, 0.403898)
#> iter 3: ssr = 77.0115, par = (2.5025, 0.392374, 0.272717)
#> iter 4: ssr = 26.4036, par = (3.64063, 0.394564, 0.205619)
#> iter 5: ssr = 5.35492, par = (4.63506, 0.396865, 0.16366)
#> iter 6: ssr = 2.82529, par = (5.05578, 0.397909, 0.149333)
#> iter 7: ssr = 2.75877, par = (5.13246, 0.397896, 0.147057)
#> iter 8: ssr = 2.7583, par = (5.13862, 0.397885, 0.146843)
#> iter 9: ssr = 2.7583, par = (5.13893, 0.397884, 0.14683)
#> iter 10: ssr = 2.7583, par = (5.13894, 0.397884, 0.146829)
#> iter 11: ssr = 2.7583, par = (5.13894, 0.397884, 0.146829)
#> iter 12: ssr = 2.7583, par = (5.13894, 0.397884, 0.146829)
#> *******************
#> summary from method 'multifit/levenberg-marquardt+accel'
#> number of iterations: 12
#> reason for stopping: input domain error
#> initial ssr = 210.146
#> final ssr = 2.7583
#> ssr/dof = 0.0586872
#> ssr achieved tolerance = 3.24185e-14
#> function evaluations: 76
#> jacobian evaluations: 0
#> fvv evaluations: 0
#> status = success
#> *******************

ex2b_fit
#> Nonlinear regression model
#> model: y ~ a * exp(-(x - b)^2/(2 * c^2))
#> data: data.frame(x = x, y = y)
#> a b c
#> 5.1389 0.3979 0.1468
#> residual sum-of-squares: 2.758
#>
#> Algorithm: multifit/levenberg-marquardt+accel, (scaling: more, solver: qr)
#>
#> Number of iterations to convergence: 12
#> Achieved convergence tolerance: 3.242e-14
```

With geodesic acceleration enabled, the method converges after 12
iterations, whereas the method without geodesic acceleration required 26
iterations. This indicates that the nonlinear least-squares solver
benefits substantially from the geodesic acceleration correction.

##### Second directional derivative
By default, if the `fvv` argument is undefined, the second directional
derivative $D^2_v f$ used to calculate the geodesic acceleration
correction is approximated by forward (or centered) finite differences.
To use an analytic expression for $D^2_v f$, a function returning the
$n$-dimensional vector of second directional derivatives of the
nonlinear model can be passed to `fvv`. The first argument of the
function must be the vector of parameters of length $p$ and the second
argument must be the velocity vector, also of length $p$.

For the Gaussian model function, the matrix of second partial
derivatives, i.e. the Hessian, is given by:

$$
\boldsymbol{H}_{f_i} \ = \
\left[\begin{matrix}
\frac{\partial^2 f_i}{\partial a^2} & \frac{\partial^2 f_i}{\partial a \partial b} & \frac{\partial^2 f_i}{\partial a \partial c} \\
& \frac{\partial^2 f_i}{\partial b^2} & \frac{\partial^2 f_i}{\partial b \partial c} \\
& & \frac{\partial^2 f_i}{\partial c^2}
\end{matrix}\right] \ = \
\left[\begin{matrix}
0 & \frac{z_i}{c} e_i & \frac{z_i^2}{c} e_i \\
& -\frac{a}{c^2} (1 - z_i^2) e_i & -\frac{a}{c^2} z_i (2 - z_i^2) e_i \\
& & -\frac{a}{c^2} z_i^2 (3 - z_i^2) e_i
\end{matrix}\right]
$$

where the lower half of the Hessian matrix is omitted since it is
symmetric, and where we use the notation:

$$
\begin{aligned}
z_i & \ = \ \frac{x_i - b}{c} \\
e_i & \ = \ \exp\left(-\frac{1}{2}z_i^2 \right)
\end{aligned}
$$

Based on the Hessian matrix, the second directional derivative of $f_i$,
with $i = 1,\ldots,n$, becomes:

$$
\begin{aligned}
D_v^2 f_i & \ = \ \sum_{j,k} v_{\theta_j}v_{\theta_k} \frac{\partial^2 f_i}{\partial \theta_j \partial \theta_k} \\
& \ = \ v_a^2 \frac{\partial^2 f_i}{\partial a^2} + 2 v_av_b\frac{\partial^2 f_i}{\partial a \partial b} + 2v_av_c\frac{\partial^2 f_i}{\partial a \partial c} + v_b^2\frac{\partial^2 f_i}{\partial b^2} + 2v_bv_c\frac{\partial^2 f_i}{\partial b \partial c} + v_c^2\frac{\partial^2 f_i}{\partial c^2} \\
& \ = \ 2v_a v_b\frac{z_i}{c} e_i + 2v_av_c \frac{z_i^2}{c} e_i - v_b^2\frac{a}{c^2} (1 - z_i^2) e_i - 2v_bv_c \frac{a}{c^2} z_i (2 - z_i^2) e_i - v_c^2\frac{a}{c^2} z_i^2 (3 - z_i^2) e_i
\end{aligned}
$$

which can be encoded using `gsl_nls()` as follows:
``` r
## second directional derivative
fvv <- function(par, v, x) {
with(as.list(par), {
zi <- (x - b) / c
ei <- exp(-zi^2 / 2)
2 * v[["a"]] * v[["b"]] * zi / c * ei + 2 * v[["a"]] * v[["c"]] * zi^2 / c * ei -
v[["b"]]^2 * a / c^2 * (1 - zi^2) * ei - 2 * v[["b"]] * v[["c"]] * a / c^2 * zi * (2 - zi^2) * ei -
v[["c"]]^2 * a / c^2 * zi^2 * (3 - zi^2) * ei
})
}

## analytic fvv (1)
gsl_nls(
fn = y ~ a * exp(-(x - b)^2 / (2 * c^2)), ## model formula
data = data.frame(x = x, y = y), ## model fit data
start = c(a = 1, b = 0, c = 1), ## starting values
algorithm = "lmaccel", ## algorithm
trace = TRUE, ## verbose output
fvv = fvv, ## analytic function
x = x ## argument passed to fvv
)
#> iter 1: ssr = 158.14, par = (1.584, 0.502802, 0.512515)
#> iter 2: ssr = 126.579, par = (1.84317, 0.365984, 0.404378)
#> iter 3: ssr = 77.1032, par = (2.50015, 0.392468, 0.272532)
#> iter 4: ssr = 26.4382, par = (3.63788, 0.394722, 0.205505)
#> iter 5: ssr = 5.36705, par = (4.63344, 0.396903, 0.16368)
#> iter 6: ssr = 2.82578, par = (5.05546, 0.39791, 0.149341)
#> iter 7: ssr = 2.75877, par = (5.13243, 0.397896, 0.147057)
#> iter 8: ssr = 2.7583, par = (5.13862, 0.397885, 0.146843)
#> iter 9: ssr = 2.7583, par = (5.13893, 0.397884, 0.14683)
#> iter 10: ssr = 2.7583, par = (5.13894, 0.397884, 0.146829)
#> iter 11: ssr = 2.7583, par = (5.13894, 0.397884, 0.146829)
#> iter 12: ssr = 2.7583, par = (5.13894, 0.397884, 0.146829)
#> *******************
#> summary from method 'multifit/levenberg-marquardt+accel'
#> number of iterations: 12
#> reason for stopping: input domain error
#> initial ssr = 210.146
#> final ssr = 2.7583
#> ssr/dof = 0.0586872
#> ssr achieved tolerance = 3.28626e-14
#> function evaluations: 58
#> jacobian evaluations: 0
#> fvv evaluations: 18
#> status = success
#> *******************
#> Nonlinear regression model
#> model: y ~ a * exp(-(x - b)^2/(2 * c^2))
#> data: data.frame(x = x, y = y)
#> a b c
#> 5.1389 0.3979 0.1468
#> residual sum-of-squares: 2.758
#>
#> Algorithm: multifit/levenberg-marquardt+accel, (scaling: more, solver: qr)
#>
#> Number of iterations to convergence: 12
#> Achieved convergence tolerance: 3.286e-14
```

If the model formula `fn` can be derived with `stats::deriv()`, then the
analytic Hessian and second directional derivatives in `fvv` can be
computed automatically using symbolic differentiation. Analogous to the
`jac` argument, to evaluate `fvv` by means of symbolic differentiation,
set `fvv = TRUE`:

``` r
## analytic fvv (2)
gsl_nls(
fn = y ~ a * exp(-(x - b)^2 / (2 * c^2)), ## model formula
data = data.frame(x = x, y = y), ## model fit data
start = c(a = 1, b = 0, c = 1), ## starting values
algorithm = "lmaccel", ## algorithm
trace = TRUE, ## verbose output
fvv = TRUE ## automatic derivation
)
#> iter 1: ssr = 158.14, par = (1.584, 0.502802, 0.512515)
#> iter 2: ssr = 126.579, par = (1.84317, 0.365984, 0.404378)
#> iter 3: ssr = 77.1032, par = (2.50015, 0.392468, 0.272532)
#> iter 4: ssr = 26.4382, par = (3.63788, 0.394722, 0.205505)
#> iter 5: ssr = 5.36705, par = (4.63344, 0.396903, 0.16368)
#> iter 6: ssr = 2.82578, par = (5.05546, 0.39791, 0.149341)
#> iter 7: ssr = 2.75877, par = (5.13243, 0.397896, 0.147057)
#> iter 8: ssr = 2.7583, par = (5.13862, 0.397885, 0.146843)
#> iter 9: ssr = 2.7583, par = (5.13893, 0.397884, 0.14683)
#> iter 10: ssr = 2.7583, par = (5.13894, 0.397884, 0.146829)
#> iter 11: ssr = 2.7583, par = (5.13894, 0.397884, 0.146829)
#> iter 12: ssr = 2.7583, par = (5.13894, 0.397884, 0.146829)
#> *******************
#> summary from method 'multifit/levenberg-marquardt+accel'
#> number of iterations: 12
#> reason for stopping: input domain error
#> initial ssr = 210.146
#> final ssr = 2.7583
#> ssr/dof = 0.0586872
#> ssr achieved tolerance = 2.93099e-14
#> function evaluations: 58
#> jacobian evaluations: 0
#> fvv evaluations: 18
#> status = success
#> *******************
#> Nonlinear regression model
#> model: y ~ a * exp(-(x - b)^2/(2 * c^2))
#> data: data.frame(x = x, y = y)
#> a b c
#> 5.1389 0.3979 0.1468
#> residual sum-of-squares: 2.758
#>
#> Algorithm: multifit/levenberg-marquardt+accel, (scaling: more, solver: qr)
#>
#> Number of iterations to convergence: 12
#> Achieved convergence tolerance: 2.931e-14
```

### Example 3: Branin function

As a third example, we compare the available trust region methods by
minimizing the Branin test function, a common optimization test problem.
For the Branin test function, the following bivariate expression is
used:

$$
\left\{
\begin{aligned}
F(x_1, x_2) & \ = \ f_1(x_1, x_2)^2 + f_2(x_1, x_2)^2 \\
f_1(x_1, x_2) & \ = \ x_2 + a_1 x_1^2 + a_2 x_1 + a_3 \\
f_2(x_1, x_2) & \ = \ \sqrt{a_4 \cdot (1 + (1 - a_5) \cos(x_1))}
\end{aligned}
\right.
$$

with known constants $a_1 = -5.1/(4 \pi^2)$, $a_2 = 5/\pi$, $a_3 = -6$,
$a_4 = 10$, $a_5 = 1 / (8\pi)$, such that $F(x_1, x_2)$ has three local
minima in the range $(x_1, x_2) \in [-5, 15] \times [-5, 15]$.

The minimization problem can be solved with `gsl_nls()` by considering
the cost function $F(x_1, x_2)$ as a sum of squared residuals in which
$f_1(x_1, x_2)$ and $f_2(x_1, x_2)$ are two residuals relative to two
zero responses. Here, instead of passing a `formula` to `gsl_nls()`, the
nonlinear model is passed directly as a `function`. The first argument
of the function is the vector of parameters, i.e. $(x_1, x_2)$, and the
function returns the vector of model evaluations,
i.e. $(f_1(x_1, x_2), f_2(x_1, x_2))$. When passing a `function` instead
of a `formula` to `gsl_nls()`, the vector of observed responses should
be included in the `y` argument. In this example, `y` is set to a vector
of zeros. As starting values, we use $x_1 = 6$ and $x_2 = 14.5$,
equivalent to the example in the GSL reference manual.

``` r
## Branin model function
branin <- function(x) {
a <- c(-5.1 / (4 * pi^2), 5 / pi, -6, 10, 1 / (8 * pi))
f1 <- x[2] + a[1] * x[1]^2 + a[2] * x[1] + a[3]
f2 <- sqrt(a[4] * (1 + (1 - a[5]) * cos(x[1])))
c(f1, f2)
}

## Levenberg-Marquardt minimization
ex3_fit <- gsl_nls(
fn = branin, ## model function
y = c(0, 0), ## response vector
start = c(x1 = 6, x2 = 14.5), ## starting values
algorithm = "lm" ## algorithm
)

ex3_fit
#> Nonlinear regression model
#> model: y ~ fn(x)
#> x1 x2
#> -3.142 12.275
#> residual sum-of-squares: 0.3979
#>
#> Algorithm: multifit/levenberg-marquardt, (scaling: more, solver: qr)
#>
#> Number of iterations to convergence: 20
#> Achieved convergence tolerance: 0
```

**Note**: When using the `function` method of `gsl_nls()`, the returned
object no longer has class `"nls"`:

``` r
class(ex3_fit)
#> [1] "gsl_nls"
```

However, all generics `anova`, `coef`, `confint`, `deviance`,
`df.residual`, `fitted`, `formula`, `logLik`, `predict`, `print`,
`residuals`, `summary`, `vcov` and `weights` remain applicable to
objects with (only) class `"gsl_nls"`.

#### Method comparisons
Solving the same minimization problem with all available trust region
methods, i.e. `algorithm` set to `"lm"`, `"lmaccel"`, `"dogleg"`,
`"ddogleg"` and `"subspace2D"` respectively, and tracing the parameter
values at each iteration, we can visualize the minimization paths
followed by each method.
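
The tracing and plotting code itself is not shown in this README; a
minimal sketch that refits the problem with each method and compares
only the converged coefficients could look as follows:

``` r
## refit the Branin problem with each trust region method
algos <- c("lm", "lmaccel", "dogleg", "ddogleg", "subspace2D")
fits <- lapply(algos, function(algo) {
  gsl_nls(fn = branin, y = c(0, 0), start = c(x1 = 6, x2 = 14.5), algorithm = algo)
})

## converged (x1, x2) per method
ests <- sapply(fits, coef)
colnames(ests) <- algos
ests
```
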
Analogous to the
[example](https://www.gnu.org/software/gsl/doc/html/nls.html#comparing-trs-methods-example)
in the GSL reference manual, the standard Levenberg-Marquardt method
without geodesic acceleration converges to the minimum at
$(-\pi, 12.275)$, whereas all other methods converge to the minimum at
$(\pi, 2.275)$.

### Example 4: Large NLS example
To illustrate the use of `gsl_nls_large()`, we reproduce the large
nonlinear least-squares example from the GSL reference manual. The
nonlinear least-squares model is defined as:

$$
\left\{
\begin{aligned}
f_i & \ = \ \sqrt{\alpha}(\theta_i - 1), \quad i = 1,\ldots,p \\
f_{p + 1} & \ = \ \Vert \boldsymbol{\theta} \Vert^2 - \frac{1}{4}
\end{aligned}
\right.
$$

with given constant
$\alpha = 10^{-5}$ and unknown parameters
$\theta_1,\ldots, \theta_p$. The residual $f_{p + 1}$ adds an
$L_2$-regularization constraint on the parameter vector and makes the
model nonlinear. The $((p + 1) \times p)$-dimensional Jacobian matrix is
given by:

$$
\boldsymbol{J}(\boldsymbol{\theta}) \ = \
\left[ \begin{matrix}
\frac{\partial f_1}{\partial \theta_1} & \ldots & \frac{\partial f_1}{\partial \theta_p} \\
\vdots & \ddots & \vdots \\
\frac{\partial f_{p+1}}{\partial \theta_1} & \ldots & \frac{\partial f_{p+1}}{\partial \theta_p}
\end{matrix} \right] \ = \
\left[ \begin{matrix}
\sqrt{\alpha} \boldsymbol{I}_{p \times p} \\
2 \boldsymbol{\theta}'
\end{matrix} \right]
$$

with $\boldsymbol{I}_{p \times p}$ the $(p \times p)$-dimensional
identity matrix.

The model residuals and Jacobian matrix can be written as a function of
the parameter vector $\boldsymbol{\theta}$ as:

``` r
## model and jacobian
f <- function(theta) {
val <- c(sqrt(1e-5) * (theta - 1), sum(theta^2) - 0.25)
attr(val, "gradient") <- rbind(diag(sqrt(1e-5), nrow = length(theta)), 2 * t(theta))
return(val)
}
```

Here, the Jacobian is returned in the `"gradient"` attribute of the
evaluated vector (as in a `selfStart` model), from which it is detected
automatically by `gsl_nls()` or `gsl_nls_large()`.

First, the least-squares objective is minimized with a call to
`gsl_nls()` analogous to the previous example by passing the nonlinear
model as a `function` and setting the response vector `y` to a vector of
zeros. The number of parameters is set to $p = 500$ and as starting
values we use $\theta_1 = 1, \ldots, \theta_p = p$, equivalent to the
example in the GSL reference manual.

``` r
## number of parameters
p <- 500

## standard Levenberg-Marquardt
system.time({
ex4_fit_lm <- gsl_nls(
fn = f,
y = rep(0, p + 1),
start = 1:p,
control = list(maxiter = 500)
)
})
#> user system elapsed
#> 33.720 0.084 33.812

cat("Residual sum-of-squares:", deviance(ex4_fit_lm), "\n")
#> Residual sum-of-squares: 0.004778845
```

Second, the same model is fitted with a call to `gsl_nls_large()` using
the Steihaug-Toint Conjugate Gradient algorithm, which results in a much
smaller runtime:

``` r
## large-scale Steihaug-Toint
system.time({
ex4_fit_cgst <- gsl_nls_large(
fn = f,
y = rep(0, p + 1),
start = 1:p,
algorithm = "cgst",
control = list(maxiter = 500)
)
})
#> user system elapsed
#> 1.214 0.312 1.527

cat("Residual sum-of-squares:", deviance(ex4_fit_cgst), "\n")
#> Residual sum-of-squares: 0.004778845
```

#### Sparse Jacobian matrix

The Jacobian matrix $\boldsymbol{J}(\boldsymbol{\theta})$ is very
sparse in the sense that it contains only a small number of nonzero
entries. The `gsl_nls_large()` function also accepts the calculated
Jacobian as a sparse matrix of class `"dgCMatrix"`, `"dgRMatrix"` or
`"dgTMatrix"` (see the
[Matrix](https://cran.r-project.org/web/packages/Matrix/Matrix.pdf)
package). The following updated model function returns the sparse
Jacobian as a `"dgCMatrix"` instead of a dense numeric matrix:

``` r
## model and sparse Jacobian
fsp <- function(theta) {
val <- c(sqrt(1e-5) * (theta - 1), sum(theta^2) - 0.25)
attr(val, "gradient") <- rbind(Matrix::Diagonal(x = sqrt(1e-5), n = length(theta)), 2 * t(theta))
return(val)
}
```

As illustrated by the benchmarks below, besides a slight improvement in
runtimes, the required amount of memory is significantly smaller for the
model functions returning a sparse Jacobian than for the model functions
returning a dense Jacobian:

``` r
## computation times and allocated memory
bench::mark(
"Dense LM" = gsl_nls_large(fn = f, y = rep(0, p + 1), start = 1:p, algorithm = "lm", control = list(maxiter = 500)),
"Dense CGST" = gsl_nls_large(fn = f, y = rep(0, p + 1), start = 1:p, algorithm = "cgst"),
"Sparse LM" = gsl_nls_large(fn = fsp, y = rep(0, p + 1), start = 1:p, algorithm = "lm", control = list(maxiter = 500)),
"Sparse CGST" = gsl_nls_large(fn = fsp, y = rep(0, p + 1), start = 1:p, algorithm = "cgst"),
check = FALSE,
min_iterations = 5
)
#> # A tibble: 4 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>   <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 Dense LM 6.33s 6.39s 0.157 1.8GB 6.61
#> 2 Dense CGST 1.29s 1.36s 0.726 1.02GB 16.4
#> 3 Sparse LM 4.87s 4.91s 0.203 33.19MB 0.122
#> 4 Sparse CGST 166.09ms 176.09ms 4.66 23.04MB 3.73
```

## NLS test problems

The R-package currently contains a collection of 59 NLS test problems
originating primarily from the [NIST Statistical Reference Datasets
(StRD)](https://www.itl.nist.gov/div898/strd/nls/nls_main.shtml)
archive; Bates and Watts (1988); and Moré, Garbow, and Hillstrom (1981).

``` r
## available test problems
nls_test_list()
#> name class p n check
#> 1 Misra1a formula 2 14 p, n fixed
#> 2 Chwirut2 formula 3 54 p, n fixed
#> 3 Chwirut1 formula 3 214 p, n fixed
#> 4 Lanczos3 formula 6 24 p, n fixed
#> 5 Gauss1 formula 8 250 p, n fixed
#> 6 Gauss2 formula 8 250 p, n fixed
#> 7 DanWood formula 2 6 p, n fixed
#> 8 Misra1b formula 2 14 p, n fixed
#> 9 Kirby2 formula 5 151 p, n fixed
#> 10 Hahn1 formula 7 236 p, n fixed
#> 11 Nelson formula 3 128 p, n fixed
#> 12 MGH17 formula 5 33 p, n fixed
#> 13 Lanczos1 formula 6 24 p, n fixed
#> 14 Lanczos2 formula 6 24 p, n fixed
#> 15 Gauss3 formula 8 250 p, n fixed
#> 16 Misra1c formula 2 14 p, n fixed
#> 17 Misra1d formula 2 14 p, n fixed
#> 18 Roszman1 formula 4 25 p, n fixed
#> 19 ENSO formula 9 168 p, n fixed
#> 20 MGH09 formula 4 11 p, n fixed
#> 21 Thurber formula 7 37 p, n fixed
#> 22 BoxBOD formula 2 6 p, n fixed
#> 23 Ratkowsky2 formula 3 9 p, n fixed
#> 24 MGH10 formula 3 16 p, n fixed
#> 25 Eckerle4 formula 3 35 p, n fixed
#> 26 Ratkowsky3 formula 4 15 p, n fixed
#> 27 Bennett5 formula 3 154 p, n fixed
#> 28 Isomerization formula 4 24 p, n fixed
#> 29 Lubricant formula 9 53 p, n fixed
#> 30 Sulfisoxazole formula 4 12 p, n fixed
#> 31 Leaves formula 4 15 p, n fixed
#> 32 Chloride formula 3 54 p, n fixed
#> 33 Tetracycline formula 4 9 p, n fixed
#> 34 Linear, full rank function 5 10 p <= n free
#> 35 Linear, rank 1 function 5 10 p <= n free
#> 36 Linear, rank 1, zero columns and rows function 5 10 p <= n free
#> 37 Rosenbrock function 2 2 p, n fixed
#> 38 Helical valley function 3 3 p, n fixed
#> 39 Powell singular function 4 4 p, n fixed
#> 40 Freudenstein/Roth function 2 2 p, n fixed
#> 41 Bard function 3 15 p, n fixed
#> 42 Kowalik and Osborne function 4 11 p, n fixed
#> 43 Meyer function 3 16 p, n fixed
#> 44 Watson function 6 31 p, n fixed
#> 45 Box 3-dimensional function 3 10 p <= n free
#> 46 Jennrich and Sampson function 2 10 p <= n free
#> 47 Brown and Dennis function 4 20 p <= n free
#> 48 Chebyquad function 9 9 p <= n free
#> 49 Brown almost-linear function 10 10 p == n free
#> 50 Osborne 1 function 5 33 p, n fixed
#> 51 Osborne 2 function 11 65 p, n fixed
#> 52 Hanson 1 function 2 16 p, n fixed
#> 53 Hanson 2 function 3 16 p, n fixed
#> 54 McKeown 1 function 2 3 p, n fixed
#> 55 McKeown 2 function 3 4 p, n fixed
#> 56 McKeown 3 function 5 10 p, n fixed
#> 57 Devilliers and Glasser 1 function 4 24 p, n fixed
#> 58 Devilliers and Glasser 2 function 5 16 p, n fixed
#> 59 Madsen example function 2 3 p, n fixed
```

The function `nls_test_problem()` fetches the model definition and model
data required to solve a specific NLS test problem with `gsl_nls()` (or
`nls()` if the model is defined as a `formula`). This also returns the
vector of certified target values corresponding to the *best-available*
solutions and a vector of suggested starting values for the parameters:

``` r
## example regression problem
(ratkowsky2 <- nls_test_problem(name = "Ratkowsky2"))
#> $data
#> y x
#> 1 8.93 9
#> 2 10.80 14
#> 3 18.59 21
#> 4 22.33 28
#> 5 39.35 42
#> 6 56.11 57
#> 7 61.73 63
#> 8 64.62 70
#> 9 67.08 79
#>
#> $fn
#> y ~ b1/(1 + exp(b2 - b3 * x))
#>
#>
#> $start
#> b1 b2 b3
#> 100.0 1.0 0.1
#>
#> $target
#> b1 b2 b3
#> 72.4622376 2.6180768 0.0673592
#>
#> attr(,"class")
#> [1] "nls_test_formula"

with(ratkowsky2,
gsl_nls(
fn = fn,
data = data,
start = start
)
)
#> Nonlinear regression model
#> model: y ~ b1/(1 + exp(b2 - b3 * x))
#> data: data
#> b1 b2 b3
#> 72.46224 2.61808 0.06736
#> residual sum-of-squares: 8.057
#>
#> Algorithm: multifit/levenberg-marquardt, (scaling: more, solver: qr)
#>
#> Number of iterations to convergence: 10
#> Achieved convergence tolerance: 4.619e-14

## example optimization problem
madsen <- nls_test_problem(name = "Madsen")
with(madsen,
gsl_nls(
fn = fn,
y = y,
start = start,
jac = jac
)
)
#> Nonlinear regression model
#> model: y ~ fn(x)
#> x1 x2
#> -0.1554 0.6946
#> residual sum-of-squares: 0.7732
#>
#> Algorithm: multifit/levenberg-marquardt, (scaling: more, solver: qr)
#>
#> Number of iterations to convergence: 42
#> Achieved convergence tolerance: 1.11e-16
```

## Other R-packages

Other CRAN R-packages interfacing with GSL that served as inspiration
for this package include:

- [RcppGSL](https://cran.r-project.org/web/packages/RcppGSL/index.html)
  by Dirk Eddelbuettel and Romain Francois
- [GSL](https://cran.r-project.org/web/packages/gsl/index.html) by Robin
  Hankin and others
- [RcppZiggurat](https://cran.r-project.org/web/packages/RcppZiggurat/index.html)
  by Dirk Eddelbuettel

# References
Bates, D.M., and D.G. Watts. 1988. *Nonlinear Regression Analysis and
Its Applications*. ISBN 0471816434.

Galassi, M., J. Davies, J. Theiler, B. Gough, G. Jungman, M. Booth, and
F. Rossi. 2009. *GNU Scientific Library Reference Manual (3rd Ed.)*.
ISBN 0954612078.

Hickernell, F.J., and Y. Yuan. 1997. “A Simple Multistart Algorithm for
Global Optimization.” *OR Transactions* 1(2).

Moré, J.J., B.S. Garbow, and K.E. Hillstrom. 1981. “Testing
Unconstrained Optimization Software.” *ACM Transactions on Mathematical
Software (TOMS)* 7(1).