https://github.com/queelius/flexhaz

Flexible hazard rate distributions for survival analysis and reliability engineering in R

---
output:
  github_document:
    toc: true
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
```

# flexhaz

[![CRAN status](https://www.r-pkg.org/badges/version/flexhaz)](https://CRAN.R-project.org/package=flexhaz)
[![R-CMD-check](https://github.com/queelius/flexhaz/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/queelius/flexhaz/actions/workflows/R-CMD-check.yaml)

**Dynamic Failure Rate Distributions for Survival Analysis**

Capacitors that wear out faster than any Weibull can describe. Software systems
with bathtub-shaped crash rates. Post-surgical patients whose risk drops sharply,
then slowly climbs again. Standard parametric survival families cannot express
these hazard patterns — but `flexhaz` can.

**Write the hazard function you need — any R function of time and parameters —
and the package derives everything else**: survival curves, CDFs, densities,
quantiles, sampling, log-likelihoods, MLE fitting, and residual diagnostics.

## Why flexhaz?

| Feature | flexhaz | survival | flexsurv |
|---------|----------|----------|----------|
| Custom hazard functions | **Yes** | No | Limited |
| Built-in distributions | Exp, Weibull, Gompertz, Log-logistic | Weibull, Exp, log-normal, log-logistic, and others (`survreg`) | Many |
| User-supplied derivatives | **score + Hessian** | No | No |
| Censoring support | Right + Left | Right, left, interval | Right, left, interval |
| Model diagnostics | Cox-Snell, Martingale, Q-Q | Limited | Limited |
| Likelihood model interface | **Full** | Basic | Partial |

## Features

- **Flexible hazard specification**: Define any hazard function `h(t, par, ...)`
- **Built-in distributions**: Exponential, Weibull, Gompertz, Log-logistic with optimized implementations
- **Complete distribution interface**: hazard, survival, CDF, PDF, quantiles, sampling
- **Likelihood model support**: Log-likelihood, score, Hessian for MLE
- **Custom derivatives**: Supply analytical score and Hessian functions, or let the package fall back to numerical differentiation via `numDeriv`
- **Model diagnostics**: Residuals (Cox-Snell, Martingale) and Q-Q plots
- **Censoring support**: Handle exact, right-censored, and left-censored survival data
- **Ecosystem integration**: Works with `algebraic.dist`, `likelihood.model`, `algebraic.mle`
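
To make the custom-derivatives item concrete, here is a minimal base-R sketch of what an analytical score and Hessian look like for a right-censored exponential model, checked against central finite differences. The finite differences are only a stand-in for numerical differentiation, and the names `ll_fun`, `score_analytic`, and `hess_analytic` are illustrative. How such functions are passed to the package is not shown here.

```r
# Toy data: 1 = exact failure time, 0 = right-censored
t_obs <- c(1, 2, 3, 4, 5)
delta <- c(1, 1, 0, 1, 0)

# Exponential log-likelihood and its analytical derivatives in lambda
ll_fun <- function(lambda) sum(delta) * log(lambda) - lambda * sum(t_obs)
score_analytic <- function(lambda) sum(delta) / lambda - sum(t_obs)
hess_analytic <- function(lambda) -sum(delta) / lambda^2

# Central finite differences at lambda = 0.5
lambda <- 0.5
eps <- 1e-4
score_fd <- (ll_fun(lambda + eps) - ll_fun(lambda - eps)) / (2 * eps)
hess_fd <- (ll_fun(lambda + eps) - 2 * ll_fun(lambda) + ll_fun(lambda - eps)) / eps^2

c(score_analytic = score_analytic(lambda), score_fd = score_fd,
  hess_analytic = hess_analytic(lambda), hess_fd = hess_fd)
```

Analytical derivatives avoid the step-size trade-off visible in `eps`, which is the usual reason to supply them when they are available.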

## Installation

Install from CRAN:

```r
install.packages("flexhaz")
```

Or the development version from r-universe:

```r
install.packages("flexhaz", repos = "https://queelius.r-universe.dev")
```

## Quick Start

```{r setup, message=FALSE, warning=FALSE}
library(flexhaz)
```

### Built-in Distributions

Use the convenient constructors for classic survival distributions:

```{r builtin}
# Exponential: constant hazard (memoryless)
exp_dist <- dfr_exponential(lambda = 0.5)

# Weibull: power-law hazard (wear-out or infant mortality)
weib_dist <- dfr_weibull(shape = 2, scale = 3)

# Gompertz: exponentially increasing hazard (aging)
gomp_dist <- dfr_gompertz(a = 0.01, b = 0.1)

# Log-logistic: non-monotonic hazard (increases then decreases)
ll_dist <- dfr_loglogistic(alpha = 10, beta = 2)
```

All distribution functions are automatically available:

```{r methods}
S <- surv(exp_dist)
S(2) # Survival probability at t=2

h <- hazard(weib_dist)
h(1) # Hazard at t=1
```
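
As a sanity check, the two values above can be reproduced in base R. This sketch assumes that `dfr_exponential()` is parameterized by the rate and that `dfr_weibull()` uses the same shape/scale convention as `stats::dweibull()`. Neither assumption is confirmed here, so treat the comparison as illustrative.

```r
# Assumed parameterizations (not confirmed against the flexhaz source):
#   exponential: h(t) = lambda
#   Weibull:     h(t) = (shape / scale) * (t / scale)^(shape - 1)
lambda <- 0.5
shape <- 2
scale <- 3

# Survival of the exponential at t = 2
S_exp <- pexp(2, rate = lambda, lower.tail = FALSE)

# Weibull hazard at t = 1, written two equivalent ways
h_closed <- (shape / scale) * (1 / scale)^(shape - 1)
h_ratio <- dweibull(1, shape, scale) /
  pweibull(1, shape, scale, lower.tail = FALSE)

c(S_exp = S_exp, h_closed = h_closed, h_ratio = h_ratio)
```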

### Maximum Likelihood Estimation

```{r mle}
# Simulate failure times
set.seed(42)
times <- rexp(50, rate = 1)
df <- data.frame(t = times, delta = 1)

# Fit via MLE
solver <- fit(dfr_exponential())
result <- solver(df, par = c(0.5), method = "BFGS")
coef(result) # Estimated rate
```
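
For the exponential model the MLE has a closed form, which gives a quick cross-check on any optimizer output: the estimated rate is the number of observed failures divided by the total observed time. The sketch below uses only base R and re-creates the simulated data. `rate_hat` and `negll` are illustrative names, and this is not a description of how `fit()` computes its estimate.

```r
set.seed(42)
times <- rexp(50, rate = 1)
delta <- rep(1, length(times))  # every failure time is observed exactly

# Closed-form MLE: number of events / total time at risk
rate_hat <- sum(delta) / sum(times)

# Same estimate from a one-dimensional search over the negative log-likelihood
negll <- function(lambda) -(sum(delta) * log(lambda) - lambda * sum(times))
rate_num <- optimize(negll, interval = c(1e-6, 10))$minimum

c(closed_form = rate_hat, numerical = rate_num)
```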

### Custom Hazard Functions

Model complex failure patterns like bathtub curves:

```{r bathtub, fig.height=4}
# h(t) = a*exp(-b*t) + c + d*t^k
# Infant mortality + useful life + wear-out
bathtub <- dfr_dist(
  rate = function(t, par, ...) {
    par[1] * exp(-par[2] * t) + par[3] + par[4] * t^par[5]
  },
  par = c(a = 1, b = 2, c = 0.02, d = 0.001, k = 2)
)

h <- hazard(bathtub)
curve(sapply(x, h), 0, 15, xlab = "Time", ylab = "Hazard rate",
      main = "Bathtub hazard curve")
```
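
This particular bathtub hazard integrates in closed form, so the cumulative hazard and survival that follow from it can be checked by hand. The base-R sketch below is an independent check, not the package's implementation. `p`, `h_bathtub`, `H_closed`, and `H_numeric` are illustrative names.

```r
p <- c(a = 1, b = 2, c = 0.02, d = 0.001, k = 2)

# Same hazard as above: a*exp(-b*t) + c + d*t^k
h_bathtub <- function(t) {
  p[["a"]] * exp(-p[["b"]] * t) + p[["c"]] + p[["d"]] * t^p[["k"]]
}

# Closed form: H(t) = (a/b) * (1 - exp(-b*t)) + c*t + d*t^(k+1) / (k+1)
H_closed <- function(t) {
  (p[["a"]] / p[["b"]]) * (1 - exp(-p[["b"]] * t)) +
    p[["c"]] * t + p[["d"]] * t^(p[["k"]] + 1) / (p[["k"]] + 1)
}

# Numerical integration of the hazard from 0 to t
H_numeric <- function(t) integrate(h_bathtub, lower = 0, upper = t)$value

t0 <- 5
c(closed = H_closed(t0), numeric = H_numeric(t0),
  survival = exp(-H_closed(t0)))
```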

### Model Diagnostics

Check model fit with residual analysis:

```{r diagnostics}
# Fit exponential to data
fitted_exp <- dfr_exponential(lambda = coef(result))

# Cox-Snell residuals Q-Q plot
qqplot_residuals(fitted_exp, df)
```
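
Cox-Snell residuals are the fitted cumulative hazard evaluated at each observed time, $r_i = \hat{H}(t_i)$. If the model is adequate, they behave like a sample from a unit-rate exponential, which is what the Q-Q plot assesses. The base-R sketch below covers the fully observed exponential case. It is independent of `qqplot_residuals()`, whose internals are not shown here, and the variable names are illustrative.

```r
set.seed(42)
times <- rexp(50, rate = 1)
rate_hat <- length(times) / sum(times)  # exponential MLE without censoring

# Cox-Snell residuals: H(t) = rate * t for the exponential model
cs <- sort(rate_hat * times)

# Theoretical Exp(1) quantiles at standard plotting positions
theo <- qexp(ppoints(length(cs)))

plot(theo, cs, xlab = "Exp(1) quantiles", ylab = "Cox-Snell residuals")
abline(0, 1, lty = 2)  # points near this line suggest an adequate fit
```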

## Mathematical Background

For a lifetime $T$ with density $f(t)$ and survival function $S(t) = P(T > t)$, the hazard function is:
$$h(t) = \frac{f(t)}{S(t)}$$

From the hazard, all other quantities follow:

| Function | Formula | Method |
|----------|---------|--------|
| Cumulative hazard | $H(t) = \int_0^t h(u) du$ | `cum_haz()` |
| Survival | $S(t) = e^{-H(t)}$ | `surv()` |
| CDF | $F(t) = 1 - S(t)$ | `cdf()` |
| PDF | $f(t) = h(t) \cdot S(t)$ | `density()` |
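
The chain in the table can be traced numerically from a hazard alone. The base-R sketch below does this for a Weibull hazard and compares the results with `stats::pweibull()` and `stats::qweibull()`. It also inverts $H$ to obtain a quantile, since $F(t) = p$ is equivalent to $H(t) = -\log(1 - p)$. The function names are illustrative and do not mirror the package's internals.

```r
shape <- 2
scale <- 3
h_fun <- function(t) (shape / scale) * (t / scale)^(shape - 1)

H_fun <- function(t) integrate(h_fun, 0, t)$value  # cumulative hazard
S_fun <- function(t) exp(-H_fun(t))                # survival
F_fun <- function(t) 1 - S_fun(t)                  # CDF
f_fun <- function(t) h_fun(t) * S_fun(t)           # density

# Quantile: solve H(t) = -log(1 - p) for t by root finding
quant <- function(p) {
  uniroot(function(t) H_fun(t) + log(1 - p),
          interval = c(1e-8, 100), tol = 1e-9)$root
}

c(S = S_fun(2), S_ref = pweibull(2, shape, scale, lower.tail = FALSE),
  f = f_fun(2), f_ref = dweibull(2, shape, scale),
  median = quant(0.5), median_ref = qweibull(0.5, shape, scale))
```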

## Likelihood for Survival Data

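For an exact observation at time $t$: $\log L = \log f(t) = \log h(t) - H(t)$

For an observation right-censored at $t$: $\log L = \log S(t) = -H(t)$

For an observation left-censored at $t$: $\log L = \log F(t) = \log\left(1 - e^{-H(t)}\right)$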

```{r likelihood}
# Mixed data with censoring
df <- data.frame(
  t = c(1, 2, 3, 4, 5),
  delta = c(1, 1, 0, 1, 0) # 1 = exact, 0 = right-censored
)

ll <- loglik(dfr_exponential())
ll(df, par = c(0.5))
```
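
For the exponential model, where $h(t) = \lambda$ and $H(t) = \lambda t$, the exact and right-censored contributions collapse into a one-line formula. The base-R sketch below evaluates it for the five observations above, assuming `dfr_exponential()` is parameterized by the rate. `d` and `ll_manual` are illustrative names.

```r
d <- data.frame(t = c(1, 2, 3, 4, 5), delta = c(1, 1, 0, 1, 0))
lambda <- 0.5

# Exact observations contribute log(lambda) - lambda * t;
# right-censored observations contribute only -lambda * t.
ll_manual <- sum(d$delta * log(lambda) - lambda * d$t)
ll_manual
# Equals 3 * log(0.5) - 0.5 * 15, about -9.579
```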

## Documentation

**Start Here:**

- [Package Overview & Quick Start](https://queelius.github.io/flexhaz/articles/flexhaz-package.html) - Motivation, complete example, and quick start guide

**Real-World Applications:**

- [Reliability Engineering](https://queelius.github.io/flexhaz/articles/reliability_engineering.html) - Five case studies

**Going Deeper:**

- [Dynamic Failure Rate Distributions](https://queelius.github.io/flexhaz/articles/failure_rate.html) - Mathematical foundations
- [Creating Custom Distributions](https://queelius.github.io/flexhaz/articles/custom_distributions.html) - The three-level optimization paradigm
- [Custom Derivatives for MLE](https://queelius.github.io/flexhaz/articles/custom_derivatives.html) - Analytical score and Hessian functions

**Reference:**

- [Function Reference](https://queelius.github.io/flexhaz/reference/)

## Related Packages

- [`algebraic.dist`](https://github.com/queelius/algebraic.dist): Generic distribution interface
- [`likelihood.model`](https://github.com/queelius/likelihood.model): Likelihood model framework
- [`algebraic.mle`](https://github.com/queelius/algebraic.mle): MLE utilities