Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/kurtlawrence/fitme
CLI curve fitting tool. Parameterise an equation from a CSV dataset.
https://github.com/kurtlawrence/fitme
cli curve-fitting regression
Last synced: about 2 months ago
JSON representation
CLI curve fitting tool. Parameterise an equation from a CSV dataset.
- Host: GitHub
- URL: https://github.com/kurtlawrence/fitme
- Owner: kurtlawrence
- License: mit
- Created: 2023-01-18T00:39:32.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2023-09-30T02:34:30.000Z (over 1 year ago)
- Last Synced: 2024-08-09T02:36:53.614Z (5 months ago)
- Topics: cli, curve-fitting, regression
- Language: Rust
- Homepage: https://docs.rs/fitme
- Size: 112 KB
- Stars: 2
- Watchers: 2
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- Funding: .github/FUNDING.yml
Awesome Lists containing this project
README
# fitme
CLI curve fitting tool. Parameterise an equation from a CSV dataset.> `fitme` is primarily a CLI tool, and this README details the CLI use.
>
> If one is wanting to use `fitme` as a library, please see the [API docs](https://docs.rs/fitme).---
```plaintext
> fitme y "m * x + c" tests/file1.csv
──────────────────────────────────────────────
Parameter Value Standard Error t-value
══════════════════════════════════════════════
c 3.209 0.013 230.3
──────────────────────────────────────────────
m 1.770 0.011 149.0
──────────────────────────────────────────────
Number of observations: 10.0
Root Mean Squared Residual error: 0.043
R-sq Adjusted: 0.999
```# Features
- simple interface
- fast
- flexible equations
- helpful error messages# Installation
Currently, only installation from source is supported:
```plaintext
# using crates.io
cargo install fitme# using github
cargo install --git https://github.com/kurtlawrence/fitme
```# Usage
- `fitme --help` for detailed help.
`fitme` requires just two arguments, the target column to fit against, and the mathematical
expression. The third optional argument specifies the file to read the CSV from.
`fitme` uses a least-squares fitting approach.Let's fit a linear regression to the following data:
| y | x |
| --- | --- |
| 1.9000429E-01 | -1.7237128E+00 |
| 6.5807428E+00 | 1.8712276E+00 |
| 1.4582725E+00 | -9.6608055E-01 |
| 2.7270851E+00 | -2.8394297E-01 |
| 5.5969253E+00 | 1.3416969E+00 |
| 5.6249280E+00 | 1.3757038E+00 |
| 0.787615 | -1.3703436E+00 |
| 3.2599759E+00 | 4.2581975E-02 |
| 2.9771762E+00 | -1.4970151E-01 |
| 4.5936475E+00 | 8.2065094E-01 |Equation: `y = m * x + c`
Here the:
- _target_: `y`
- _variables_: `x`
- _parameters_: `m`, `c`To run a fit, simply use `fitme y "m * x + c" test-file.csv`:
```plaintext
> fitme y "m * x + c" test-file.csv
──────────────────────────────────────────────
Parameter Value Standard Error t-value
══════════════════════════════════════════════
c 3.209 0.013 230.3
──────────────────────────────────────────────
m 1.770 0.011 149.0
──────────────────────────────────────────────
Number of observations: 10.0
Root Mean Squared Residual error: 0.043
R-sq Adjusted: 0.999
```Notice that `fitme` will automatically match column names in the equation, binding them as
**variables**. Unmatched variables become **parameters**.## Multi Parameters
`fitme` is useful for fitting multiple least squares linear regressions:
```plaintext
> fitme sepalLength "a * petalLength + b * sepalWidth + c * petalWidth + d" iris.csv
───────────────────────────────────────────────
Parameter Value Standard Error t-value
═══════════════════════════════════════════════
a 0.711 0.056 12.51
───────────────────────────────────────────────
b 0.654 0.066 9.788
───────────────────────────────────────────────
c -0.562 0.127 -4.410
───────────────────────────────────────────────
d 1.845 0.251 7.342
───────────────────────────────────────────────
Number of observations: 150.0
Root Mean Squared Residual error: 0.314
R-sq Adjusted: 0.855
```## Flexible Output
Alter the output via the `--out` switch.
### CSV
```plaintext
> fitme y "m * x + c" file1.csv -o=csv -n
Parameter,Value,Standard Error,t-value
m,1.7709542029456211,0.011883297834310212,149.02884936809457
c,3.2099657167997013,0.013936863525869892,230.32195951702457
```### Markdown
```plaintext
> fitme y "m * x + c" file1.csv -o=md -n
| Parameter | Value | Standard Error | t-value |
|-----------|-------|----------------|---------|
| c | 3.209 | 0.013 | 230.3 |
| m | 1.770 | 0.011 | 149.0 |
```### JSON
```plaintext
> fitme y "m * x + c" file1.csv -o=json -n
{"parameter_names":["m","c"],"parameter_values":[1.7709542029456211,3.2099657167997013],"n":10,"xerrs":[0.011883297834310212,0.013936863525869892],"rmsr":0.04392493014188053,"rsq":0.9995948974725735,"tvals":[149.02884936809457,230.32195951702457]}
```### + more!
# Mathematical Expressions
- `+,-,*,/`
- `%`: remainder
- `^`: power
- `pi, e`
- `sqrt(), abs()`
- `exp(), ln(), log()`
- `sin(), cos(), tan()`
- `sinh(), cosh(), tanh()`
- `floor(), ceil(), round()`🔬 If you need more math support, please [raise an issue](https://github.com/kurtlawrence/fitme/issues).
# Example Equations
## Linear
- Equation: `y = Ax + B`
- Columns: `y, x`
- Parameters: `A, B````bash
> fitme y "Ax + B"
```## Multiple Linear Regression
- Equation: `y = P0 * x0 + P1 * x1 + ... + Pn * xn + C`
- Columns: `y, x0, x1, ... , xn`
- Parameters: `P0, P1, ... , Pn, C````bash
> fitme y "P0 * x0 + P1 * x1 + ... + Pn * xn + C"
```## Normal Distribution
The goal is to fit to a CDF, so the input CSV will have _P_ as the probability [0,1], and
_x_ as the variable.$$P = {1\over2} \bigg\lbrack {1 + erf \Big( {{x-\mu}\over{\sigma^2\sqrt2}} \Big)}\bigg\rbrack$$
We can [approximate the `erf` function with](https://math.stackexchange.com/questions/321569/approximating-the-error-function-erf-by-analytical-functions):
$$erf(x) \approx \tanh \big( {\sqrt{\pi}\log(2)x} \big)$$
So:
```math
P = {1\over2} \bigg\lbrack
{1 + \tanh \Big(
{{(x-\mu)\sqrt\pi\log(2)}\over{\sigma^2\sqrt2}}
\Big)}
\bigg\rbrack
```This transforms into the expression:
```plaintext
0.5 * (1 + tanh(((x - Mean) * sqrt(pi) * log(2)) / (Stdev^2 * sqrt(2))))Parameters: Mean, Stdev
Variables: x
```And to fit:
```bash
> fitme P "0.5 * (1 + tanh(((x - Mean) * sqrt(pi) * log(2)) / (Stdev^2 * sqrt(2))))"
```