{"id":51079065,"url":"https://github.com/maxgfr/regressio","last_synced_at":"2026-06-23T16:33:10.017Z","repository":{"id":349220161,"uuid":"1201515073","full_name":"maxgfr/regressio","owner":"maxgfr","description":"Zero-dependency TypeScript regression, classification \u0026 statistics library. OLS, Ridge, Lasso, Elastic Net, Logistic, KNN, Neural Network + diagnostics + preprocessing. Optional Rust/WASM engine.","archived":false,"fork":false,"pushed_at":"2026-04-04T19:43:32.000Z","size":117,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-04-04T22:02:46.401Z","etag":null,"topics":["bun","knn","lasso","linear-regression","logistic-regression","machine-learning","neural-network","ols","regression","ridge-regression","statistics","typescript","wasm","zero-dependencies"],"latest_commit_sha":null,"homepage":null,"language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/maxgfr.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-04-04T19:29:19.000Z","updated_at":"2026-04-04T19:43:08.000Z","dependencies_parsed_at":null,"dependency_job_id":"4abebdef-b181-4f45-a4bc-4483d3385b23","html_url":"https://github.com/maxgfr/regressio","commit_stats":null,"previous_names":["maxgfr/regressio"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/maxgfr/regressio","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/max
gfr%2Fregressio","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/maxgfr%2Fregressio/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/maxgfr%2Fregressio/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/maxgfr%2Fregressio/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/maxgfr","download_url":"https://codeload.github.com/maxgfr/regressio/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/maxgfr%2Fregressio/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34698696,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-23T02:00:07.161Z","response_time":65,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bun","knn","lasso","linear-regression","logistic-regression","machine-learning","neural-network","ols","regression","ridge-regression","statistics","typescript","wasm","zero-dependencies"],"created_at":"2026-06-23T16:33:09.907Z","updated_at":"2026-06-23T16:33:10.005Z","avatar_url":"https://github.com/maxgfr.png","language":"TypeScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# regressio\n\nZero-dependency TypeScript regression, classification \u0026 statistics library with full statistical outputs, diagnostics, and preprocessing. 
Ships with an optional Rust/WASM engine for accelerated linear algebra.\n\n## Install\n\n```bash\nbun add regressio\n# or\nnpm install regressio\n# or\npnpm add regressio\n```\n\n## Quick Start\n\n```typescript\nimport { LinearRegression } from 'regressio';\n\nconst model = new LinearRegression();\nmodel.fit([1, 2, 3, 4, 5], [2.1, 3.9, 6.2, 7.8, 10.1]);\n\nconsole.log(model.coefficients);  // approx. [1.99]\nconsole.log(model.intercept);     // approx. 0.05\nconsole.log(model.predict([6]));  // approx. [11.99]\nconsole.log(model.summary());     // R-style formatted summary table\n```\n\n## Models\n\n### Regression\n\n| Model | Class | What it does |\n|-------|-------|--------------|\n| **OLS** | `LinearRegression` | Fits a linear relationship between features and target using Ordinary Least Squares solved via QR decomposition. The foundational regression method. |\n| **Polynomial** | `PolynomialRegression` | Fits non-linear curves by expanding a single feature into polynomial terms (x, x², x³, ...) then applying OLS. |\n| **Ridge (L2)** | `RidgeRegression` | Adds an L2 penalty (sum of squared coefficients) to OLS to handle multicollinearity and prevent overfitting. Shrinks coefficients toward zero but never exactly to zero. |\n| **Lasso (L1)** | `LassoRegression` | Adds an L1 penalty (sum of absolute coefficients) via coordinate descent. Forces some coefficients to exactly zero, performing automatic feature selection. |\n| **Elastic Net** | `ElasticNet` | Combines L1 and L2 penalties. Balances Lasso's feature selection with Ridge's stability for correlated features. |\n| **WLS** | `WeightedRegression` | Weighted Least Squares. Assigns different importance to each observation. Useful when some data points are more reliable than others. |\n| **Robust** | `RobustRegression` | Resistant to outliers. Uses Iteratively Reweighted Least Squares (IRLS) with Huber or Tukey bisquare M-estimators to downweight extreme values. 
|\n\n### Classification\n\n| Model | Class | What it does |\n|-------|-------|--------------|\n| **Logistic** | `LogisticRegression` | Binary classification (0/1). Models the probability of class membership using a sigmoid function, fitted via Newton-Raphson/IRLS. |\n| **Multiclass Logistic** | `MulticlassLogisticRegression` | Extends logistic regression to K classes using softmax. Fitted via gradient descent on the cross-entropy loss. |\n| **K-Nearest Neighbors** | `KNearestNeighbors` | Non-parametric method. Predicts by majority vote (classification) or mean (regression) of the k closest training points. Supports Euclidean and Manhattan distances. |\n\n### Neural Network\n\n| Model | Class | What it does |\n|-------|-------|--------------|\n| **Feedforward NN** | `NeuralNetwork` | Multi-layer perceptron with backpropagation. Configurable hidden layers, activations (relu, sigmoid, tanh, softmax), and learning rate. Supports both regression and classification tasks. |\n\n### Usage\n\n```typescript\nimport {\n  LinearRegression,\n  PolynomialRegression,\n  RidgeRegression,\n  LassoRegression,\n  ElasticNet,\n  WeightedRegression,\n  RobustRegression,\n  LogisticRegression,\n  MulticlassLogisticRegression,\n  KNearestNeighbors,\n  NeuralNetwork,\n} from 'regressio';\n\n// --- Regression ---\n\n// OLS: multiple regression\nconst ols = new LinearRegression();\nols.fit([[1, 2], [3, 4], [5, 6]], [10, 22, 34]);\n\n// Polynomial: fit a cubic curve\nconst poly = new PolynomialRegression({ degree: 3 });\npoly.fit([1, 2, 3, 4, 5], [1, 8, 27, 64, 125]);\n\n// Ridge: regularized regression for correlated features\nconst ridge = new RidgeRegression({ alpha: 0.5 });\nridge.fit(X, y);\n\n// Lasso: automatic feature selection\nconst lasso = new LassoRegression({ alpha: 0.1 });\nlasso.fit(X, y);\n// Some coefficients will be exactly 0\n\n// Elastic Net: mix of L1 and L2\nconst enet = new ElasticNet({ alpha: 0.1, l1Ratio: 0.5 });\nenet.fit(X, y);\n\n// Weighted Least Squares: 
different reliability per observation\nconst wls = new WeightedRegression();\nwls.fit(X, y, weights);\n\n// Robust: resistant to outliers\nconst robust = new RobustRegression({ method: 'huber' });\nrobust.fit(X, y);\n\n// --- Classification ---\n\n// Binary logistic regression\nconst logit = new LogisticRegression();\nlogit.fit(X, y); // y must be 0/1\nlogit.predictProbability(Xnew); // [0.12, 0.87, ...]\n\n// Multiclass logistic regression (softmax)\nconst multi = new MulticlassLogisticRegression({ learningRate: 0.05 });\nmulti.fit(X, y); // y = 0, 1, 2, ...\nmulti.predictProbability(Xnew); // [[0.7, 0.2, 0.1], ...]\n\n// K-Nearest Neighbors (classification or regression)\nconst knn = new KNearestNeighbors({ k: 5, mode: 'classification' });\nknn.fit(X, y);\nknn.predict(Xnew);\n\n// --- Neural Network ---\n\n// Regression with a neural network\nconst nn = new NeuralNetwork({\n  layers: [\n    { units: 16, activation: 'relu' },\n    { units: 8, activation: 'relu' },\n  ],\n  learningRate: 0.01,\n  epochs: 200,\n  task: 'regression',\n});\nnn.fit(X, y);\nnn.predict(Xnew);\n\n// Classification with a neural network\nconst clf = new NeuralNetwork({\n  layers: [{ units: 10, activation: 'sigmoid' }],\n  learningRate: 0.1,\n  epochs: 100,\n  task: 'classification',\n});\nclf.fit(X, y); // y = 0, 1, 2, ...\nclf.predict(Xnew);\n```\n\n## Statistical Outputs\n\nEvery linear model (OLS, Ridge, Lasso, Elastic Net, WLS, Robust, Polynomial) provides `statistics()` and `summary()`:\n\n```typescript\nconst stats = model.statistics();\n// {\n//   rSquared,              -- proportion of variance explained (0 to 1)\n//   adjustedRSquared,      -- R² penalized for number of predictors\n//   standardErrors,        -- uncertainty of each coefficient estimate\n//   tStatistics,           -- coefficient / standard error for each predictor\n//   pValues,               -- probability of observing the t-stat under H0 (no effect)\n//   confidenceIntervals,   -- 95% confidence range for each 
coefficient\n//   fStatistic,            -- overall model significance test\n//   fPValue,               -- p-value for the F-test\n//   residualStandardError, -- estimated standard deviation of residuals\n//   aic,                   -- Akaike Information Criterion (lower = better fit/complexity trade-off)\n//   bic,                   -- Bayesian Information Criterion (stronger complexity penalty than AIC)\n//   degreesOfFreedom,      -- n - k (observations minus parameters)\n//   nObservations,         -- number of data points\n// }\n\nconsole.log(model.summary());\n// Coefficients:\n//                 Estimate    Std. Error  t value   Pr(\u003e|t|)\n// (Intercept)     0.0600      0.1200      0.50      0.6300\n// x1              2.0200      0.0400      50.20     0.0000 ***\n// ---\n// Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1\n```\n\nBinary logistic regression provides classification metrics:\n\n```typescript\nconst stats = logit.statistics();\n// { accuracy, precision, recall, f1Score, confusionMatrix,\n//   pseudoRSquared, logLikelihood, aic, bic }\n```\n\nMulticlass logistic regression provides per-class metrics:\n\n```typescript\nconst stats = multi.statistics();\n// { accuracy, precision (per class), recall (per class),\n//   nClasses, logLikelihood }\n```\n\n## Diagnostics\n\nFunctions to validate model assumptions and detect problems.\n\n| Function | What it does |\n|----------|--------------|\n| `residualDiagnostics(X, y, yHat)` | Returns raw residuals, studentized residuals, Cook's distance, and leverage for each observation. |\n| `studentizedResiduals(X, y, yHat)` | Residuals scaled by their estimated standard deviation. Values \u003e 2-3 suggest outliers. |\n| `cooksDistance(X, y, yHat)` | Measures how much each observation influences the fitted model. Values \u003e 4/n flag influential points. |\n| `leverage(X)` | Hat matrix diagonal. Measures how far each observation's features are from the center. 
High leverage = unusual feature values. |\n| `durbinWatson(residuals)` | Tests for autocorrelation in residuals. Returns statistic in [0,4]: ~2 = no autocorrelation, \u003c2 = positive, \u003e2 = negative. Critical for time series. |\n| `breuschPagan(X, residuals)` | Tests for heteroscedasticity (non-constant variance). Low p-value = variance depends on X, meaning standard errors are unreliable. |\n| `shapiroWilk(data)` | Tests whether data follows a normal distribution. Low p-value = non-normal. Important because p-values and CIs assume normal residuals. |\n| `vif(X)` | Variance Inflation Factor for each feature. VIF \u003e 10 signals multicollinearity (features are too correlated). |\n| `correlationMatrix(X)` | Pairwise Pearson correlation matrix. Pairs with |r| \u003e 0.9 suggest redundant features. |\n| `conditionNumber(X)` | Ratio of largest to smallest singular value of X. Values \u003e 30 signal numerical instability from multicollinearity. |\n\n```typescript\nimport {\n  residualDiagnostics, leverage, cooksDistance, studentizedResiduals,\n  durbinWatson, breuschPagan, shapiroWilk,\n  vif, correlationMatrix, conditionNumber,\n} from 'regressio';\n\nconst diag = residualDiagnostics(X, y, yHat);\nconst dw = durbinWatson(model.residuals());\nconst bp = breuschPagan(X, model.residuals());\nconst sw = shapiroWilk(model.residuals());\nconst vifs = vif(X);\nconst corr = correlationMatrix(X);\nconst kappa = conditionNumber(X);\n```\n\n## Preprocessing\n\nFunctions to prepare data before fitting models.\n\n| Function | What it does |\n|----------|--------------|\n| `standardize(X)` | Z-score normalization: transforms each feature to mean=0, std=1. Essential for Lasso/Ridge/Elastic Net and neural networks. |\n| `unstandardize(X, params)` | Reverses standardization back to the original scale. |\n| `normalize(X)` | Min-max scaling: transforms each feature to [0, 1] range. |\n| `unnormalize(X, params)` | Reverses normalization back to the original scale. 
|\n| `oneHotEncode(column, categories?, dropFirst?)` | Converts categorical values to binary columns. Use `dropFirst=true` to avoid the multicollinearity trap. |\n| `polynomialFeatures(X, degree)` | Generates polynomial terms (x, x², x³, ...) for each feature. Use with `LinearRegression` for polynomial fitting with multiple features. |\n| `interactionFeatures(X, pairs?)` | Generates interaction terms (xi * xj) for all or specified feature pairs. |\n| `dropMissing(X, y?)` | Removes rows containing NaN or null values. |\n| `imputeMean(X)` | Replaces NaN values with the column mean. |\n| `imputeMedian(X)` | Replaces NaN values with the column median. More robust to outliers than mean imputation. |\n\n```typescript\nimport {\n  standardize, unstandardize, normalize, unnormalize,\n  oneHotEncode, polynomialFeatures, interactionFeatures,\n  dropMissing, imputeMean, imputeMedian,\n} from 'regressio';\n\nconst { transformed, means, stds } = standardize(X);\nconst original = unstandardize(transformed, { means, stds });\nconst { transformed: normed, mins, maxs } = normalize(X);\nconst dummies = oneHotEncode(['cat', 'dog', 'cat'], undefined, true);\nconst polyX = polynomialFeatures(X, 3);\nconst interX = interactionFeatures(X);\nconst clean = dropMissing(X, y);\nconst imputed = imputeMean(X);\n```\n\n## Prediction Intervals\n\nFunctions to quantify prediction uncertainty.\n\n| Function | What it does |\n|----------|--------------|\n| `confidenceInterval(X, y, yHat, newX, newYHat)` | Confidence interval on the **mean** prediction. Answers: \"where is the true regression line?\" Narrower near the center of the training data. |\n| `predictionInterval(X, y, yHat, newX, newYHat)` | Prediction interval for a **new individual** observation. Always wider than the confidence interval because it includes observation noise. 
|\n| `bootstrapCoefficients(X, y, nBootstrap?)` | Non-parametric bootstrap: resamples data with replacement, refits the model many times, and returns empirical confidence intervals on coefficients. No distributional assumptions. |\n\n```typescript\nimport { confidenceInterval, predictionInterval, bootstrapCoefficients } from 'regressio';\n\nconst ci = confidenceInterval(X, y, yHat, newX, newYHat);\n// [{ predicted, lower, upper }, ...]\n\nconst pi = predictionInterval(X, y, yHat, newX, newYHat);\n// Always wider than ci\n\nconst boot = bootstrapCoefficients(X, y, 1000);\n// { coefficients, confidenceIntervals, standardErrors }\n```\n\n## Advanced: Matrix Class\n\nLow-level matrix operations for advanced users. Backed by `Float64Array` in row-major order.\n\n```typescript\nimport { Matrix } from 'regressio';\n\nconst A = Matrix.fromArray([[1, 2], [3, 4]]);\nconst B = Matrix.identity(2);\nconst C = A.multiply(B);\nconsole.log(C.determinant());  // -2\nconsole.log(C.trace());        // 5\nconsole.log(C.transpose().toArray());\n```\n\n## WASM Acceleration\n\nregressio ships with a pre-compiled Rust/WASM engine that activates automatically — no configuration needed. When the WASM binary is available, heavy computations are dispatched to compiled Rust code for significantly faster execution.\n\n**Accelerated operations:**\n- Matrix: multiply, transpose, add, subtract, scale, dot product, norm, determinant\n- Decompositions: QR, Cholesky, SVD, eigenvalues (tridiagonal QL)\n- Solvers: forward/back substitution\n- Models: Lasso/Elastic Net coordinate descent, logistic regression IRLS, softmax, KNN distance matrices\n- Diagnostics: correlation matrix, VIF (via correlation matrix inverse)\n- Predictions: bootstrap OLS (1000+ resamples in a single WASM call)\n\nIf WASM is unavailable (e.g. 
unsupported runtime), all operations fall back silently to pure TypeScript.\n\n```typescript\nimport { isWasmActive, LinearRegression } from 'regressio';\n\nconsole.log(isWasmActive()); // true if WASM loaded\n\n// Everything just works — WASM is used transparently\nconst model = new LinearRegression();\nmodel.fit(X, y); // QR decomposition runs in Rust\n```\n\n### Rebuilding WASM\n\nThe pre-built WASM binary is included in the package. To rebuild from Rust source (requires [Rust](https://rustup.rs/) with the `wasm32-unknown-unknown` target):\n\n```bash\nbun run build:wasm\n```\n\n## License\n\nMIT\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmaxgfr%2Fregressio","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmaxgfr%2Fregressio","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmaxgfr%2Fregressio/lists"}