{"id":31070982,"url":"https://github.com/potakaaa/regression-from-scratch","last_synced_at":"2025-09-15T23:55:01.285Z","repository":{"id":314635413,"uuid":"1054908569","full_name":"potakaaa/regression-from-scratch","owner":"potakaaa","description":"Linear regression formulas built from scratch.","archived":false,"fork":false,"pushed_at":"2025-09-13T17:15:53.000Z","size":8,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"development","last_synced_at":"2025-09-13T19:19:56.157Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/potakaaa.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-09-11T13:51:06.000Z","updated_at":"2025-09-13T17:15:56.000Z","dependencies_parsed_at":"2025-09-13T19:33:05.057Z","dependency_job_id":null,"html_url":"https://github.com/potakaaa/regression-from-scratch","commit_stats":null,"previous_names":["potakaaa/regression-from-scratch"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/potakaaa/regression-from-scratch","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/potakaaa%2Fregression-from-scratch","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/potakaaa%2Fregression-from-scratch/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/potakaaa%2Fregression-from-scratch/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/potakaaa%2Fregression-from-scratch/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/potakaaa","download_url":"https://codeload.github.com/potakaaa/regression-from-scratch/tar.gz/refs/heads/development","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/potakaaa%2Fregression-from-scratch/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":275338801,"owners_count":25447036,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-09-15T02:00:09.272Z","response_time":75,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-09-15T23:55:00.062Z","updated_at":"2025-09-15T23:55:01.274Z","avatar_url":"https://github.com/potakaaa.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Linear Regression From Scratch\n\nA complete implementation of linear regression (single and multiple variables) built entirely from scratch without using any machine learning libraries like scikit-learn.\n\n## 📂 Project Structure\n\n```\n├── main.py              # Main entry point and pipeline orchestration\n├── data/\n│   ├── __init__.py\n│   └── loader.py        # Data loading functionality\n├── utils/\n│   ├── __init__.py\n│   ├── data_split.py    # Train/test splitting utilities\n│   └── preprocessing.py # Feature normalization/scaling\n├── model/\n│   ├── __init__.py\n│   ├── parameters.py    # Weight initialization\n│   ├── predict.py       # Hypothesis function\n│   ├── gradients.py     # Gradient computation\n│   ├── update.py        # Parameter updates\n│   └── train.py         # Main training loop\n├── metrics/\n│   ├── __init__.py\n│   ├── loss.py          # MSE, RMSE calculations\n│   └── evaluation.py    # R², NRMSE metrics\n└── visualization/\n    ├── __init__.py\n    └── plot.py          # Training curves and regression plots\n```\n\n## 🚀 Implementation Pipeline\n\n1. **Data Preparation**\n\n   - Load data from CSV/text files\n   - Split into training and testing sets\n   - Optional feature normalization\n\n2. **Model Setup**\n\n   - Initialize weights and bias parameters\n   - Define hypothesis function\n\n3. **Training**\n\n   - Implement gradient descent algorithm\n   - Compute cost function (MSE)\n   - Update parameters iteratively\n\n4. **Evaluation**\n\n   - Calculate performance metrics (R², RMSE)\n   - Validate on test set\n\n5. **Visualization** (Optional)\n   - Plot training loss curves\n   - Visualize regression line vs actual data\n\n## 📋 Function Signatures\n\n### Data Module\n\n- `load_data(filepath)` → Load and return X, y\n- `train_test_split(X, y, test_size=0.2, seed=42)` → Split data\n\n### Utils Module\n\n- `normalize(X)` → Scale features to [0,1] or standardize\n\n### Model Module\n\n- `initialize_weights(n_features)` → Return weights, bias\n- `predict(X, weights, bias)` → Return predictions\n- `compute_gradients(X, y, weights, bias)` → Return gradients\n- `update_weights(weights, bias, gradients, lr)` → Update parameters\n- `train(X, y, lr, epochs)` → Train model and return parameters\n\n### Metrics Module\n\n- `mse(y_true, y_pred)` → Mean Squared Error\n- `rmse(y_true, y_pred)` → Root Mean Squared Error\n- `r2_score(y_true, y_pred)` → R² coefficient\n- `nrmse(y_true, y_pred)` → Normalized RMSE\n\n### Visualization Module\n\n- `plot_loss(history)` → Plot training loss curve\n- `plot_regression_line(X, y, y_pred)` → Scatter plot with regression line\n\n## 🎯 Key Features\n\n- **No external ML libraries**: Pure Python/NumPy implementation\n- **Modular design**: Each component in separate files\n- **Educational focus**: Step-by-step implementation for learning\n- **Multiple metrics**: Comprehensive evaluation suite\n- **Visualization support**: Training progress and results plotting\n\n## 🏃 Getting Started\n\n1. Implement functions in each module (follow the comments in each file)\n2. Run `python main.py` to execute the complete pipeline\n3. Modify hyperparameters and observe results\n4. Use visualization functions to understand model behavior\n\n## 📊 Mathematical Foundation\n\nThe implementation follows the standard linear regression approach:\n\n**Hypothesis**: `h(x) = w₁x₁ + w₂x₂ + ... + wₙxₙ + b`\n\n**Cost Function**: `J(w,b) = (1/(2m)) Σ(h(xⁱ) - yⁱ)²`\n\n**Gradient Descent**:\n\n- `w := w - α * ∂J/∂w`\n- `b := b - α * ∂J/∂b`\n\nWhere:\n\n- `m` = number of training examples\n- `α` = learning rate\n- `w` = weight parameters\n- `b` = bias term\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpotakaaa%2Fregression-from-scratch","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpotakaaa%2Fregression-from-scratch","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpotakaaa%2Fregression-from-scratch/lists"}