{"id":18482424,"url":"https://github.com/professorlearncode/linear-regression-model","last_synced_at":"2025-05-13T20:23:42.738Z","repository":{"id":253592560,"uuid":"843961166","full_name":"ProfessorlearnCode/Linear-Regression-Model","owner":"ProfessorlearnCode","description":"This code implements a linear regression model to predict diabetes progression based on the diabetes dataset. The model is trained and evaluated, with results visualized through scatter and residual plots.","archived":false,"fork":false,"pushed_at":"2024-08-18T00:18:58.000Z","size":5,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2024-12-25T13:40:40.272Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ProfessorlearnCode.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-08-18T00:16:58.000Z","updated_at":"2024-08-18T00:19:02.000Z","dependencies_parsed_at":"2024-08-18T01:26:20.646Z","dependency_job_id":"dec6a7c2-1ae5-423f-b4fe-af3ab30757a1","html_url":"https://github.com/ProfessorlearnCode/Linear-Regression-Model","commit_stats":null,"previous_names":["professorlearncode/linear-regression-model"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ProfessorlearnCode%2FLinear-Regression-Model","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ProfessorlearnCode%2FLinear-Regression-Model/tags","releases_ur
l":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ProfessorlearnCode%2FLinear-Regression-Model/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ProfessorlearnCode%2FLinear-Regression-Model/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ProfessorlearnCode","download_url":"https://codeload.github.com/ProfessorlearnCode/Linear-Regression-Model/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":239198640,"owners_count":19598633,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-06T12:28:26.622Z","updated_at":"2025-02-16T21:26:59.713Z","avatar_url":"https://github.com/ProfessorlearnCode.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"### Documentation for Linear Regression Model on Diabetes Dataset\n\n#### Overview\nThis repository contains a Python implementation of a linear regression model used to predict diabetes progression based on a set of medical features. The model is trained on the diabetes dataset from the `sklearn` library and evaluated using various metrics. 
Visualizations are included to help assess the model's performance.\n\n#### Prerequisites\nBefore running the code, ensure you have the following Python libraries installed:\n- `numpy`\n- `matplotlib`\n- `pandas`\n- `seaborn`\n- `scikit-learn`\n\nYou can install the necessary libraries using the following command:\n```bash\npip install numpy matplotlib pandas seaborn scikit-learn\n```\n\n#### Code Breakdown\n\n1. **Importing Libraries**\n   ```python\n   import numpy as np\n   import matplotlib.pyplot as plt\n   import pandas as pd\n   import seaborn as sns\n   from sklearn import datasets\n   from sklearn.linear_model import LinearRegression\n   from sklearn.model_selection import train_test_split\n   from sklearn.metrics import mean_squared_error, r2_score\n   ```\n   This section imports all the necessary libraries for data handling, visualization, and model implementation.\n\n2. **Loading the Dataset**\n   ```python\n   diabetes = datasets.load_diabetes()\n   ```\n   The diabetes dataset is loaded from the `sklearn` library. This dataset includes 10 baseline variables (age, sex, BMI, etc.) used to predict the progression of diabetes one year after baseline.\n\n3. **Splitting the Data**\n   ```python\n   X = diabetes.data\n   Y = diabetes.target\n   X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.2, random_state=42)\n   ```\n   The dataset is split into features (`X`) and target (`Y`). The data is further divided into training (80%) and testing (20%) sets using `train_test_split`.\n\n4. **Model Initialization and Training**\n   ```python\n   model = LinearRegression()\n   model.fit(X_train, Y_train)\n   ```\n   A linear regression model is initialized and trained on the training data.\n\n5. **Making Predictions**\n   ```python\n   Y_prediction = model.predict(X_test)\n   ```\n   The model makes predictions on the test data.\n\n6. 
**Model Evaluation**\n   ```python\n   mse = mean_squared_error(Y_test, Y_prediction)\n   r2 = r2_score(Y_test, Y_prediction)\n   print(\"Coefficients:\", model.coef_)\n   print(\"Intercept: \", model.intercept_)\n   print(\"Mean Squared Error: %.2f\" % mse)\n   print(\"R² Score: %.2f\" % r2)\n   ```\n   The model's performance is evaluated using Mean Squared Error (MSE) and the R² score. The model's coefficients and intercept are also printed.\n\n7. **Visualizing Results**\n   - **Actual vs Predicted Values**\n     ```python\n     sns.scatterplot(x=Y_test, y=Y_prediction, alpha=0.7)\n     plt.xlabel('Actual Values')\n     plt.ylabel('Predicted Values')\n     plt.title('Actual vs Predicted Values')\n     plt.show()\n     ```\n     A scatter plot is created to visualize the relationship between actual and predicted values.\n\n   - **Residual Plot**\n     ```python\n     residuals = Y_test - Y_prediction\n     sns.scatterplot(x=Y_prediction, y=residuals, alpha=0.7)\n     plt.xlabel('Predicted Values')\n     plt.ylabel('Residuals')\n     plt.title('Residual Plot')\n     plt.axhline(0, color='red', linestyle='--')\n     plt.show()\n     ```\n     A residual plot is used to check for patterns in the residuals; a visible trend or funnel shape suggests the linear model is missing structure in the data.\n\n#### Conclusion\nThis project demonstrates the implementation of a linear regression model to predict diabetes progression. 
The code includes steps for data preparation, model training, evaluation, and visualization, covering the full workflow needed to assess the model's performance.\n\n#### Future Improvements\n- **Feature Engineering**: Explore additional feature engineering techniques to improve model accuracy.\n- **Regularized Models**: Experiment with regularized linear models such as Ridge or Lasso regression for potentially better performance.\n\n#### License\nThis project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fprofessorlearncode%2Flinear-regression-model","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fprofessorlearncode%2Flinear-regression-model","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fprofessorlearncode%2Flinear-regression-model/lists"}