{"id":18794310,"url":"https://github.com/abhipatel35/ml-regression-lifecycle","last_synced_at":"2026-04-17T04:33:05.235Z","repository":{"id":217765553,"uuid":"744758176","full_name":"abhipatel35/ML-Regression-Lifecycle","owner":"abhipatel35","description":"Explore the complete lifecycle of a machine learning project focused on regression. This repository covers data acquisition, preprocessing, and training with Linear Regression, Decision Tree Regression, and Random Forest Regression models. Evaluate and compare models using R2 score. Ideal for learning and implementing regression use cases.","archived":false,"fork":false,"pushed_at":"2024-01-18T00:26:50.000Z","size":28,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-10-25T18:46:28.803Z","etag":null,"topics":["decision-tree-regression","linear-regression","machine-learning","machine-learning-lifecycle","ml","ml-life","pandas","r2-score","random-forest-regression","regression","sklearn-ensemble","sklearn-library","sklearn-models","sklearn-tree"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/abhipatel35.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2024-01-18T00:13:59.000Z","updated_at":"2024-12-25T05:09:28.000Z","dependencies_parsed_at":"2024-01-18T06:23:59.351Z","dependency_job_id":"1b6bfa4a-d57c-47fc-952d-eb643a9bc370","html_url":"https://github.com/abhipatel35/ML-Regression-Lifecycle","commit_stats":null,"previous_names":["abhipatel35/ml-regression-lifecycle"],"tags_count":0,"template":fal
se,"template_full_name":null,"purl":"pkg:github/abhipatel35/ML-Regression-Lifecycle","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/abhipatel35%2FML-Regression-Lifecycle","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/abhipatel35%2FML-Regression-Lifecycle/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/abhipatel35%2FML-Regression-Lifecycle/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/abhipatel35%2FML-Regression-Lifecycle/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/abhipatel35","download_url":"https://codeload.github.com/abhipatel35/ML-Regression-Lifecycle/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/abhipatel35%2FML-Regression-Lifecycle/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":281003915,"owners_count":26428018,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-25T02:00:06.499Z","response_time":81,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["decision-tree-regression","linear-regression","machine-learning","machine-learning-lifecycle","ml","ml-life","pandas","r2-score","random-forest-regression","regression","sklearn-ensemble","sklearn-library","sklearn-models","sklearn-tree"],"created_at":"2024-11-07T21:28:57.859Z","upda
ted_at":"2025-10-25T18:46:30.099Z","avatar_url":"https://github.com/abhipatel35.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Machine Learning Project: Complete Lifecycle for Regression Use Case\n\n## Overview\n\nThis repository presents a comprehensive guide to the end-to-end lifecycle of a machine learning project, focusing on solving a regression problem. The project encompasses key stages such as data acquisition, preprocessing, model training, testing, and evaluation.\n\n### Key Features\n\n- **Data Preparation:**\n  - Load the dataset and perform exploratory data analysis.\n  - Encode categorical features for modeling.\n\n- **Model Training and Evaluation:**\n  - Utilize three regression models: Linear Regression, Decision Tree Regression, and Random Forest Regression.\n  - Evaluate model performance using the R2 score as the evaluation metric.\n\n- **Model Comparison:**\n  - Compare the performance of the three models to identify the most suitable for the regression use case.\n\n### Usage\n\n1. **Clone the Repository:**\n   ```bash\n   git clone https://github.com/abhipatel35/ML-Regression-Lifecycle.git\n   ```\n\n2. **Navigate to the Project Directory:**\n\n3. **Install Dependencies:**\n\n4. 
**Run the Jupyter Notebook or Python Script:**\n   - Open the project in Jupyter Notebook or PyCharm and run the Python script `main.py` to explore the complete project.\n\n### Project Structure\n\n- `main.py`: Python script containing the complete project code with explanations; it can be run from Jupyter Notebook or PyCharm.\n- `insurance.csv`: Sample dataset for the regression use case.\n\n\n## Data Preparation\n\n### Loading the Dataset\n```python\nimport pandas as pd\n\n# Load the dataset into a DataFrame\ndf = pd.read_csv('insurance.csv')\n```\n\n### Exploratory Data Analysis\n```python\n# Display the first few rows of the dataset\nprint(df.head())\n\n# Display the number of rows and columns\nprint(df.shape)\n\n# Display data types of each column (info() prints its own report and returns None)\ndf.info()\n\n# Statistical summary of numerical features\nprint(df.describe())\n\n# Check for null values\nprint(df.isnull().sum())\n```\n\n### Data Encoding for Categorical Features\n```python\n# Encode categorical features as integers\ndf.replace({'sex': {'male': 0, 'female': 1}}, inplace=True)\ndf.replace({'smoker': {'yes': 0, 'no': 1}}, inplace=True)\ndf.replace({'region': {'southwest': 0, 'southeast': 1, 'northwest': 2, 'northeast': 3}}, inplace=True)\n```\n\n### Separating Dependent and Independent Variables\n```python\n# Separate dependent/target variable (y) and independent features (x)\nx = df.drop(columns=['charges'])\ny = df['charges']\n```\n\n### Train-Test Split\n```python\n# Split the data into training (80%) and testing (20%) sets\nfrom sklearn.model_selection import train_test_split\n\nx_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2)\nprint(x_train.shape)\nprint(x_test.shape)\n```\n\n## Model Training and Evaluation\n\n### Linear Regression\n```python\nfrom sklearn.linear_model import LinearRegression\nfrom sklearn.metrics import r2_score  # used to evaluate all three models\n\n# Create and train the Linear Regression model\nlr = LinearRegression()\nlr.fit(x_train, y_train)\n\n# Make predictions and evaluate\nlr_pred = 
lr.predict(x_test)\nprint(\"Linear Regression -\u003e\", r2_score(y_test, lr_pred))\n```\n\n### Decision Tree Regression\n```python\nfrom sklearn.tree import DecisionTreeRegressor\n\n# Create and train the Decision Tree Regression model\ndtr = DecisionTreeRegressor()\ndtr.fit(x_train, y_train)\n\n# Make predictions and evaluate\ndtr_pred = dtr.predict(x_test)\nprint(\"Decision Tree Regression -\u003e\", r2_score(y_test, dtr_pred))\n```\n\n### Random Forest Regression\n```python\nfrom sklearn.ensemble import RandomForestRegressor\n\n# Create and train the Random Forest Regression model\nrfr = RandomForestRegressor()\nrfr.fit(x_train, y_train)\n\n# Make predictions and evaluate\nrfr_pred = rfr.predict(x_test)\nprint(\"Random Forest Regression -\u003e\", r2_score(y_test, rfr_pred))\n```\n\n## Model Comparison\nAfter training and evaluating the three models, compare their R2 scores on the test set; a score closer to 1 indicates a better fit. Choose the model that best suits your use case.\n\nOnce you are satisfied with a model's performance, you can deploy it for real-world applications.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fabhipatel35%2Fml-regression-lifecycle","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fabhipatel35%2Fml-regression-lifecycle","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fabhipatel35%2Fml-regression-lifecycle/lists"}