https://github.com/ebadshabbir/company_profit-onehotencoding-

This project uses multiple linear regression to predict startup profits based on spending and location data from the **50 Startups** dataset. It includes data preprocessing, model training, and performance evaluation using Scikit-Learn.
jupyter-notebook machine-learning matplotlib multiple-linear-regression onehot-encoding pandas pyhton regression sklearn

# Multiple Linear Regression on 50 Startups Dataset

This project demonstrates how to perform multiple linear regression using Python and Scikit-Learn on the **50 Startups** dataset. The dataset covers 50 startups and records each one's R&D Spend, Administration, Marketing Spend, and State, along with its profit.
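
To make the dataset layout concrete, here is a minimal loading sketch. The column headers are an assumption taken from the field names listed above (with `Profit` assumed as the target header), `load_startups` is a hypothetical helper, and the two rows are made-up placeholders rather than values from the real file.

```python
import io

import pandas as pd

# Assumed headers, based on the fields named in this README.
EXPECTED_COLUMNS = ["R&D Spend", "Administration", "Marketing Spend", "State", "Profit"]


def load_startups(source):
    """Read the CSV and check that the assumed columns are present."""
    df = pd.read_csv(source)
    missing = [col for col in EXPECTED_COLUMNS if col not in df.columns]
    if missing:
        raise ValueError(f"Missing expected columns: {missing}")
    return df[EXPECTED_COLUMNS]


# Placeholder rows for illustration only (NOT real dataset values).
placeholder_csv = io.StringIO(
    "R&D Spend,Administration,Marketing Spend,State,Profit\n"
    "1000.0,2000.0,3000.0,State A,4000.0\n"
    "1500.0,2500.0,3500.0,State B,4500.0\n"
)

startups = load_startups(placeholder_csv)
print(startups.dtypes)
```

With the real file, the same helper could be called as `load_startups("50_Startups.csv")`, assuming the headers match.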

## Project Overview

In this project, we:

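1. **Preprocess the data**: Use one-hot encoding to convert the categorical `State` column into numeric indicator columns.
2. **Avoid the dummy variable trap**: Remove one of the one-hot encoded columns, since the full set of indicator columns always sums to 1 and is therefore perfectly collinear with the model's intercept.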
3. **Split the data**: Divide the dataset into training and test sets.
4. **Fit the multiple linear regression model**: Train the model on the training set.
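5. **Evaluate the model**: Measure how well the model fits using its scores on the training and test sets (for Scikit-Learn's `LinearRegression`, `score` returns the coefficient of determination, R²).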
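
The five steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the repository's actual script: it runs on synthetic placeholder data with the same column layout, the helper names are hypothetical, and it drops the first indicator column via `OneHotEncoder(drop="first")`, which is one possible way to avoid the dummy variable trap.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder


def make_placeholder_data(n_rows=50, seed=0):
    """Synthetic stand-in with the assumed column layout (NOT the real data)."""
    rng = np.random.default_rng(seed)
    rd = rng.uniform(0, 160_000, n_rows)
    admin = rng.uniform(50_000, 180_000, n_rows)
    marketing = rng.uniform(0, 450_000, n_rows)
    # Cycle through three placeholder state labels so each one appears.
    state = np.array(["State A", "State B", "State C"])[np.arange(n_rows) % 3]
    profit = 50_000 + 0.8 * rd + 0.05 * marketing + rng.normal(0, 5_000, n_rows)
    return pd.DataFrame({
        "R&D Spend": rd,
        "Administration": admin,
        "Marketing Spend": marketing,
        "State": state,
        "Profit": profit,
    })


def encode_features(df, target="Profit"):
    """Steps 1-2: one-hot encode State and drop one indicator column."""
    X = df.drop(columns=[target])
    y = df[target].to_numpy()
    encoder = ColumnTransformer(
        transformers=[("state", OneHotEncoder(drop="first"), ["State"])],
        remainder="passthrough",  # keep the numeric columns unchanged
        sparse_threshold=0,       # always return a dense array
    )
    return encoder.fit_transform(X), y


df = make_placeholder_data()
X_enc, y = encode_features(df)

# Step 3: hold out 20% of the rows as a test set (split size is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X_enc, y, test_size=0.2, random_state=0
)

# Step 4: fit ordinary least squares on the training set.
model = LinearRegression().fit(X_train, y_train)

# Step 5: score() returns R^2 for regressors.
train_r2 = model.score(X_train, y_train)
test_r2 = model.score(X_test, y_test)
print(f"Train R^2: {train_r2:.3f}, Test R^2: {test_r2:.3f}")
```

With three states, the encoded matrix has two indicator columns plus the three numeric columns. The printed scores here describe only the synthetic data and say nothing about results on the real dataset.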

## Libraries and Dependencies

- `numpy`: For array operations
- `pandas`: For data handling
- `matplotlib`: For plotting (not used in this case, but imported)
- `scikit-learn`: For machine learning tasks such as encoding, splitting, and regression

```bash
pip install numpy pandas matplotlib scikit-learn
```

## Cloning and Running the Project

Clone this repository to your local machine:

```bash
git clone https://github.com/EbadShabbir/50-startups-regression.git
```

Navigate to the project directory:

```bash
cd 50-startups-regression
```

Ensure that you have the `50_Startups.csv` dataset in the same directory as the script, or adjust the dataset path accordingly in the code.

Run the Python script:

```bash
python regression_model.py
```