https://github.com/khaledashrafh/linear-regression
This program implements linear regression from scratch using the gradient descent algorithm in Python. It predicts car prices based on selected features and uses a dataset of cars with their respective prices.
data-preprocessing dataset feature-selection gredient-decent learning-rate linear-regression pyhton3 regression regression-models
- Host: GitHub
- URL: https://github.com/khaledashrafh/linear-regression
- Owner: KhaledAshrafH
- License: mit
- Created: 2022-12-10T21:25:37.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2023-08-24T00:01:04.000Z (over 1 year ago)
- Last Synced: 2023-08-24T00:31:12.683Z (over 1 year ago)
- Topics: data-preprocessing, dataset, feature-selection, gredient-decent, learning-rate, linear-regression, pyhton3, regression, regression-models
- Language: Python
- Homepage:
- Size: 17.6 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE.md
# Linear Regression from Scratch in Python
This program implements linear regression from scratch using the gradient descent algorithm in Python. It predicts car prices based on selected features and uses a dataset of cars with their respective prices.
## Dataset
The program uses the "car_data.csv" dataset, which contains 205 car records, each with 25 features plus 1 target column. The features include the car's size and dimensions, its fuel system and fuel type, the engine size and type, the horsepower, and so on. The final column (the target) is the car price.
## Feature Selection
The program selects four numerical features that are correlated with the car price, two positively and two negatively. These features are:
- Curb weight
- Engine size
- City MPG
- Highway MPG

The price column is the target that these features are used to predict.
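One way to check which numerical columns are correlated with the price is to rank them by Pearson correlation. The sketch below is illustrative only: the column names (`curbweight`, `enginesize`, `citympg`, `price`) and the tiny demo frame are assumptions, not values taken from "car_data.csv", and the function name is hypothetical.

```python
import pandas as pd


def rank_by_price_correlation(df: pd.DataFrame, target: str = "price") -> pd.Series:
    """Return each numeric column's Pearson correlation with the target,
    ordered from strongest to weakest absolute correlation."""
    numeric = df.select_dtypes(include="number")
    corr = numeric.corr()[target].drop(target)
    order = corr.abs().sort_values(ascending=False).index
    return corr.reindex(order)


# Made-up rows for illustration; replace with pd.read_csv("car_data.csv")
# and the real column names when running against the actual dataset.
demo = pd.DataFrame({
    "curbweight": [2000, 2500, 3000, 3500],
    "enginesize": [90, 120, 150, 200],
    "citympg": [38, 30, 24, 18],
    "price": [7000, 11000, 16000, 30000],
})

ranking = rank_by_price_correlation(demo)
print(ranking)
```

Columns with a positive value move with the price, and columns with a negative value move against it.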
## Data Preprocessing
Before applying linear regression, the program performs some data preprocessing steps, including:
- Normalizing the feature data using min-max scaling
- Shuffling the dataset's rows
- Splitting the dataset into training and testing sets
## Linear Regression
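Before any fitting takes place, the preprocessing steps listed above (min-max scaling, shuffling, and splitting) could be sketched as follows. The function names, the 80/20 ratio, and the fixed seed are assumptions made for this example, not values read from the repository.

```python
import numpy as np


def min_max_scale(X):
    """Scale every column to the range [0, 1]. Constant columns map to 0."""
    X = np.asarray(X, dtype=float)
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid division by zero
    return (X - lo) / span


def shuffle_and_split(X, y, train_ratio=0.8, seed=0):
    """Shuffle rows with a seeded generator, then split into train and test."""
    X, y = np.asarray(X), np.asarray(y)
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X))
    cut = int(train_ratio * len(X))
    train, test = order[:cut], order[cut:]
    return X[train], y[train], X[test], y[test]


# Small synthetic input, used only to show the shapes involved.
X_raw = np.arange(20, dtype=float).reshape(10, 2)
y_raw = np.arange(10, dtype=float)

X_scaled = min_max_scale(X_raw)
X_train, y_train, X_test, y_test = shuffle_and_split(X_scaled, y_raw)
print(X_train.shape, X_test.shape)
```

This follows the order given in the list (scale, shuffle, split). Computing the minimum and maximum from the training rows only is a common alternative when the test set must stay completely unseen.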
The program implements linear regression from scratch using gradient descent to optimize the parameters of the hypothesis function. The hypothesis function is:
h(x) = X * w + p
where X is the feature matrix, w is the weight vector, and p is the bias term.
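As a small illustration of the hypothesis function, the NumPy sketch below evaluates h(x) for three rows at once. The numbers and the function name `predict` are made up for the example.

```python
import numpy as np


def predict(X, w, p):
    """Evaluate h(x) = X * w + p for every row of X."""
    return X @ w + p


X_demo = np.array([[0.0, 1.0],
                   [0.5, 0.5],
                   [1.0, 0.0]])   # 3 samples, 2 features
w_demo = np.array([2.0, -1.0])    # one weight per feature
p_demo = 3.0                      # bias term

h = predict(X_demo, w_demo, p_demo)
print(h)
```

Each entry of `h` is the dot product of one row of `X_demo` with `w_demo`, plus the bias.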
The program initializes the weight vector w and the bias term p randomly and updates them iteratively using gradient descent until convergence or a maximum number of iterations is reached.
The program also calculates the cost (mean squared error) in every iteration to visualize the learning curve and check the convergence of the gradient descent algorithm.
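The training loop described above can be sketched roughly as follows. This is a minimal version under stated assumptions, not the repository's actual code: the function name, the default learning rate, the tolerance-based convergence test, and the synthetic data are all chosen for illustration.

```python
import numpy as np


def gradient_descent(X, y, learning_rate=0.1, n_iterations=1000, tol=1e-12, seed=0):
    """Fit w and p by batch gradient descent on the mean squared error.

    Returns the fitted weights, the bias, and the cost recorded at each iteration.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    w = rng.normal(size=n_features)  # random initial weights
    p = rng.normal()                 # random initial bias
    costs = []
    for _ in range(n_iterations):
        error = X @ w + p - y
        costs.append(float(np.mean(error ** 2)))
        # Gradients of the mean squared error with respect to w and p.
        w -= learning_rate * (2.0 / n_samples) * (X.T @ error)
        p -= learning_rate * 2.0 * error.mean()
        # Assumed convergence test: stop once the cost barely changes.
        if len(costs) > 1 and abs(costs[-2] - costs[-1]) < tol:
            break
    return w, p, costs


# Synthetic, noise-free data with known parameters (w = [3, -2], p = 1).
rng = np.random.default_rng(42)
X_demo = rng.uniform(0.0, 1.0, size=(100, 2))
y_demo = X_demo @ np.array([3.0, -2.0]) + 1.0

w_fit, p_fit, costs = gradient_descent(X_demo, y_demo, learning_rate=0.3, n_iterations=5000)
print(w_fit, p_fit, len(costs))
```

Plotting `costs` against the iteration index (for example with `matplotlib.pyplot.plot(costs)`) gives the learning curve. A curve that flattens suggests convergence, while one that grows suggests the learning rate is too large.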
## How to Run
To run this program, follow these steps:
1. Install Python 3 on your system.
2. Install the required libraries: numpy, pandas, matplotlib, and scikit-learn (imported as `sklearn`).
3. Place the "car_data.csv" file in your working directory.
4. Open a terminal or command prompt.
5. Navigate to the directory containing the program files.
6. Execute the following command:
```
python main.py
```
7. The program displays various outputs, including scatter plots, cost function plots, final parameter values, predictions, and the R^2 score for the testing set.
## Customization
You can customize the program by modifying the following parameters:
- Number of iterations for gradient descent
- Learning rate values to try
- Features used for linear regression
- Ratio of training and testing sets

To make changes, edit the corresponding variables in the code.
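To give a feel for how these settings interact, the sketch below trains with several learning rates and reports the R^2 score on a held-out test set. The variable names (`LEARNING_RATES`, `N_ITERATIONS`, `TRAIN_RATIO`) are hypothetical and may not match the names in `main.py`, and the data is synthetic.

```python
import numpy as np

# Hypothetical settings; the variables in main.py may be named differently.
LEARNING_RATES = [0.001, 0.01, 0.1]
N_ITERATIONS = 2000
TRAIN_RATIO = 0.8


def fit(X, y, learning_rate, n_iterations, seed=0):
    """Batch gradient descent on the mean squared error."""
    rng = np.random.default_rng(seed)
    w = rng.normal(size=X.shape[1])
    p = rng.normal()
    for _ in range(n_iterations):
        error = X @ w + p - y
        w -= learning_rate * 2.0 * (X.T @ error) / len(y)
        p -= learning_rate * 2.0 * error.mean()
    return w, p


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


# Synthetic data standing in for the scaled car features.
rng = np.random.default_rng(1)
X_all = rng.uniform(0.0, 1.0, size=(200, 2))
y_all = X_all @ np.array([4.0, -3.0]) + 2.0 + rng.normal(scale=0.05, size=200)

cut = int(TRAIN_RATIO * len(X_all))
X_train, y_train = X_all[:cut], y_all[:cut]
X_test, y_test = X_all[cut:], y_all[cut:]

results = {}
for lr in LEARNING_RATES:
    w, p = fit(X_train, y_train, lr, N_ITERATIONS)
    results[lr] = r2_score(y_test, X_test @ w + p)
    print(f"learning_rate={lr}: test R^2 = {results[lr]:.4f}")
```

With a fixed iteration budget, a learning rate that is too small usually stops short of the minimum and scores lower, which is one reason to try several values.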
## Contributing
Contributions are welcome! If you find any issues or have suggestions for improvement, please open an issue or submit a pull request.
## Team
- [Khaled Ashraf Hanafy Mahmoud - 20190186](https://github.com/KhaledAshrafH)
- [Noura Ashraf Abdelnaby Mansour - 20190592](https://github.com/NouraAshraff)
## License
This program is licensed under the [MIT License](LICENSE.md).