Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

https://github.com/3rd-son/car-mileage-model

EDA and Model that predicts the mpg of cars

eda jupyter-notebook linear-regression machine-learning matplotlib numpy pandas python seaborn sklearn

Host: GitHub
URL: https://github.com/3rd-son/car-mileage-model
Owner: 3rd-Son
License: mit
Created: 2023-06-14T21:55:15.000Z (over 1 year ago)
Default Branch: main
Last Pushed: 2023-06-15T14:41:23.000Z (over 1 year ago)
Last Synced: 2023-08-17T16:57:56.420Z (over 1 year ago)
Topics: eda, jupyter-notebook, linear-regression, machine-learning, matplotlib, numpy, pandas, python, seaborn, sklearn
Language: Jupyter Notebook
Homepage:
Size: 736 KB
Stars: 2
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

README

Auto-Mpg Linear Regression Model

Introduction

This repository contains the code and data for a linear regression model that predicts the miles per gallon (mpg) of a car based on various features.

Data

The dataset used for training and evaluating the model is the Auto-Mpg Dataset from the UCI Machine Learning Repository. It consists of 398 instances with 9 attributes, including the target attribute (mpg). The remaining attributes are a mix of continuous and discrete features (cylinders, displacement, horsepower, weight, acceleration, model year, and origin) plus the car name. You can find the data at MPG-Dataset.
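As a rough illustration of the data layout, the sketch below parses rows written in the whitespace-separated format of the raw UCI file, where the car name is quoted and a missing horsepower value appears as "?". The column names, the `parse_row` helper, and the two sample rows are assumptions made for this example; the repository may load the data differently (for instance from a CSV with pandas).

```python
import shlex

# Column names chosen for this sketch; the raw file itself has no header row.
COLUMNS = ["mpg", "cylinders", "displacement", "horsepower", "weight",
           "acceleration", "model_year", "origin", "car_name"]

# Illustrative rows in the raw file's layout; "?" marks a missing horsepower.
SAMPLE = '''\
18.0   8   307.0      130.0      3504.      12.0   70  1\t"chevrolet chevelle malibu"
25.0   4   98.00      ?          2046.      19.0   71  1\t"ford pinto"
'''


def parse_row(line):
    """Split one raw line into a dict, mapping "?" to None."""
    fields = shlex.split(line)  # keeps the quoted car name as a single field
    row = dict(zip(COLUMNS, fields))
    for key in COLUMNS[:-1]:  # every column except car_name is numeric
        row[key] = None if row[key] == "?" else float(row[key])
    return row


rows = [parse_row(line) for line in SAMPLE.splitlines() if line.strip()]
print(rows[1]["car_name"], rows[1]["horsepower"])
```

Keeping missing values as `None` at this stage leaves the choice of how to handle them (drop or impute) to the preprocessing step.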

Exploratory Data Analysis (EDA)

Prior to building the model, an Exploratory Data Analysis (EDA) was performed on the dataset to gain insights and understand the relationships between the features and the target variable. Various visualizations and statistical analyses were conducted to identify patterns, correlations, and potential outliers in the data. See EDA for the Exploratory Data Analysis.
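One common EDA step of this kind is ranking features by their correlation with the target. The sketch below does that with pandas on a small hand-made table; the values are illustrative only and are not taken from the dataset or from the EDA notebook.

```python
import pandas as pd

# Illustrative values only, not rows from the actual dataset.
df = pd.DataFrame({
    "mpg": [18.0, 15.0, 24.0, 31.0, 36.0],
    "weight": [3504, 3693, 2372, 1950, 1800],
    "horsepower": [130, 165, 95, 65, 58],
})

# Pearson correlation of every feature with the target, most negative first.
corr_with_mpg = df.corr()["mpg"].drop("mpg").sort_values()
print(corr_with_mpg)
```

A strongly negative value suggests that the feature moves opposite to mpg, which is the kind of relationship a linear model can pick up directly.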

Model Training

The linear regression model was built using the scikit-learn library in Python. The features were preprocessed, including handling missing values in the "horsepower" attribute. The dataset was split into training and testing sets to evaluate the model's performance. Feature scaling and any necessary feature transformations were applied. See Model.py for the training code.
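The steps described above can be sketched as a scikit-learn pipeline. This is a minimal example under stated assumptions, not the code in Model.py: the data is synthetic, only two features are used, and the median imputation, the 80/20 split, and the `random_state` values are choices made for illustration, since the README does not say how missing horsepower values were handled.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the real data (assumption): mpg falls with weight and horsepower.
rng = np.random.default_rng(0)
n = 200
weight = rng.uniform(1600, 5000, n)
horsepower = 40 + 0.04 * weight + rng.normal(0, 10, n)
mpg = 46 - 0.006 * weight - 0.03 * horsepower + rng.normal(0, 1.5, n)

# Blank out a few horsepower values to mimic the missing entries.
horsepower[rng.choice(n, 6, replace=False)] = np.nan

X = np.column_stack([weight, horsepower])
X_train, X_test, y_train, y_test = train_test_split(
    X, mpg, test_size=0.2, random_state=42
)

# Impute, scale, then fit; each step is fitted on the training split only.
model = make_pipeline(
    SimpleImputer(strategy="median"),
    StandardScaler(),
    LinearRegression(),
)
model.fit(X_train, y_train)

r2 = model.score(X_test, y_test)  # R-squared on the held-out split
print(f"test R^2: {r2:.3f}")
```

Wrapping the preprocessing in a pipeline means the imputer and scaler never see the test split during fitting, which keeps the evaluation honest.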

Model Evaluation

The trained linear regression model was evaluated using various performance metrics such as mean squared error (MSE), mean absolute error (MAE), and R-squared (coefficient of determination). The model's performance on the test set was assessed to measure its accuracy in predicting the miles per gallon of cars.
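For reference, the three metrics named above can be written out directly. The sketch below is a plain-Python version with made-up values; `regression_metrics` is a hypothetical helper, and in practice `sklearn.metrics` provides `mean_squared_error`, `mean_absolute_error`, and `r2_score` for the same quantities.

```python
def regression_metrics(y_true, y_pred):
    """Return MSE, MAE, and R-squared for paired true/predicted values."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(r * r for r in residuals) / n
    mae = sum(abs(r) for r in residuals) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    ss_res = mse * n                                    # residual sum of squares
    r2 = 1 - ss_res / ss_tot
    return {"mse": mse, "mae": mae, "r2": r2}


# Made-up mpg values, for illustration only.
metrics = regression_metrics([20.0, 30.0, 25.0, 15.0], [22.0, 28.0, 25.0, 17.0])
print(metrics)
```

MSE penalizes large errors more heavily than MAE, while R-squared reports the share of variance in mpg that the predictions account for.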

Usage

To use this model, follow the steps below:

Clone the repository: git clone https://github.com/Vic3sax/Car-Mileage-Model.git

Navigate to the project directory: cd Car-Mileage-Model

Install the required dependencies: pip install -r requirements.txt

Run the prediction script: python predict_mpg.py

Contributing

Contributions to this project are welcome. If you find any issues or have suggestions for improvements, feel free to open an issue or submit a pull request.

License

This project is licensed under the MIT License. See the LICENSE file for more details.

Acknowledgements

The Auto-Mpg Dataset used in this project is available from the UCI Machine Learning Repository. Special thanks to the original contributors and maintainers of the dataset.

Project Organization
------------

├── LICENSE
├── Makefile           <- Makefile with commands like `make data` or `make train`
├── README.md          <- The top-level README for developers using this project.
├── data
│   ├── external       <- Data from third party sources.
│   ├── interim        <- Intermediate data that has been transformed.
│   ├── processed      <- The final, canonical data sets for modeling.
│   └── raw            <- The original, immutable data dump.
│
├── docs               <- A default Sphinx project; see sphinx-doc.org for details
│
├── models             <- Trained and serialized models, model predictions, or model summaries
│
├── notebooks          <- Jupyter notebooks. Naming convention is a number (for ordering),
│                         the creator's initials, and a short `-` delimited description, e.g.
│                         `1.0-jqp-initial-data-exploration`.
│
├── references         <- Data dictionaries, manuals, and all other explanatory materials.
│
├── reports            <- Generated analysis as HTML, PDF, LaTeX, etc.
│   └── figures        <- Generated graphics and figures to be used in reporting
│
├── requirements.txt   <- The requirements file for reproducing the analysis environment, e.g.
│                         generated with `pip freeze > requirements.txt`
│
├── setup.py           <- makes project pip installable (pip install -e .) so src can be imported
├── src                <- Source code for use in this project.
│   ├── __init__.py    <- Makes src a Python module
│   │
│   ├── data           <- Scripts to download or generate data
│   │   └── make_dataset.py
│   │
│   ├── features       <- Scripts to turn raw data into features for modeling
│   │   └── build_features.py
│   │
│   ├── models         <- Scripts to train models and then use trained models to make
│   │   │                 predictions
│   │   ├── predict_model.py
│   │   └── train_model.py
│   │
│   └── visualization  <- Scripts to create exploratory and results oriented visualizations
│       └── visualize.py
│
└── tox.ini            <- tox file with settings for running tox; see tox.readthedocs.io

--------

Project based on the cookiecutter data science project template. #cookiecutterdatascience