https://github.com/245839/automobile-analysis
Analysis of data on cars imported into the USA, performed in Python with data-analysis libraries in the Jupyter environment.
data-analysis jupyter-notebook python
- Host: GitHub
- URL: https://github.com/245839/automobile-analysis
- Owner: 245839
- License: mit
- Created: 2025-03-15T16:59:40.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-03-15T17:14:12.000Z (over 1 year ago)
- Last Synced: 2025-03-15T18:19:16.721Z (over 1 year ago)
- Topics: data-analysis, jupyter-notebook, python
- Language: Jupyter Notebook
- Homepage:
- Size: 613 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Automobile-Analysis
This project involves **data analysis of automobiles**, focusing on vehicle specifications, insurance risk ratings, and normalized loss values. The goal is to explore trends, identify patterns, and extract meaningful insights from the dataset.
## Dataset Overview
The dataset consists of three main types of information:
- **Vehicle Specifications** – Various characteristics such as make, body style, engine type, fuel system, horsepower, and more.
- **Insurance Risk Rating** – A "symboling" score indicating the vehicle's risk level.
- **Normalized Losses** – The relative average insurance loss per vehicle per year, adjusted for different car categories (e.g., sports cars, station wagons).
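As a rough illustration of how these three kinds of information might sit together in one table, the sketch below loads a few invented rows with pandas. The column names, the values, and the use of `?` as a missing-value marker are all assumptions made for the example; they are not taken from the project's notebook.

```python
import io

import pandas as pd

# Invented toy rows for illustration only; not the project's real data.
# Assumption: missing entries are written as "?" in the raw file.
csv_text = """make,body_style,horsepower,symboling,normalized_losses
audi,sedan,102,2,164
bmw,sedan,?,1,192
mazda,hatchback,68,1,?
volvo,wagon,114,-1,95
"""

# na_values="?" turns the placeholder into NaN so numeric columns parse as numbers.
df = pd.read_csv(io.StringIO(csv_text), na_values="?")

# Split the columns into numerical and categorical groups by dtype.
numeric_cols = df.select_dtypes(include="number").columns.tolist()
categorical_cols = df.select_dtypes(exclude="number").columns.tolist()

# Fill numeric gaps with each column's median (one simple choice among many).
df[numeric_cols] = df[numeric_cols].fillna(df[numeric_cols].median())

print("numerical:", numeric_cols)
print("categorical:", categorical_cols)
```

Median imputation is used here only because it is short and robust to outliers; the notebook may handle missing values differently.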
## Project Objectives & Methodology
- **Data Cleaning & Preprocessing** – Handling missing values and standardizing formats.
- **Feature Engineering** – Splitting features into **numerical** and **categorical** variables.
- **Data Visualization** – Creating visual representations of key dataset insights.
- **Exploratory Data Analysis (EDA)** – Investigating distributions, correlations, and patterns.
- **Predictive Modeling** – Implementing two machine learning models:
  - Linear Regression
  - Random Forest
- **Hyperparameter Optimization** – Fine-tuning the Random Forest model to improve performance.
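
The modeling and tuning steps above could be sketched roughly as follows with scikit-learn. This is a minimal example under stated assumptions: the README does not say which target is predicted or how the Random Forest is tuned, so the data here is synthetic, the regression setup is assumed, and the parameter grid and `GridSearchCV` are illustrative choices rather than the notebook's actual configuration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in data: two numeric features and a noisy, partly non-linear target.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 2))
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] ** 2 + rng.normal(0, 1.0, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Baseline model: ordinary least-squares linear regression.
linear = LinearRegression().fit(X_train, y_train)

# Random Forest tuned with a small, hypothetical grid and 3-fold cross-validation.
param_grid = {"n_estimators": [50, 100], "max_depth": [None, 5]}
search = GridSearchCV(
    RandomForestRegressor(random_state=0), param_grid, cv=3, scoring="r2"
)
search.fit(X_train, y_train)

# Compare both models on the held-out test split.
scores = {
    "linear_regression": r2_score(y_test, linear.predict(X_test)),
    "random_forest": r2_score(y_test, search.best_estimator_.predict(X_test)),
}
print(search.best_params_)
print(scores)
```

Scoring both models on the same held-out split keeps the comparison fair, since the grid search only ever sees the training portion.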
## How to View & Run the Project
To run this project, you will need:
- **Python 3.x** with required libraries installed.
- **Jupyter Notebook** (Recommended) or another editor that supports `.ipynb` files.
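
The README does not list the required libraries, so the following is only a small helper sketch for checking an environment before opening the notebook. The package names in `ASSUMED_PACKAGES` are guesses based on a typical data-analysis stack, and `missing_packages` is a hypothetical helper, not part of the project.

```python
import importlib.util

# Assumption: a typical stack for this kind of notebook; adjust to the notebook's imports.
ASSUMED_PACKAGES = ["numpy", "pandas", "matplotlib", "sklearn"]


def missing_packages(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(ASSUMED_PACKAGES)
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All assumed packages are available.")
```

Note that `sklearn` is the import name of the scikit-learn distribution, so the name to install can differ from the name to import.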