https://github.com/v41bh4vr4jput/data-analysis-with-python

This repository is a comprehensive collection of data analysis projects and tutorials using Python's most powerful libraries: NumPy, Pandas, Seaborn, and Matplotlib. It is designed to help you explore, clean, visualize, and analyze data efficiently.

# Data Analysis with Python

## 📌 Overview
This repository contains a collection of **Jupyter Notebooks** covering various aspects of **data analysis using Python**, including **data cleaning, handling missing data, visualization, and reading different file formats (CSV, Excel, SQL, HTML, etc.)**. The main libraries used in this repository include **Pandas, NumPy, Matplotlib, and Seaborn**.

## 📂 Directory Structure & File Descriptions
```
└── v41bh4vr4jput-data-analysis-with-python/
    ├── README.md
    ├── Cleaning_not_null_values.ipynb
    ├── Handling_missing_data.ipynb
    ├── Pandas_Dataframe.ipynb
    ├── Pandas_series.ipynb
    ├── Matplotlib/
    │   └── Visualization.ipynb
    ├── Reading and Extracting data/
    │   └── data/
    │       ├── btc-market-price.csv
    │       ├── eth-price.csv
    │       ├── Reading_External_data_and_Plottng.ipynb
    │       └── .ipynb_checkpoints/
    │           └── btc-market-price-checkpoint.csv
    ├── Reading CSV and TXT files/
    │   ├── btc-market-price.csv
    │   ├── exam_review.csv
    │   ├── Main.ipynb
    │   └── out.csv
    ├── Reading Data from Relational databases/
    │   ├── chinook.db
    │   └── main.ipynb
    ├── Reading Excel Files/
    │   ├── main.ipynb
    │   ├── out.xlsx
    │   └── products.xlsx
    └── Reading HTML tables/
        └── Main.ipynb
```

### 📝 **Notebooks & Descriptions**

#### 1️⃣ **Data Cleaning & Handling Missing Data**
- **Cleaning_not_null_values.ipynb** → Techniques for cleaning data whose values are present (non-null) but still need correction.
- **Handling_missing_data.ipynb** → Methods for dealing with missing values in datasets using Pandas and NumPy.
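
As a rough illustration of the kind of missing-value handling described above, the sketch below counts, drops, and fills missing entries with Pandas and NumPy. The column names and values are made up for this example and are not taken from the notebooks.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; not from the repository's notebooks.
df = pd.DataFrame({
    "age": [25.0, np.nan, 31.0, np.nan],
    "city": ["Pune", "Delhi", None, "Mumbai"],
})

# Count missing values per column.
missing_counts = df.isna().sum()

# Option 1: drop every row that contains at least one missing value.
complete_rows = df.dropna()

# Option 2: fill gaps instead, using the column mean for "age"
# and a placeholder string for "city".
filled = df.fillna({"age": df["age"].mean(), "city": "unknown"})

print(missing_counts)
print(filled)
```

Whether dropping or filling is appropriate depends on the dataset; the mean fill here is only one possible choice.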

#### 2️⃣ **Pandas Basics: DataFrame & Series**
- **Pandas_Dataframe.ipynb** → Introduction to Pandas DataFrames, data manipulation, and transformations.
- **Pandas_series.ipynb** → Understanding Pandas Series, operations, and indexing.
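
To give a feel for the Series and DataFrame operations mentioned above, here is a small sketch of indexing, boolean selection, and a derived column. The figures are approximate and used for illustration only; they are not drawn from the notebooks.

```python
import pandas as pd

countries = ["Canada", "France", "Germany"]

# A Series is a one-dimensional array with a labeled index.
population = pd.Series(
    [35.5, 63.9, 80.9], index=countries, name="population_millions"
)

# Label-based and position-based indexing.
by_label = population.loc["France"]
by_position = population.iloc[0]

# Boolean selection: keep entries above a threshold.
large = population[population > 60]

# A DataFrame is a table of aligned Series sharing one index.
df = pd.DataFrame({
    "population_millions": population,
    "area_km2": pd.Series([9_984_670, 640_679, 357_114], index=countries),
})

# Derived column computed element-wise from existing columns.
df["density"] = df["population_millions"] * 1_000_000 / df["area_km2"]

print(df.sort_values("density", ascending=False))
```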

#### 3️⃣ **Data Visualization**
- **Matplotlib/Visualization.ipynb** → Creating various visualizations using **Matplotlib and Seaborn**, including bar charts, histograms, line plots, and scatter plots.
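
The four chart types listed above can be sketched with plain Matplotlib as follows. The data is synthetic, the output file name `example_plots.png` is a placeholder, and the `Agg` backend is selected only so the script runs without a display; inside Jupyter the figure would normally render inline.

```python
import matplotlib
matplotlib.use("Agg")  # Non-interactive backend; not needed inside Jupyter.
import matplotlib.pyplot as plt
import numpy as np

# Synthetic data for illustration only.
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = np.sin(x)
samples = rng.normal(size=200)

# One figure with a 2x2 grid of axes, one chart type per panel.
fig, axes = plt.subplots(2, 2, figsize=(10, 8))

axes[0, 0].plot(x, y)
axes[0, 0].set_title("Line plot")

axes[0, 1].scatter(x, y + rng.normal(scale=0.2, size=x.size))
axes[0, 1].set_title("Scatter plot")

axes[1, 0].hist(samples, bins=20)
axes[1, 0].set_title("Histogram")

axes[1, 1].bar(["a", "b", "c"], [3, 7, 5])
axes[1, 1].set_title("Bar chart")

fig.tight_layout()
fig.savefig("example_plots.png")  # Placeholder output path.
```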

#### 4️⃣ **Reading and Extracting Data**
- **Reading_External_data_and_Plottng.ipynb** → How to read external datasets (CSV) and visualize data trends.
- **btc-market-price.csv** & **eth-price.csv** → Sample datasets for Bitcoin and Ethereum price trends.
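
A minimal sketch of reading a price CSV and exposing its trend might look like the following. The column layout of the real CSV files is not documented here, so the `timestamp` and `price` names and the inline values are assumptions made for the example.

```python
import io
import pandas as pd

# Hypothetical inline CSV standing in for a price file on disk.
csv_text = """timestamp,price
2021-01-01,100.0
2021-01-02,110.0
2021-01-03,105.0
2021-01-04,120.0
"""

# Parse the date column and use it as the index so the data is a time series.
prices = pd.read_csv(
    io.StringIO(csv_text),
    parse_dates=["timestamp"],
    index_col="timestamp",
)

# A rolling mean smooths day-to-day noise and makes the trend easier to see.
prices["rolling_mean_2d"] = prices["price"].rolling(window=2).mean()

print(prices)
```

With Matplotlib installed, `prices.plot()` would then draw both columns against the date index.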

#### 5️⃣ **Reading Different File Formats**
- **Reading CSV and TXT files/Main.ipynb** → Techniques for reading and processing CSV and TXT files.
- **Reading Data from Relational databases/main.ipynb** → Using Pandas and SQLAlchemy to extract data from **SQLite databases (chinook.db)**.
- **Reading Excel Files/main.ipynb** → Working with Excel files (**out.xlsx, products.xlsx**) using Pandas.
- **Reading HTML tables/Main.ipynb** → Extracting and parsing data from **HTML tables**.
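
As a sketch of the database workflow, the example below loads a SQL query result into a DataFrame and writes it back out as CSV text. It uses the standard library's `sqlite3` module rather than SQLAlchemy to stay dependency-free, and the `tracks` table is a hypothetical stand-in, not the schema of `chinook.db`.

```python
import sqlite3
import pandas as pd

# In-memory database with a hypothetical table (not chinook.db's schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tracks (id INTEGER PRIMARY KEY, title TEXT, seconds INTEGER)"
)
conn.executemany(
    "INSERT INTO tracks (title, seconds) VALUES (?, ?)",
    [("Intro", 95), ("Main Theme", 241), ("Outro", 130)],
)
conn.commit()

# Run a parameterized query and load the result into a DataFrame.
long_tracks = pd.read_sql_query(
    "SELECT title, seconds FROM tracks WHERE seconds > ? ORDER BY seconds",
    conn,
    params=(100,),
)
conn.close()

# Serialize the result as CSV text (pass a path instead to write a file).
csv_text = long_tracks.to_csv(index=False)
print(csv_text)
```

`pd.read_excel` and `pd.read_html` return DataFrames in a similar way, but they rely on optional parsers such as `openpyxl` (for `.xlsx`) and `lxml` (for HTML).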

## 🔧 **Setup & Installation**
### **Prerequisites**
Ensure you have **Python 3.8+** installed along with the following libraries:
```bash
# sqlalchemy, openpyxl, and lxml back the SQL, Excel, and HTML notebooks
pip install numpy pandas matplotlib seaborn jupyterlab sqlalchemy openpyxl lxml
```

### **Launch JupyterLab**
Navigate to the project directory and launch JupyterLab:
```bash
cd v41bh4vr4jput-data-analysis-with-python
jupyter lab
```

## 🏆 **Key Features**
- ✅ **Comprehensive Data Handling** – Cleaning, missing data handling, and manipulation.
- ✅ **Data Visualization** – Plotting and analyzing trends with Matplotlib & Seaborn.
- ✅ **File Handling** – Read and process CSV, Excel, SQL, and HTML tables.
- ✅ **Real-world Data** – Work with datasets related to finance, e-commerce, and reviews.