https://github.com/abeertechcamus/documentdata

The dataset was cleaned and queried using Python in a Jupyter Notebook and visualized using PowerBI. Document Data Analysis Projects.
dax jupyter-notebook numpy pandas powerbi python

Host: GitHub
URL: https://github.com/abeertechcamus/documentdata
Owner: Abeertechcamus
Created: 2024-10-20T10:56:55.000Z (4 months ago)
Default Branch: main
Last Pushed: 2024-10-29T19:11:04.000Z (4 months ago)
Last Synced: 2024-12-19T14:49:48.608Z (2 months ago)
Topics: dax, jupyter-notebook, numpy, pandas, powerbi, python
Homepage:
Size: 1.35 MB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

README

# DocumentData

This dashboard was built from the [Orders dataset](Orders.csv).

**Data:**

The file appears to contain 20,008 rows and 19 columns.

**Data Cleaning:**

Python (pandas)

**Data Visualization:**

PowerBI

## Overview

Here’s an overview of the data structure:

- Row 3 (index 2) has the actual header labels.

- Columns contain various details such as order date, country, city, product category, quantity, unit price, discount, and status.

- Issues include extra headers, missing values, and a lack of consistent column names.
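One way to confirm where the header labels sit is to read the file with no header at all and look for the first fully populated row. The snippet below is a minimal sketch on a hypothetical three-column sample (the inline text and its values are assumptions, not the contents of `Orders.csv`); the same idea should carry over to the real file.

```python
import io
import pandas as pd

# Hypothetical miniature of a file with title rows above the real header;
# the actual Orders.csv layout may differ.
raw = io.StringIO(
    "Orders report,,\n"
    ",,\n"
    "Order Date,City,Quantity\n"
    "2024-01-05,cairo,3\n"
    "2024-01-06,giza,1\n"
)

# header=None keeps every line as data, so the header row stays visible in place
preview = pd.read_csv(raw, header=None, skip_blank_lines=False)

# The first row with no missing cells is a reasonable guess for the header row
header_idx = int(preview.notna().all(axis=1).idxmax())
print(header_idx)  # prints 2, i.e. the third row of the sample
```

This heuristic assumes the title rows above the header contain at least one empty cell, which may not hold for every export.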

## Clean the data

The raw file has headers spread across multiple rows, and many columns are labeled "Unnamed".

I’ll clean the data by setting the correct headers, removing empty rows, and renaming columns for clarity.
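The renaming step is not spelled out in this README. As a rough sketch, once the correct headers are in place it could look like the following; the column labels and the snake_case targets are hypothetical and only illustrate the pattern.

```python
import pandas as pd

# Hypothetical raw labels; the real Orders.csv column names may differ
df = pd.DataFrame(columns=[" Order Date ", "Unit Price ", "City"])

# Trim stray whitespace around every label
df.columns = df.columns.str.strip()

# Rename selected columns to consistent snake_case names (assumed naming scheme)
df = df.rename(columns={"Order Date": "order_date", "Unit Price": "unit_price", "City": "city"})

print(list(df.columns))
```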

### Correct headers

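```
import pandas as pd

# Skip the rows above the real header so the column labels are read correctly
df = pd.read_csv(r'Orders.csv', skiprows=4)

# Preview the loaded table (rendered as notebook output)
df
```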
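To illustrate why `skiprows` matters, here is a small sketch on a hypothetical sample with two junk rows above the header. The sample text and the value `skiprows=2` are assumptions chosen for this toy input; the right value for the real file depends on its layout.

```python
import io
import pandas as pd

# Hypothetical sample: two junk rows above the real header (not the real file)
text = (
    "Orders report,,\n"
    ",,\n"
    "Order Date,City,Quantity\n"
    "2024-01-05,cairo,3\n"
)

# Without skiprows, the title row becomes the header and its empty cells
# are given placeholder labels that start with "Unnamed"
naive = pd.read_csv(io.StringIO(text))
print(list(naive.columns))

# Skipping the two junk rows promotes the real labels to the header
fixed = pd.read_csv(io.StringIO(text), skiprows=2)
print(list(fixed.columns))
```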

### Drop any completely empty rows

```
# Remove rows in which every column is missing
df.dropna(how='all', inplace=True)
```
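The choice of `how='all'` is worth a quick check, since the default `how='any'` would also discard rows that are only partly empty. A minimal sketch on a hypothetical frame:

```python
import numpy as np
import pandas as pd

# Hypothetical frame: one fully empty row and one partially filled row
df = pd.DataFrame({
    "City": ["cairo", np.nan, "giza"],
    "Quantity": [3, np.nan, np.nan],
})

# how="all" drops only the row where every cell is missing
all_empty_removed = df.dropna(how="all")

# The default how="any" also drops the partially filled row
any_empty_removed = df.dropna()

print(len(all_empty_removed), len(any_empty_removed))  # prints 2 1
```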

### Convert city names to title case where applicable

```

df['City']=df['City'].str.title()

```
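As a quick illustration of what `.str.title()` does, the sketch below runs it on a few hypothetical city values. The extra `.str.strip()` is an assumption added here to handle stray spaces; it is not part of the original step.

```python
import pandas as pd

# Hypothetical city values with mixed casing, padding, and a missing entry
cities = pd.Series(["new york", "CAIRO", " giza ", None])

# Trim whitespace, then capitalize the first letter of each word;
# missing values stay missing
cleaned = cities.str.strip().str.title()

print(cleaned.head(3).tolist())  # prints ['New York', 'Cairo', 'Giza']
```

Note that title-casing capitalizes after every non-letter character, so names containing apostrophes may need manual correction.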

### Remove "Tel:" from phone numbers and strip extra spaces

```
# Remove the literal "Tel:" prefix, then trim surrounding whitespace
df['Phone Number'] = df['Phone Number'].str.replace('Tel:', '', regex=False).str.strip()
```
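The heading above also calls for stripping extra spaces, which can be done by chaining `.str.strip()` after the replacement. A minimal sketch on hypothetical phone values (the numbers are made up for illustration):

```python
import pandas as pd

# Hypothetical phone values; the real column may use other formats
phones = pd.Series(["Tel: 0100 555 0101", "Tel:0100 555 0102 ", "0100 555 0103"])

# regex=False treats "Tel:" as a literal string on every pandas version;
# .str.strip() then removes the leading and trailing spaces left behind
cleaned = phones.str.replace("Tel:", "", regex=False).str.strip()

print(cleaned.tolist())
```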

### Display a summary of the cleaned data

```
# Print column types and non-null counts, then show the first rows
df.info()
df.head()
```
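Beyond `info()` and `head()`, a few numeric checks can confirm the cleaning worked. The helper below is a hypothetical sketch (`summarize` is not part of the original notebook) and is shown on a tiny made-up frame.

```python
import pandas as pd

def summarize(df: pd.DataFrame) -> dict:
    """Return a few quick health checks for a cleaned frame (hypothetical helper)."""
    return {
        "rows": len(df),
        "columns": df.shape[1],
        "missing_per_column": df.isna().sum().to_dict(),
        "fully_empty_rows": int(df.isna().all(axis=1).sum()),
    }

# Made-up sample standing in for the cleaned orders data
sample = pd.DataFrame({"City": ["Cairo", "Giza"], "Quantity": [3, None]})
report = summarize(sample)
print(report)
```

On the real data, the row and column counts can be compared against the figures stated in the Data section.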

To view the dashboard, open the [employee dashboard](employee_dashboard.pdf).