https://github.com/odessaz/portfolio-projects

This is a repository I have created to showcase skills, share projects and track my progress in Data Analytics and Data Science
applied-mathematics data-analysis data-science excel jupyter-notebook matplotlib-pyplot pandas portfolio python r r-studio seaborn sql statistics

# Data Analyst Portfolio Project Repository
## About Me
Hi! I'm Odessa, a third-year Mathematics and Its Applications student at Toronto Metropolitan University.

This repository highlights my skills and interest in data analytics and data science. Here, you’ll find personal projects where I explore data, apply statistical methods, and build visualizations using tools like Python, R, SQL, Excel, and Tableau. I’ll continue updating this space as I grow and create more projects!

## Table of Contents
- [About Me](https://github.com/OdessaZ/PortfolioProjects/tree/main?tab=readme-ov-file#about-me)
- [Portfolio Projects](https://github.com/OdessaZ/PortfolioProjects?tab=readme-ov-file#portfolio-projects)
  - Python
    - [Examining Key Determinants of Movie Success](https://github.com/OdessaZ/PortfolioProjects?tab=readme-ov-file#examining-key-determinants-of-movie-success)
    - [Finance Tracker](https://github.com/OdessaZ/PortfolioProjects/tree/main#2-finance-tracker)
    - [Linear Regression with Python](https://github.com/OdessaZ/PortfolioProjects#4-linear-regression-with-python)
    - [Best Selling Amazon Books Analysis](https://github.com/OdessaZ/Portfolio-Projects/blob/main/README.md#7-best-selling-amazon-books-analysis)
  - Microsoft Excel
    - [Bike Sales Data Dashboard: Interactive Excel Analysis](https://github.com/OdessaZ/PortfolioProjects#3-bike-sales-data-dashboard-interactive-excel-analysis)
  - R Studio
    - [COVID-19 Mortality Analysis](https://github.com/OdessaZ/Portfolio-Projects#6-covid-19-mortality-analysis)
  - SQL
  - Tableau
    - [HR Dashboard](https://github.com/OdessaZ/PortfolioProjects#5-hr-dashboard)
- [Certificate(s)](https://github.com/OdessaZ/PortfolioProjects#certificates)
- [Education](https://github.com/OdessaZ/PortfolioProjects?tab=readme-ov-file#education)
- [Contact](https://github.com/OdessaZ/PortfolioProjects?tab=readme-ov-file#contact)

## Portfolio Projects
In this section, I will list my data analysis projects and briefly describe each one.

### 1. Examining Key Determinants of Movie Success
**Code**: [Examining Key Determinants of Movie Success](https://github.com/OdessaZ/PortfolioProjects/blob/main/Python%20Movie%20Correlation.ipynb)

**Goal**: To determine what factors contribute the most to a movie's success.

**Description**: This project explores the correlation between various movie features (e.g., budget, revenue, rating) using Python. By analyzing a dataset of movies, the project applies data-wrangling techniques, correlation analysis, and data visualization to uncover patterns and relationships between different movie attributes. The goal is to better understand how certain factors impact movie success, providing insights that could be valuable for filmmakers, marketers, and data enthusiasts.

**Skills**: Data cleaning, Data analysis, Correlation matrices, Hypothesis Testing, Data Visualization.

**Technology**: Python, Pandas, NumPy, Seaborn, Matplotlib, SciPy.

**Results**: The correlation analysis showed that, among the features examined, votes and budget have the strongest correlation with gross earnings.
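
For readers who want a feel for the approach, here is a minimal sketch of ranking features by their correlation with gross earnings. It is not the notebook's code: the sample rows and the column names (`budget`, `votes`, `score`, `gross`) are assumptions made up for illustration.

```python
import pandas as pd

# Hypothetical sample rows; the actual notebook works with a full movie dataset.
movies = pd.DataFrame(
    {
        "budget": [10, 20, 30, 40, 50],
        "votes": [100, 150, 400, 380, 600],
        "score": [6.5, 7.9, 6.1, 7.2, 6.8],
        "gross": [25, 40, 90, 85, 140],
    }
)

# Pearson correlation of each numeric column with gross earnings,
# ordered from the strongest to the weakest absolute relationship.
corr_with_gross = (
    movies.corr(method="pearson")["gross"]
    .drop("gross")
    .sort_values(key=abs, ascending=False)
)
print(corr_with_gross)
```

Sorting by absolute value keeps strong negative correlations near the top, which a plain descending sort would push to the bottom.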

### 2. Finance Tracker
**Code**: [Finance Tracker](https://github.com/OdessaZ/PortfolioProjects/blob/main/Expense_tracker.py)

**Goal**: Get expense data from a user (via the terminal), and save and categorize each expense into an Excel spreadsheet that is automatically created and updated.

**Description**: This project is a simple finance tracker designed to help users view their finances effectively. It allows users to record and categorize their expenses while staying within their budget. This project demonstrates skills in Python programming and data handling while showcasing a practical approach to personal finance management. It highlights the potential of coding to simplify everyday tasks and improve productivity.

**Skills**: Loops, Dictionaries, Variables and Data Types, Operators, Lists, User Input, Functions, Classes, Tuples and Sets.

**Technology**: Python.
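
The core idea (record an expense, categorize it, and compare the total against a budget) can be sketched roughly as follows. This is an illustrative outline, not the contents of `Expense_tracker.py`: the `Expense` class, the helper names, the sample entries, and the budget figure are all hypothetical, and a CSV written to an in-memory buffer stands in for the Excel spreadsheet so the sketch needs only the standard library.

```python
import csv
import io
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Expense:
    name: str
    category: str
    amount: float


def total_by_category(expenses):
    """Sum expense amounts per category."""
    totals = defaultdict(float)
    for expense in expenses:
        totals[expense.category] += expense.amount
    return dict(totals)


def write_expenses(expenses, handle):
    """Write a header row and one row per expense to an open text handle."""
    writer = csv.writer(handle)
    writer.writerow(["name", "category", "amount"])
    for expense in expenses:
        writer.writerow([expense.name, expense.category, f"{expense.amount:.2f}"])


# Hypothetical entries; an interactive script would collect these with input().
expenses = [
    Expense("Groceries", "Food", 54.20),
    Expense("Bus pass", "Transport", 128.15),
    Expense("Coffee", "Food", 4.75),
]

budget = 200.00  # hypothetical monthly budget
totals = total_by_category(expenses)
remaining = budget - sum(totals.values())

buffer = io.StringIO()  # stands in for a file on disk
write_expenses(expenses, buffer)
print(totals, f"remaining: {remaining:.2f}")
```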

### 3. Bike Sales Data Dashboard: Interactive Excel Analysis
**Code**: [Bike Sales Data Dashboard: Interactive Excel Analysis](https://github.com/OdessaZ/PortfolioProjects/blob/main/Excel%20Project.xlsx)

**Goal**: To create an interactive dashboard using Excel that visualizes demographic data and allows users to filter and analyze trends in bike-related purchases based on factors like marital status, education, and region.

**Description**: This project demonstrates the creation of an interactive, data-driven dashboard using Microsoft Excel. It includes the integration of multiple pivot charts, slicers for filtering key demographic data, and various visual enhancements to create a user-friendly interface. The dashboard allows for detailed analysis of various data fields, such as marital status, education level, and region, with the ability to slice and dice the information for deeper insights. This project showcases proficiency in Excel data visualization and manipulation.

**Skills**: Data Visualization, Data Cleaning, Pivot Tables, Slicers, Report Connections, and Filtering and Analyzing Data Based on Multiple Variables.

**Technology**: Microsoft Excel

**Results**: An interactive and visually appealing dashboard that allows users to explore and analyze key data insights, such as demographic trends, purchasing behavior, and regional patterns, using pivot tables and slicers in Excel. The dashboard enables dynamic filtering and comparison across various variables, offering valuable insights for decision-making.

### 4. Linear Regression with Python
**Code**: [Linear Regression with Python](https://github.com/OdessaZ/PortfolioProjects/blob/main/Linear%20Regression%20with%20Python.ipynb)

**Goal**: To understand and implement linear regression using Python, applying statistical learning concepts to analyze data and make predictions. This involved preprocessing data, building and evaluating a regression model, and visualizing results to gain insights.

**Description**: This project focuses on implementing linear regression in Python to explore the relationship between variables and make predictions. It begins with data preprocessing using libraries like Pandas and NumPy, then building and training a regression model with scikit-learn. The project emphasizes evaluating model performance through metrics such as R-squared and Mean Squared Error, ensuring accurate predictions. Data visualizations created with Matplotlib and Seaborn provide insights into trends and the model's fit. This project demonstrates practical skills in data analysis, statistical modelling, and machine learning applications.

**Skills**: Data Visualization, Statistical Modeling, Model Development and Evaluation, and Scikit-learn.

**Technology**: Python

**Results**: The linear regression model showed a strong relationship between the variables, making accurate predictions possible. The R-squared value indicated that the model explained the data well, and the residual analysis showed it met key assumptions. Together, these results suggest that the model worked effectively for the dataset.
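
As a rough illustration of the workflow described above (split, fit, then evaluate with R-squared and Mean Squared Error), here is a minimal scikit-learn sketch. It does not reproduce the notebook: the data is synthetic, generated under the assumption `y = 3x + 5` plus noise, and all variable names are illustrative.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic data (an assumption for illustration): y = 3x + 5 plus Gaussian noise.
rng = np.random.default_rng(seed=0)
X = rng.uniform(0, 10, size=(200, 1))
y = 3.0 * X[:, 0] + 5.0 + rng.normal(0, 1.0, size=200)

# Hold out 20% of the rows so the metrics reflect unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = LinearRegression().fit(X_train, y_train)
y_pred = model.predict(X_test)

r2 = r2_score(y_test, y_pred)
mse = mean_squared_error(y_test, y_pred)
residuals = y_test - y_pred  # inspect these to check model assumptions

print(f"slope={model.coef_[0]:.2f} intercept={model.intercept_:.2f}")
print(f"R^2={r2:.3f} MSE={mse:.3f}")
```

Plotting `residuals` against `y_pred` is one common way to check for non-constant variance or curvature that a straight line misses.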

### 5. HR Dashboard
**Code**: [HR Dashboard](https://github.com/OdessaZ/PortfolioProjects/blob/main/HR%20Dashboard%20-%20Tableau%20Project.twbx)

**Goal**: To create a comprehensive HR dashboard in Tableau that provides both high-level insights and detailed employee records for in-depth analysis. The dashboard gives HR managers a tool for analyzing human resources data across three main sections: Overview, Demographics, and Income Analysis.

**Description**: This project involves the development of a comprehensive HR dashboard in Tableau designed to provide HR managers with a robust tool for analyzing human resources data. The dashboard is structured into three primary sections: Overview, Demographics, and Income Analysis. The Overview section offers insights into key HR metrics, such as employee counts and departmental distributions. The Demographics section delves into workforce composition, including gender ratios, age groups, and education levels. The Income Analysis section examines salary trends across different demographics. Additionally, the dashboard includes a detailed view of employee records that allows for filtering and in-depth analysis of individual employee data. This interactive platform enables HR managers to gain high-level insights and drill down into specific details, facilitating informed decision-making and strategic workforce management.

**Skills**: Data Visualization, Data Collection, and Data Presentation

**Technology**: Tableau

### 6. COVID-19 Mortality Analysis
**Code**: [COVID-19 Mortality Analysis with R](https://github.com/OdessaZ/Portfolio-Projects/blob/main/script.R)

**Data**: [COVID-19 Data](https://github.com/OdessaZ/Portfolio-Projects/blob/main/COVID19_line_list_data.csv)

**Goal**: To explore the relationship between age, gender, and COVID-19 mortality using real-world data. The goal was to clean and prepare the data, perform statistical tests, and interpret the results to draw meaningful conclusions about health outcomes.

**Description**: This project focuses on analyzing a COVID-19 dataset using R to identify patterns in mortality based on age and gender. The workflow includes importing and cleaning the data, creating a binary variable to represent the death outcomes, and conducting exploratory analysis. Statistical methods such as t-tests were used to compare death rates between age groups and genders and to assess whether the differences were statistically significant. Summary statistics and hypothesis testing were applied to interpret mortality trends and draw conclusions. This project highlights practical skills in data cleaning, hypothesis testing, and interpretation of real-world health data.

**Skills**: Statistical Analysis, Data Cleaning, Hypothesis Testing, Data Interpretation

**Technology**: R Studio

**Results**: The analysis revealed that older individuals had a significantly higher death rate compared to younger individuals, and men had a higher mortality rate than women. The t-tests confirmed that both age and gender differences were statistically significant, supporting the need for targeted public health responses.
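
The project itself is written in R; the sketch below shows the same kind of two-sample comparison in Python with SciPy, purely as an illustration. The outcomes are synthetic 0/1 values drawn with arbitrary rates chosen for the example, not figures from the dataset. `equal_var=False` requests Welch's t-test, which is also what R's `t.test` uses by default.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=1)

# Synthetic binary death outcomes (1 = died, 0 = survived).
# The rates 0.15 and 0.05 are arbitrary assumptions for this sketch.
group_a = rng.binomial(1, 0.15, size=2000)
group_b = rng.binomial(1, 0.05, size=2000)

# Welch's two-sample t-test: does the mean death rate differ between groups?
result = stats.ttest_ind(group_a, group_b, equal_var=False)

print(f"rate A={group_a.mean():.3f} rate B={group_b.mean():.3f}")
print(f"t={result.statistic:.2f} p={result.pvalue:.3g}")
```

A p-value below the chosen significance level (commonly 0.05) would lead to rejecting the hypothesis that the two groups share the same death rate.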

### 7. Best Selling Amazon Books Analysis
**Code**: [Amazon Best Sellers Analysis](https://github.com/OdessaZ/Portfolio-Projects/blob/main/Amazon-Best-Sellers-Analysis.ipynb)

**Data**: [Bestsellers.csv](https://github.com/OdessaZ/Portfolio-Projects/blob/main/bestsellers.csv)

**Goal**: To analyze trends in Amazon’s best-selling books using Python and the pandas library. The goal was to clean and explore the dataset, uncover insights about authors, publishers, and book prices, and visualize patterns in book sales.

**Description**: This project uses Python and pandas to analyze a dataset of the top 50 best-selling books on Amazon. The workflow involved importing and cleaning the data, identifying the most frequent authors and publishers, and exploring trends in price distribution and user ratings. The analysis included visualizing results to better understand which types of books dominate sales and how pricing and ratings relate to popularity. This project showcases essential data analysis techniques and provides insights into book sales trends on a major retail platform.

**Skills**: Data Cleaning, Exploratory Data Analysis, Data Visualization, Trend Analysis

**Technology**: Python, Pandas, Jupyter Notebook

**Results**: The analysis revealed that certain authors appear more frequently among bestsellers, prices are generally clustered within a narrow range, and higher-rated books are not always the most expensive. These insights demonstrate consumer preferences and offer valuable perspectives for publishers and sellers.
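
The kind of exploration described above can be sketched in a few lines of pandas. This is an illustrative example only: the rows are made up, and the column names (`title`, `author`, `rating`, `price`) are assumptions that may not match `bestsellers.csv`.

```python
import pandas as pd

# Hypothetical rows; the real analysis reads the bestsellers CSV instead.
books = pd.DataFrame(
    {
        "title": ["A", "B", "C", "D", "E"],
        "author": ["Author X", "Author Y", "Author X", "Author Z", "Author X"],
        "rating": [4.8, 4.6, 4.7, 4.9, 4.5],
        "price": [8, 22, 12, 6, 15],
    }
)

# Basic cleaning: drop exact duplicate rows and rows with missing values.
books = books.drop_duplicates().dropna()

# Which authors appear most often among the bestsellers?
top_authors = books["author"].value_counts()

# How are prices distributed, and do they move with ratings?
price_summary = books["price"].describe()
price_rating_corr = books["price"].corr(books["rating"])

# The highest-rated book, to compare its price against the maximum.
top_rated = books.sort_values("rating", ascending=False).head(1)

print(top_authors.head(3))
print(price_summary)
print(f"price/rating correlation: {price_rating_corr:.2f}")
```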

## Certificates
[Google Data Analytics Professional Certificate](https://coursera.org/share/b7690839c5dcd23ff5f7f26a751b28ed) - February 2025

## Education
Toronto Metropolitan University: Bachelor's Degree, Mathematics and Its Applications, 2022

## Contact
- LinkedIn: [@odessazang](https://www.linkedin.com/in/odessa-zang/)
- Email: [odessazang@gmail.com](mailto:odessazang@gmail.com)