{"id":28324921,"url":"https://github.com/prajjwol09/sql_retail_analysis_project","last_synced_at":"2026-02-15T13:37:28.618Z","repository":{"id":294210762,"uuid":"986260190","full_name":"Prajjwol09/SQL_Retail_Analysis_Project","owner":"Prajjwol09","description":"This project demonstrates SQL-based data cleaning, exploration, and business analysis on a retail sales dataset. It involves setting up a database, removing null values, performing EDA, and using SQL queries to extract key insights such as top customers, best-selling categories, and monthly sales trends.","archived":false,"fork":false,"pushed_at":"2025-05-19T11:06:44.000Z","size":54,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-10-07T09:35:59.052Z","etag":null,"topics":["data","data-analysis","datacleaning","dataexploration","pgadmin4","sql"],"latest_commit_sha":null,"homepage":"","language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Prajjwol09.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-05-19T10:47:10.000Z","updated_at":"2025-05-19T15:25:55.000Z","dependencies_parsed_at":"2025-05-19T11:53:47.385Z","dependency_job_id":null,"html_url":"https://github.com/Prajjwol09/SQL_Retail_Analysis_Project","commit_stats":null,"previous_names":["prajjwol09/sql_retail_analysis_project"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/Prajjwol09/SQL_Retail_Analysis_Project","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Prajjw
ol09%2FSQL_Retail_Analysis_Project","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Prajjwol09%2FSQL_Retail_Analysis_Project/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Prajjwol09%2FSQL_Retail_Analysis_Project/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Prajjwol09%2FSQL_Retail_Analysis_Project/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Prajjwol09","download_url":"https://codeload.github.com/Prajjwol09/SQL_Retail_Analysis_Project/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Prajjwol09%2FSQL_Retail_Analysis_Project/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29480298,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-15T11:35:25.641Z","status":"ssl_error","status_checked_at":"2026-02-15T11:34:57.128Z","response_time":118,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data","data-analysis","datacleaning","dataexploration","pgadmin4","sql"],"created_at":"2025-05-25T19:13:31.611Z","updated_at":"2026-02-15T13:37:28.592Z","avatar_url":"https://github.com/Prajjwol09.png","language":null,"funding_links":[],"categories":[],"sub_categories":[],"readme":"# Retail Sales Analysis SQL Project\r\n\r\n## 
Project Overview\r\n\r\n**Project Title**: Retail Sales Analysis  \r\n **Database**: `sql_db_2`\r\n\r\nThis project was done to demonstrate SQL skills and techniques used by data analysts to explore, clean, and analyze retail sales data. The project involves setting up a retail sales database, performing exploratory data analysis (EDA), and answering specific business questions through SQL queries. \r\n\r\n## Objectives\r\n\r\n1. **Set up a retail sales database**: Create and populate a retail sales database with the provided sales data.\r\n2. **Data Cleaning**: Identify and remove any records with missing or null values.\r\n3. **Exploratory Data Analysis (EDA)**: Perform basic exploratory data analysis to understand the dataset.\r\n4. **Business Analysis**: Use SQL to answer specific business questions and derive insights from the sales data.\r\n\r\n## Project Structure\r\n\r\n### 1. Database Setup\r\n\r\n- **Database Creation**: The project starts by creating a database named `sql_db_2`.\r\n- **Table Creation**: A table named `Sales_tbl` is created to store the sales data. The table structure includes columns for transaction ID, sale date, sale time, customer ID, gender, age, product category, quantity sold, price per unit, cost of goods sold (COGS), and total sale amount.\r\n\r\n```sql\r\nCREATE DATABASE sql_db_2;\r\n\r\n-- CREATING TABLE\r\nCREATE TABLE Sales_tbl (\r\n\ttransactions_id  INT PRIMARY KEY,\r\n\tsale_date DATE,\r\n\tsale_time  TIME,\r\n\tcustomer_id INT,\r\n\tgender VARCHAR(20),\r\n\tage INT,\r\n\tcategory VARCHAR(20),\r\n\tquantity INT,\r\n\tprice_per_unit FLOAT,\r\n\tcogs FLOAT,\r\n\ttotal_sale FLOAT\r\n);\r\n```\r\n\r\n### 2. 
Data Exploration \u0026 Cleaning\r\n\r\n- **Record Count**: Determine the total number of records in the dataset.\r\n- **Customer Count**: Find out how many unique customers are in the dataset.\r\n- **Category Count**: Identify all unique product categories in the dataset.\r\n- **Null Value Check**: Check for any null values in the dataset and delete records with missing data.\r\n\r\n```sql\r\n-- DATA CLEANING\r\nSELECT COUNT(*) FROM SALES_TBL;\r\n\r\nSELECT * FROM SALES_TBL;\r\n\r\nSELECT * FROM SALES_TBL\r\nWHERE TRANSACTIONS_ID IS NULL;\r\n\r\n-- CHECKING NULL VALUES\r\nSELECT * FROM SALES_TBL\r\nWHERE \r\n\ttransactions_id IS NULL\r\nOR\r\n    sale_date IS NULL\r\nOR\r\n    sale_time IS NULL\r\nOR\r\n    customer_id IS NULL\r\nOR\r\n    gender IS NULL\r\nOR\r\n    age IS NULL\r\nOR\r\n    category IS NULL\r\nOR\r\n    quantity IS NULL\r\nOR\r\n    price_per_unit IS NULL\r\nOR\r\n    cogs IS NULL\r\nOR\r\n    total_sale IS NULL;\r\n\r\n-- DELETING NULL VALUES\r\nDELETE FROM SALES_TBL\r\nWHERE\r\ntransactions_id IS NULL\r\nOR\r\n    sale_date IS NULL\r\nOR\r\n    sale_time IS NULL\r\nOR\r\n    customer_id IS NULL\r\nOR\r\n    gender IS NULL\r\nOR\r\n    age IS NULL\r\nOR\r\n    category IS NULL\r\nOR\r\n    quantity IS NULL\r\nOR\r\n    price_per_unit IS NULL\r\nOR\r\n    cogs IS NULL\r\nOR\r\n    total_sale IS NULL;\r\n\r\n-- DATA EXPLORATION\r\n-- CHECKING SALES\r\nSELECT COUNT(*) AS total_sales from sales_tbl;\r\n\r\n-- CHECKING NUMBER OF UNIQUE CUSTOMERS\r\nSELECT COUNT(DISTINCT customer_id) AS total_customers from sales_tbl;\r\n\r\n-- CHECKING CATEGORIES\r\nSELECT DISTINCT(category) from sales_tbl;\r\n```\r\n\r\n### 3. 
Data Analysis \u0026 Findings\r\n\r\nThe following SQL queries were developed to answer specific business questions:\r\n```sql\r\n-- RETRIEVING ALL THE SALES MADE ON '2022-11-05'\r\nSELECT transactions_id, category FROM SALES_TBL\r\nWHERE sale_date = '2022-11-05';\r\n\r\n\r\n-- RETRIEVING TRANSACTIONS WHERE CATEGORY IS 'CLOTHING' AND QUANTITY SOLD IS MORE THAN OR EQUALS TO 4 IN THE MONTH OF NOV-2022\r\nSELECT * FROM SALES_TBL\r\nWHERE category = 'Clothing'\r\nAND quantity \u003e= 4 \r\nAND TO_CHAR(SALE_DATE, 'YYYY-MM') = '2022-11';\r\n\r\n\r\n-- CALCULATING THE TOTAL SALES FOR EACH CATEGORY\r\nSELECT category, \r\nSUM(total_sale) AS total_sales,\r\nCOUNT(*) AS total_orders FROM SALES_TBL\r\nGROUP BY category;\r\n\r\n\r\n-- FINDING AVERAGE AGE OF CUSTOMERS WHO PURCHASED FROM 'BEAUTY' CATEGORY\r\nSELECT ROUND(AVG(AGE), 2) FROM SALES_TBL\r\nWHERE category = 'Beauty';\r\n\r\n\r\n-- FINDING TRANSACTIONS WHERE TOTAL SALES IS GREATER THAN 1000\r\nSELECT * FROM SALES_TBL\r\nWHERE total_sale \u003e 1000;\r\n\r\n\r\n-- FINDING TOTAL NUMBER OF TRANSACTIONS ID MADE BY EACH GENDER IN EACH CATEGORY\r\nSELECT category, gender, COUNT(*) FROM SALES_TBL\r\nGROUP BY category, gender\r\nORDER BY category;\r\n\r\n\r\n-- CALCULATE AVERAGE SALE FOR EACH MONTH, FINDING THE BEST SELLING MONTH IN EACH YEAR.\r\nSELECT \r\n\tyear, month, avg_sale FROM \r\n(\r\nSELECT \r\nEXTRACT (YEAR FROM sale_date) as year,\r\nEXTRACT (MONTH FROM sale_date) as month,\r\nAVG(total_sale) AS avg_sale,\r\nRANK () OVER(PARTITION BY EXTRACT (YEAR FROM sale_date) ORDER BY AVG(total_sale) DESC) AS rank\r\nFROM SALES_TBL \r\nGROUP BY 1, 2\r\n) AS TBL1\r\nWHERE rank = 1;\r\n\r\n\r\n-- FINDING TOP 5 CUSTOMERS BASED ON THE HIGHEST TOTAL SALES \r\nSELECT customer_id, SUM(total_sale) as total_Sales FROM SALES_TBL\r\nGROUP BY 1\r\nORDER BY total_sales DESC\r\nLIMIT 5;\r\n\r\n\r\n-- FINDING UNIQUE CUSTOMERS WHO PURCHASED FROM EACH CATEGORY\r\nSELECT category, COUNT(DISTINCT(customer_id)) AS unique_cust FROM 
SALES_TBL\r\nGROUP BY 1;\r\n\r\n\r\n-- CREATING SHIFT AND FINDING THE NUMBER OF ORDERS\r\nWITH hourly_Sales\r\nAS\r\n(\r\nSELECT *,\r\n\tCASE \r\n\tWHEN EXTRACT(HOUR FROM sale_time) \u003c 12 THEN 'Morning'\r\n\tWHEN EXTRACT(HOUR FROM sale_time) BETWEEN 12 AND 17 THEN 'Afternoon'\r\n\tELSE 'Night'\r\n\tEND AS Shifts\r\n\tFROM SALES_TBL\r\n)\r\nSELECT Shifts, COUNT(*) AS total_orders FROM hourly_Sales\r\nGROUP BY Shifts;\r\n```\r\n\r\n## Findings\r\n\r\n- **Customer Demographics**: The dataset includes customers from various age groups, with sales distributed across different categories such as Clothing and Beauty.\r\n- **High-Value Transactions**: Several transactions had a total sale amount greater than 1000, indicating premium purchases.\r\n- **Sales Trends**: Monthly analysis shows variations in sales, helping identify peak seasons.\r\n- **Customer Insights**: The analysis identifies the top-spending customers and the most popular product categories.\r\n\r\n## Reports\r\n\r\n- **Sales Summary**: A detailed report summarizing total sales, customer demographics, and category performance.\r\n- **Trend Analysis**: Insights into sales trends across different months and shifts.\r\n- **Customer Insights**: Reports on top customers and unique customer counts per category.\r\n\r\n\r\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fprajjwol09%2Fsql_retail_analysis_project","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fprajjwol09%2Fsql_retail_analysis_project","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fprajjwol09%2Fsql_retail_analysis_project/lists"}