{"id":31579015,"url":"https://github.com/samyomb/olist-ecommerce-analytics","last_synced_at":"2025-10-05T20:45:08.583Z","repository":{"id":317887176,"uuid":"1069218895","full_name":"Samyomb/Olist-Ecommerce-analytics","owner":"Samyomb","description":"Olist e-commerce performance \u0026 customer reviews — Python cleaning + BigQuery SQL + Looker Studio dashboard (2017 FY \u0026 2018 YTD) with actionable insights","archived":false,"fork":false,"pushed_at":"2025-10-03T17:18:37.000Z","size":1900,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-10-03T18:34:54.572Z","etag":null,"topics":["analytics","bigquery","brasil","customer-experience","dashboard","data-visualization","e-commerce","looker-studio","olist","python","review","sql"],"latest_commit_sha":null,"homepage":"https://lookerstudio.google.com/u/0/reporting/9bf77bf9-0da9-4c79-8745-28d9893c480a","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Samyomb.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-10-03T15:28:26.000Z","updated_at":"2025-10-03T17:18:40.000Z","dependencies_parsed_at":"2025-10-03T18:34:59.958Z","dependency_job_id":"32ded297-d8f8-4cd6-a71c-d04a05240f29","html_url":"https://github.com/Samyomb/Olist-Ecommerce-analytics","commit_stats":null,"previous_names":["samyomb/olist-ecommerce-analytics"],"tags_count":null,"template":false,"template_full_name":null,"purl":"
pkg:github/Samyomb/Olist-Ecommerce-analytics","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Samyomb%2FOlist-Ecommerce-analytics","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Samyomb%2FOlist-Ecommerce-analytics/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Samyomb%2FOlist-Ecommerce-analytics/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Samyomb%2FOlist-Ecommerce-analytics/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Samyomb","download_url":"https://codeload.github.com/Samyomb/Olist-Ecommerce-analytics/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Samyomb%2FOlist-Ecommerce-analytics/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":278517840,"owners_count":26000173,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-05T02:00:06.059Z","response_time":54,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["analytics","bigquery","brasil","customer-experience","dashboard","data-visualization","e-commerce","looker-studio","olist","python","review","sql"],"created_at":"2025-10-05T20:45:06.645Z","updated_at":"2025-10-05T20:45:08.575Z","avatar_url":"https://github.com/Samyomb.png","language":"Jupyter 
Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Olist E-commerce analytics\nOlist e-commerce performance \u0026 customer reviews: Python cleaning + BigQuery SQL + Looker Studio dashboard (2017 FY \u0026 2018 YTD) with actionable insights\n\nLive dashboard:\nhttps://lookerstudio.google.com/reporting/9bf77bf9-0da9-4c79-8745-28d9893c480a\n\n## Results\n\n2017 (full year): GMV R$ 7.02M, 45.1K orders, AOV R$ 155.7, Canceled 0.59%, Avg delivery 13 days, Late rate 6.4%, 44,573 reviews (14.7% negative).\n\n2018 YTD (Jan–Aug vs Jan–Aug 2017): GMV R$ 8.55M (+140.9%), Orders 54.0K (+135.1%), AOV R$ 158.4 (+2.5%), Canceled 0.58% (−0.16 pp), Late rate 9.2% (↑ vs 4.0% PY).\n\nTop negative reason: Delivery delay (~49–52% of negatives).\n\nHighest severity reason: Wrong / Not as described (~75–80% rated 1–2/5).\n\nImpact plan (next 30–60 days): cut late rate −1 pp in hotspot states (AL/MA/CE), reduce negative reviews −1 pp (delivery-delay share −5 pp), grow AOV +1–2% on high-ticket categories, keep canceled ≤ 0.6%.\n\n## Why I built this\n\nTo understand Olist’s sales performance, category mix, and customer pain points, then turn the findings into an actionable plan that Ops, CX and Growth can execute.\n\n## Stack\n\nPython – cleaning (dropping missing values, duplicate rows, and unneeded columns)\n\nBigQuery – SQL table + view final_data_enriched\n\nLooker Studio – interactive 3-page report (report-level filters)\n\nSQL – CTEs, date logic, KPI calculations\n\n## Data model\n\nOne row per order with:\n\ngmv = price + freight\n\nproduct_category \u0026 customer_state (cleaned)\n\norder_purchase_timestamp, order_delivered_customer_date, order_estimated_delivery_date\n\ndelivery_days = date diff (delivered − purchase)\n\nis_late = delivered \u003e estimated\n\nReviews analyzed by score bands (1–2 / 3 / 4–5) and reason (grouped)\n\n## KPI definitions (BigQuery-style)\n\n### -- GMV\nSUM(oi.price + oi.freight_value) AS gmv\n\n### -- Orders\nCOUNT(DISTINCT o.order_id) AS 
orders_total\n\n### -- AOV\nSAFE_DIVIDE(SUM(oi.price + oi.freight_value), COUNT(DISTINCT o.order_id)) AS aov\n\n### -- Canceled %\nSAFE_DIVIDE(COUNTIF(o.order_status = 'canceled'),\n            COUNT(DISTINCT o.order_id)) AS canceled_rate\n\n### -- Late rate\nSAFE_DIVIDE(COUNTIF(order_delivered_customer_date \u003e order_estimated_delivery_date),\n            COUNT(DISTINCT o.order_id)) AS late_rate\n\n### -- Avg delivery time (days)\nAVG(DATE_DIFF(order_delivered_customer_date, order_purchase_timestamp, DAY)) AS avg_delivery_days\n\n### -- Negative reviews %\nSAFE_DIVIDE(COUNTIF(review_score \u003c= 2), COUNT(*)) AS negative_reviews_pct\n\n## What’s inside the report\n\n### Page 1 – Sales Performance\nKPI cards (GMV, Orders, AOV, Canceled %, Avg delivery days, Late rate), GMV \u0026 Orders trend (combo), Top-10 states, Late rate by state.\n\n\n[![Page 1 - Sales Dashboard](assets/screenshots/sales_dashboard_2017.png)](https://lookerstudio.google.com/u/0/reporting/9bf77bf9-0da9-4c79-8745-28d9893c480a/page/iFgZF)\n\n\n### Page 2 – Category Performance\nKPI cards, Growth vs PY by category (GMV share \u0026 Δ%), Category × State heatmap, Top-5 GMV by category, Top-5 AOV by category.\n\n\n[![Page 2 - Category Heatmap](assets/screenshots/category_2017.png)](https://lookerstudio.google.com/u/0/reporting/9bf77bf9-0da9-4c79-8745-28d9893c480a/page/p_zv98sabpwd)\n\n\n### Page 3 – Customer Reviews \u0026 Reasons\n100% stacked review trend (High/Medium/Low), Severity × Prevalence bubble chart, table with Share of all reviews, % negative within reason, % of total negatives.\n\n\n[![Page 3 - Reviews](assets/screenshots/reviews_2017.png)](https://lookerstudio.google.com/u/0/reporting/9bf77bf9-0da9-4c79-8745-28d9893c480a/page/p_iy8wkjepwd)\n\n\n\nFilters (report-level, persistent across pages): Date · State · Category.\nThe Clear all button resets the whole report.\n\n## Key insights (data-driven)\n\n2018 is scaling fast (like-for-like): GMV +140.9%, Orders +135.1%, AOV +2.5%.\n\nRisk: 
on-time delivery – Late rate rose to 9.2% (from 4.0% PY Jan–Aug); hotspots: Alagoas → Maranhão → Ceará.\n\nCustomer voice – Delivery delay drives about half of all negative reviews; Wrong/Not as described has very high severity.\n\nGrowth levers – High-ticket Computers (AOV ~R$1.2–1.3K) and selected Appliances; GMV leaders Bedding \u0026 Bath, Health \u0026 Beauty, Sports \u0026 Leisure.\n\nRegional play – SP/RJ/MG support premium AOV; target local bundles \u0026 financing.\n\n## Actions \u0026 expected impact\n\nOps – Reduce late deliveries (AL/MA/CE): tighten carrier SLAs, add a fallback carrier, daily on-time dashboard \u0026 alerts, clearer cut-offs, first-mile controls, proactive ETA comms (auto SMS/e-mail when ETA \u003e SLA with new ETA + goodwill).\n\nCX – Fix “Not as described”: double-scan at packout + photo proof; clarify PDP titles/variants/photos.\n\nGrowth – Monetize high-AOV \u0026 protect volume: bundles + financing for Computers/Appliances (geo-target SP/RJ/MG); for Bedding \u0026 Bath / Health \u0026 Beauty ensure stock health, price monitoring, cross-sell.\n\n## Success metrics (60 days)\n\nLate rate −1 pp (AL/MA/CE)\n\nNegative reviews −1 pp (delivery-delay share −5 pp)\n\nAOV +1–2% on high-ticket categories\n\nCanceled ≤ 0.6%\n\n## How to run\n\nBigQuery\n\nBuild a view/table final_data_enriched with the fields above and the KPI logic.\n\nEnsure date fields are DATE/TIMESTAMP and categories/states are cleaned.\n\nLooker Studio\n\nConnect the table \u0026 view as a data source.\n\nUse report-level controls (Date, State, Category).\n\nFor YoY on charts, set Comparison date range = Previous year.\n\n## Limitations\n\n2016 data is partial (Sep–Dec). 
2018 data available Jan–Aug only; YoY comparisons use Jan–Aug vs Jan–Aug.\n\nReview reasons are grouped; linking every review to an order may vary by dataset completeness.\n\n## Contact\n\nSamy Bouhassoune – Data Analyst\nLinkedIn: https://www.linkedin.com/in/samy-bouhassoune · Email: samyy.b@hotmail.fr\n\n## 🇫🇷 Résumé\n\n2017 : R$ 7.02M GMV, 45.1K commandes, AOV R$ 155.7, Annulés 0.59%, Livraison 13 j, Retards 6.4%, 44,573 avis (14.7% négatifs).\n\n2018 YTD (jan–août vs N-1) : GMV +140.9%, Commandes +135.1%, AOV +2.5%, Annulés 0.58% (−0.16 pt), Retards 9.2%.\n\nPriorités : baisser les retards (AL/MA/CE), corriger “non conforme”, pousser les catégories à fort AOV, protéger les best-sellers.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsamyomb%2Folist-ecommerce-analytics","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsamyomb%2Folist-ecommerce-analytics","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsamyomb%2Folist-ecommerce-analytics/lists"}