{"id":50862881,"url":"https://github.com/soufianboukir/ecom-analytics-platform","last_synced_at":"2026-06-14T22:31:19.723Z","repository":{"id":357131422,"uuid":"1226725457","full_name":"soufianboukir/ecom-analytics-platform","owner":"soufianboukir","description":"End-to-end data science project on an Amazon sales dataset, including data preprocessing, analysis, modeling, and a Streamlit dashboard for insights and decision-making.","archived":false,"fork":false,"pushed_at":"2026-05-11T12:23:56.000Z","size":12751,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-05-11T14:22:15.377Z","etag":null,"topics":["data-analysis","data-science","data-visualization","data-visualization-dashboard","forecasting-models","timeseries"],"latest_commit_sha":null,"homepage":"https://ecom-analytics-forecasting-platform.streamlit.app/","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/soufianboukir.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-05-01T18:59:47.000Z","updated_at":"2026-05-11T12:24:01.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/soufianboukir/ecom-analytics-platform","commit_stats":null,"previous_names":["soufianboukir/ecom-analytics-platform"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/soufianboukir/ecom-analytics-platform","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/soufianboukir%2Fecom-analytics-platform","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/soufianboukir%2Fecom-analytics-platform/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/soufianboukir%2Fecom-analytics-platform/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/soufianboukir%2Fecom-analytics-platform/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/soufianboukir","download_url":"https://codeload.github.com/soufianboukir/ecom-analytics-platform/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/soufianboukir%2Fecom-analytics-platform/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34340780,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-14T02:00:07.365Z","response_time":62,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data-analysis","data-science","data-visualization","data-visualization-dashboard","forecasting-models","timeseries"],"created_at":"2026-06-14T22:31:19.169Z","updated_at":"2026-06-14T22:31:19.714Z","avatar_url":"https://github.com/soufianboukir.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# E-Commerce Analytics \u0026 Forecasting Platform\n\nAn end-to-end data science and business intelligence system built on 100,000 Amazon-style sales transactions — combining interactive dashboards, customer analytics, product performance analysis, and multi-model revenue forecasting.\n\n## 📄 Report\n- Full analysis report: [PDF](https://github.com/soufianboukir/ecom-analytics-platform/blob/main/reports/customer-analystics-and-revenue-forecasting-system.pdf)\n- Streamlit dashboard: [Live App](https://ecom-analytics-forecasting-platform.streamlit.app/)\n\n\u003cimg width=\"2048\" height=\"1032\" alt=\"image\" src=\"https://github.com/user-attachments/assets/a8200779-4919-4896-9408-4b345627051e\" /\u003e\n\n\n## Overview\n\nThis project develops a **5-page interactive Streamlit dashboard** that transforms raw e-commerce transactional data into actionable business intelligence. It covers the full data science pipeline:\n\n- **Exploratory Data Analysis** — distributions, correlations, trends\n- **Customer Analytics** — lifetime value, geographic distribution, payment behavior\n- **Product Performance** — revenue, margins, brand comparison, drilldown\n- **Revenue Forecasting** — 4 models benchmarked, XGBoost selected for 12-month forecast\n- **Return Analysis** — country × category heatmap, top returned products\n\nThe system is designed to answer real business questions:\n- Which customers, products, and geographies drive the most revenue?\n- Do discounts actually increase revenue?\n- What will revenue look like over the next 12 months?\n- Which products are being returned most and why?\n\n\n---\n\n## Project Structure\n\n```\necom-analytics-platform/\n│\n├── app/\n│   ├── app.py                        # Main Streamlit entry point\n│   └── pages/\n│       ├── overview.py   # Page 1 — KPIs, revenue trend, top countries\n│       ├── analysis.py       # Page 2 — Revenue by category, discounts, shipping\n│       ├── customer_insights.py    # Page 3 — LTV, geography, payment methods\n│       ├── product_performance.py  # Page 4 — Products, margins, brands, drilldown\n│       └── forecasting.py          # Page 5 — Multi-model forecasting system\n│\n├── data/\n│   ├── raw/\n│   │   └── amazon_sales.csv          # Original dataset\n│   └── processed/\n│       ├── amazon_sales_final.csv    # Cleaned \u0026 feature-engineered dataset\n│       └── amazon_sales.py\n│\n│\n├── notebooks/\n│   ├── 01_data_cleaning.ipynb\n│   ├── 02_exploratory_data_analysis.ipynb \n│   └── 03_feature_engineering.ipynb  \n│\n├── requirements.txt                  # Python dependencies\n├── README.md                 \n└── report/\n    └── main.pdf                      # Full academic report\n```\n\n---\n\n## Dashboard Pages\n\n### Page 1 — Executive Overview\nHigh-level business snapshot with 4 KPI cards, monthly revenue trend, top 5 countries by revenue, top 5 categories donut chart, and order status breakdown.\n\n| KPI | Value |\n|---|---|\n| Total Revenue | $91,825,648 |\n| Total Orders | 100,000 |\n| Avg Order Value | $918.26 |\n| Return Rate | 6.2% |\n\n---\n\n### Page 2 — Sales Analysis\n- Revenue breakdown by Category, Brand, and Payment Method\n- Discount vs Revenue scatter analysis\n- Shipping cost distribution by country (box plots)\n- Seasonal revenue patterns\n\n---\n\n### Page 3 — Customer Insights\n- Top 20 customers by Lifetime Value (LTV) — horizontal gradient bar chart\n- Customer geographic distribution — US choropleth map + city bar chart\n- Average order value by payment method — grouped bar + revenue share donut\n- Export buttons for LTV and payment summary CSVs\n\n---\n\n### Page 4 — Product Performance\n- Best-selling products — Top 20 by revenue and by quantity (tabbed)\n- Category margin analysis — Revenue vs Shipping vs Tax vs Margin grouped bar\n- Brand bubble chart — Avg unit price vs Avg quantity (bubble size = revenue)\n- Product drilldown — search bar → KPIs + monthly sparkline + raw orders table\n- Return analysis — By country, by category, country × category heatmap, top 10 returned products\n\n---\n\n### Page 5 — Revenue Forecasting System\n- 4 models: Naive baseline, Linear Regression, XGBoost, Prophet\n- Time-based train/test split — last 6 months as holdout\n- Actual vs Predicted chart (per model selector)\n- All models comparison chart\n- Model performance table — MAE, RMSE, R², MAPE\n- XGBoost 12-month recursive future forecast with downloadable CSV\n\n---\n\n## Installation\n\n### 1. Clone the repository\n```bash\ngit clone https://github.com/soufianboukir/ecom-analytics-platform.git\ncd ecom-analytics-platform\n```\n\n### 2. Create a virtual environment\n```bash\npython -m venv .venv\nsource .venv/bin/activate        # Linux / macOS\n.venv\\Scripts\\activate           # Windows\n```\n\n### 3. Install dependencies\n```bash\npip install -r requirements.txt\n```\n\n---\n\n## Usage\n\n```bash\nstreamlit run app/app.py\n```\n\nThen open your browser at `http://localhost:8501`\n\n### Sidebar Filters\n- **Date Range** — Filter all pages by order date\n- **Category** — Filter by product category (available on relevant pages)\n\nbuilt with ❤️ by **soufian**.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsoufianboukir%2Fecom-analytics-platform","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsoufianboukir%2Fecom-analytics-platform","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsoufianboukir%2Fecom-analytics-platform/lists"}