{"id":32486011,"url":"https://github.com/mathusanm6/critics-vs-players-analysis","last_synced_at":"2026-04-16T00:31:43.037Z","repository":{"id":319517826,"uuid":"1064101215","full_name":"mathusanm6/Critics-vs-Players-Analysis","owner":"mathusanm6","description":"This data analysis examines the relationship between critic scores, sales (owners), player engagement, and pricing to determine the ROI of critic reviews.","archived":false,"fork":false,"pushed_at":"2025-10-18T17:31:40.000Z","size":1062,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-10-27T05:26:28.671Z","etag":null,"topics":["data-analysis","data-science","data-visualization","game-reviews","games-sales","jupyter-notebook","python-3","steam-games"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mathusanm6.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-09-25T14:40:32.000Z","updated_at":"2025-10-18T17:31:43.000Z","dependencies_parsed_at":"2025-10-19T10:35:25.564Z","dependency_job_id":null,"html_url":"https://github.com/mathusanm6/Critics-vs-Players-Analysis","commit_stats":null,"previous_names":["mathusanm6/critics-vs-players-analysis"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/mathusanm6/Critics-vs-Players-Analysis","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mathusanm6%2FCritics-vs-Players-Analysis","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mathusanm6%2FCritics-vs-Players-Analysis/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mathusanm6%2FCritics-vs-Players-Analysis/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mathusanm6%2FCritics-vs-Players-Analysis/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mathusanm6","download_url":"https://codeload.github.com/mathusanm6/Critics-vs-Players-Analysis/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mathusanm6%2FCritics-vs-Players-Analysis/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31866250,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-15T15:24:51.572Z","status":"ssl_error","status_checked_at":"2026-04-15T15:24:39.138Z","response_time":63,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data-analysis","data-science","data-visualization","game-reviews","games-sales","jupyter-notebook","python-3","steam-games"],"created_at":"2025-10-27T05:17:42.854Z","updated_at":"2026-04-16T00:31:42.997Z","avatar_url":"https://github.com/mathusanm6.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Critics vs Players: Should You Send Review Copies?\n\n\u003cdiv align=\"center\"\u003e\n\n[![Open Data Pipeline](https://img.shields.io/badge/📊_Data_Pipeline-Open_Notebook-blue?style=for-the-badge)](./game_data_pipeline.ipynb) [![Open Analysis](https://img.shields.io/badge/📈_Critics_vs_Players-Open_Notebook-green?style=for-the-badge)](./critics_vs_players.ipynb)\n\n\u003c/div\u003e\n\n**Data Pipeline**: ETL notebook that integrates IGN reviews, Steam metrics, and HowLongToBeat data into a unified dataset  \n**Critics vs Players**: Interactive analysis exploring the relationship between critic scores, sales, and player engagement\n\n---\n\n**Business Question:** As a game publisher, is it worth sending review copies to critics? Does it drive sales and engagement?\n\n**User Persona:** Thomas, game publisher about to launch a new PC (Windows) game on Steam after years of development.\n\n---\n\n## ⚠️ **Important Disclaimer: Toy Project**\n\n\u003e **This is a data science toy project for educational purposes only.**\n\u003e\n\u003e **Limitations:**\n\u003e\n\u003e - **Single critic source**: Only uses IGN reviews (not representative of all gaming critics)\n\u003e - **Platform limited**: Steam data for Windows PC games only (excludes consoles, Mac, Linux)\n\u003e - **Sample bias**: Dataset may not represent the full gaming market\n\u003e - **Not production-ready**: Do not use for actual business decisions without additional research\n\u003e\n\u003e For real business decisions, consult multiple review aggregators (Metacritic, OpenCritic), cross-platform data, and professional market research.\n\n---\n\n## Executive Summary\n\nThis analysis examines the relationship between critic scores, sales (owners), player engagement, and pricing to determine the ROI of critic reviews for PC game publishers.\n\n**Key Finding:** Higher critic scores correlate with increased ownership and player engagement, but the effect varies significantly by genre and price point.\n\n## Data Sources\n\n- [IGN Games Dataset](https://www.kaggle.com/datasets/joebeachcapital/ign-games) - Critics' ratings and reviews\n- [Steam Games Dataset](https://www.kaggle.com/datasets/fronkongames/steam-games-dataset) - Player engagement and playtime statistics\n- [HowLongToBeat](https://howlongtobeat.com/) - Game completion time data (via API)\n\n## Dataset Overview\n\n- **1,106 PC games** (2003-2016)\n- **Sources:** IGN reviews + Steam metrics + HowLongToBeat data\n- **Average critic score:** 7.51/10\n- **Average ownership:** 1.32M copies\n- **Match rate:** 61.4% between IGN and Steam catalogs\n\n## Analysis Components\n\n### 1. Data Pipeline (`game_data_pipeline.ipynb`)\n\nIntegrates three data sources into a unified dataset:\n\n- **IGN:** Professional critic scores (0-10 scale)\n- **Steam:** Sales (owners), playtime metrics, pricing\n- **HowLongToBeat:** Completion rates as engagement proxy\n\n**Pipeline metrics:**\n\n- 18,625 IGN reviews → 2,332 PC games\n- 27,075 Steam games → 1,433 matched\n- 90.7% HLTB enrichment success\n\n### 2. Business Analysis (`critics_vs_players.ipynb`)\n\nInteractive visualizations answering:\n\n- **Do higher scores = more sales?** Correlation analysis with p-values\n- **Engagement Ratio:** Actual playtime vs expected completion time\n- **Revenue Proxy:** Owners × Price as revenue indicator\n- **Completion Ratio:** Main story time / Total playtime\n- **Score Brackets:** Performance analysis across 6 score ranges (0-5, 5-6, 6-7, 7-8, 8-9, 9-10)\n\n**Key Metrics:**\n\n- **Engagement Ratio:** `median_playtime / all_styles` (\u003e1 means overplaying)\n- **Revenue Proxy:** `owners_midpoint × price` (estimated revenue)\n- **Completion Ratio:** `main_story / median_playtime` (finishing rate)\n\n## Technical Implementation\n\n### Requirements\n\n```bash\npip install -r requirements.txt\n```\n\n### Quick Start\n\n```bash\n# 1. Run data pipeline\njupyter notebook game_data_pipeline.ipynb\n\n# 2. Explore business insights\njupyter notebook critics_vs_players.ipynb\n```\n\n### Output Files\n\n- `output/games_final_*.csv` - Cleaned dataset\n- `output/quality_report_*.json` - Data quality metrics\n- `logs/` - Processing diagnostics\n\n## Visualizations\n\nThe analysis includes 8 interactive visualizations:\n\n1. **Critic Score vs Sales (Owners)** - Scatter plot with trend line showing correlation between reviews and sales\n\n   - Correlation \u003e 0.3 = Strong positive relationship\n   - Correlation \u003e 0.1 = Weak positive relationship\n   - Correlation ≤ 0.1 = No meaningful relationship\n\n2. **Critic Score vs Player Engagement** - Engagement ratio analysis (playtime vs expected completion time)\n\n   - Red line at 1.0 = Players match expected playtime\n\n3. **Critic Score vs Revenue Potential** - Revenue proxy analysis with success quadrants\n\n   - ✅ Success Zone: High score + High revenue\n   - ⚠️ Hidden Gems: Low score + High revenue\n   - ❓ Underperformers: High score + Low revenue\n\n4. **Critic Score vs Completion Commitment** - How reviews relate to game completion rates\n\n5. **Price vs Quality vs Sales** - 3D relationship colored by engagement\n\n   - Size = ownership, Color = engagement ratio\n\n6. **Performance by Score Bracket** - Bar chart with engagement overlay showing thresholds\n\n   - Shows average owners and engagement by score range\n\n7. **Critic Impact by Genre** - Faceted analysis for top 6 genres\n\n   - 🔥 STRONG impact (r \u003e 0.4)\n   - ⚡ MODERATE impact (r \u003e 0.2)\n   - ⚠️ WEAK impact (r ≤ 0.2)\n\n8. **Sales Distribution by Score** - Violin plots for risk analysis\n   - Shows median, mean, variance by score bracket\n\n## Key Insights for Publishers\n\n**Based on the analysis:**\n\n- Critic scores show measurable correlation with sales (see visualization #1 for strength by genre)\n- Score ≥7 shows significantly higher average ownership across most genres\n- Genre matters: Some genres benefit more from critic attention than others (visualization #7)\n- Risk consideration: Poor reviews can hurt more than no reviews (see score bracket performance)\n\n**ROI Considerations:**\n\n- **Costs:** Review copies + PR management + embargo coordination\n- **Benefits:** Visibility boost, sales multiplier potential, platform featuring opportunities\n- **Genre dependencies:** Impact varies significantly by genre (see faceted analysis)\n\n## 🚫 What This Project Does NOT Cover\n\n- Console market dynamics\n- Multiple review sources (Metacritic, user reviews)\n- Marketing spend impact\n- Launch timing effects\n- Regional differences\n- Early Access strategies\n\n## Methodology Notes\n\n- **Fuzzy matching:** Handles title variations (85% similarity threshold)\n- **DLC handling:** Consolidated with base games\n- **Outlier detection:** Flags MMOs with extreme playtimes\n- **Time period:** Pre-2016 (may not reflect current market)\n\n## For Game Publishers\n\nThis analysis provides directional insights but should be combined with:\n\n- Current market research\n- Platform-specific data (consoles, Epic, etc.)\n- Marketing budget considerations\n- Target audience analysis\n- Competitive landscape review\n\n## Future Improvements\n\n- [ ] Add Metacritic aggregate scores\n- [ ] Include user review sentiment\n- [ ] Analyze review timing impact (pre vs post-launch)\n- [ ] Add console data\n- [ ] Machine learning model for ROI prediction\n\n_Built to explore the publisher's dilemma: Are critic reviews worth the investment?_\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmathusanm6%2Fcritics-vs-players-analysis","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmathusanm6%2Fcritics-vs-players-analysis","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmathusanm6%2Fcritics-vs-players-analysis/lists"}