{"id":50481073,"url":"https://github.com/victor-antoniassi/junior_data_analyst_test_01","last_synced_at":"2026-06-01T17:31:43.080Z","repository":{"id":262676224,"uuid":"860096968","full_name":"victor-antoniassi/junior_data_analyst_test_01","owner":"victor-antoniassi","description":"Solution developed for a technical assessment that analyzed video game sales data to support gaming partnership decisions.","archived":false,"fork":false,"pushed_at":"2025-01-22T22:17:57.000Z","size":801,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-01-22T23:19:07.003Z","etag":null,"topics":["asses","assessment-project","data-analysis","data-analysis-project","data-analyst","duckdb","etl","prefect","python"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":false,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/victor-antoniassi.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-09-19T20:16:01.000Z","updated_at":"2025-01-22T22:18:01.000Z","dependencies_parsed_at":"2025-01-08T23:59:33.797Z","dependency_job_id":"aec873d4-6f6e-4008-82f2-e1bc3833b3a1","html_url":"https://github.com/victor-antoniassi/junior_data_analyst_test_01","commit_stats":null,"previous_names":["victor-antoniassi/teste_analista_dados_jr_01"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/victor-antoniassi/junior_data_analyst_test_01","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/victor-antoniassi%2Fjunior_data_analyst_test_01","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/victor-antoniassi%2Fjunior_data_analyst_test_01/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/victor-antoniassi%2Fjunior_data_analyst_test_01/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/victor-antoniassi%2Fjunior_data_analyst_test_01/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/victor-antoniassi","download_url":"https://codeload.github.com/victor-antoniassi/junior_data_analyst_test_01/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/victor-antoniassi%2Fjunior_data_analyst_test_01/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33786896,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-01T02:00:06.963Z","response_time":115,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["asses","assessment-project","data-analysis","data-analysis-project","data-analyst","duckdb","etl","prefect","python"],"created_at":"2026-06-01T17:31:38.668Z","updated_at":"2026-06-01T17:31:43.075Z","avatar_url":"https://github.com/victor-antoniassi.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Video Game Sales Analysis\n\u003e A data analysis project exploring video game sales data to evaluate potential gaming partnerships for a delivery platform.\n\n## 📊 About\nA solution developed for a Data Analyst technical challenge that processes and analyzes historical video game sales data. The project consists of an ETL pipeline that:\n- Extracts data from a CSV file hosted on Google Drive\n- Processes and cleans the data\n- Loads the results into:\n  - A DuckDB database for analysis\n  - A formatted CSV file for Power BI visualizations\n\nThe project uses Prefect for pipeline orchestration, providing:\n- Real-time execution monitoring\n- Detailed logging system\n- Visual dashboard for tracking\n- Automated workflow management\n\n### 📝 Analysis Performed\nThe project answers the following questions using SQL in Jupyter Notebook:\n1. Top 3 best-selling games in 2015\n2. Average sales volume for Xbox One in 2016\n3. Market share of Sports games compared to other genres since 2000\n4. Best-selling game in Japan during 1998\n5. **Additional Analysis**: Regional sales distribution (as percentages) for the top 25 games by global sales\n\n## 🛠️ Tech Stack\n- Python for data processing\n- DuckDB for data storage and SQL queries\n- Prefect for pipeline orchestration\n- Jupyter Notebook for analysis\n\n## 🗂️ Project Structure\n```\n├── README.md                   # Project documentation\n├── technical_challenge_proposal.md # Original challenge proposal\n├── requirements.txt           # Python dependencies\n├── data\n│   ├── BASE_DADOS.csv          # Original dataset\n│   ├── db_games_sales.db       # DuckDB database with processed data\n│   └── games_sales.csv         # Processed dataset optimized for Power BI\n├── etl\n│   └── etl.py                # ETL script\n├── notebook\n│   └── analysis.ipynb        # Analysis notebook\n```\n\n## 🚀 How to Run\n\n### Prerequisites\n1. Python 3.x\n2. pip\n\n### Installation\n1. Clone the repository:\n```bash\ngit clone https://github.com/victor-antoniassi/junior_data_analyst_test_01\ncd junior_data_analyst_test_01\n```\n\n2. Install dependencies:\n```bash\npip install -r requirements.txt\n```\n\n### Running the Pipeline\n1. Start the Prefect server:\n```bash\nprefect server start\n```\n\n\u003e **Note**: After starting the server, you'll receive a local URL to access Prefect's dashboard where you can monitor the pipeline execution in real-time.\n\n2. In another terminal, run the ETL:\n```bash\npython etl/etl.py\n```\n\nTo stop the Prefect server, use Ctrl+C in the terminal.\n\n## 📊 Data Sources\n- Original dataset available on [Google Drive](https://drive.google.com/file/d/1eoy8MlYin9PxbCjozT0kjPXPsq0RXEgY/view?usp=drive_link) (no authentication required)\n- All SQL queries and analysis are documented in the [analysis notebook](notebook/analysis.ipynb)\n\n---\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvictor-antoniassi%2Fjunior_data_analyst_test_01","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fvictor-antoniassi%2Fjunior_data_analyst_test_01","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvictor-antoniassi%2Fjunior_data_analyst_test_01/lists"}