{"id":15442728,"url":"https://github.com/hariniavula/beats-data-analysis","last_synced_at":"2025-08-30T17:11:04.779Z","repository":{"id":250344525,"uuid":"824705055","full_name":"hariniavula/beats-data-analysis","owner":"hariniavula","description":"Cleaning, gathering, and analyzing headphone review data from Amazon","archived":false,"fork":false,"pushed_at":"2024-07-26T16:33:40.000Z","size":2823,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2024-10-18T13:15:07.251Z","etag":null,"topics":["google-colab","python-3"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/hariniavula.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-07-05T18:27:23.000Z","updated_at":"2024-07-26T16:33:43.000Z","dependencies_parsed_at":"2024-07-26T18:21:28.460Z","dependency_job_id":"860e0f9b-f11f-4ff9-a71a-2dccee67507a","html_url":"https://github.com/hariniavula/beats-data-analysis","commit_stats":null,"previous_names":["hariniavula/beats-data-analysis"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hariniavula%2Fbeats-data-analysis","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hariniavula%2Fbeats-data-analysis/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hariniavula%2Fbeats-data-analysis/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositorie
s/hariniavula%2Fbeats-data-analysis/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/hariniavula","download_url":"https://codeload.github.com/hariniavula/beats-data-analysis/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":245986658,"owners_count":20705242,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["google-colab","python-3"],"created_at":"2024-10-01T19:29:43.387Z","updated_at":"2025-03-28T07:25:45.112Z","avatar_url":"https://github.com/hariniavula.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# beats-data-analysis\n\n## Project Overview\nThis project was part of an internship with Extern and Beats by Dr. Dre. I analyzed customer reviews for various headphone products, including Beats Studio 3, Apple AirPods Max, Bose QuietComfort, Sennheiser, Sony M5, BERIBES, JBL, Soundcore Anker, Beats Studio Pro, and Beats Solo 4. The goal was to gain insights into consumer satisfaction and preferences for Beats by Dr. Dre products by leveraging data gathering, cleaning, analysis, and natural language processing.\n\n\n## Final Comprehensive Report  \n- file: `Final Report.ipynb`\n- Explanation of process, comprehensive analysis, and final insights derived from the project. \n\n## Data Collection \n- file: `gathering_data.ipynb`\n- The reviews were gathered using the *Oxylabs API*, which provided access to Amazon reviews for the specified headphone products. 
The API allowed us to efficiently collect a large volume of reviews for each product, ensuring a comprehensive dataset for analysis.\n- one of the JSON files obtained from the API is included: `data (4).json`\n- raw data file: `Reviews-update.csv`\n\n## Data Cleaning and Preprocessing \n- file: `data_cleaning.ipynb`\n- The collected data underwent a thorough cleaning process, which included:\n  - Handling missing values by filling them with appropriate placeholders or removing incomplete entries.\n  - Standardizing the format of key columns, such as `timestamp` and `rating`.\n- cleaned data file: `cleaned_data.csv`\n\n## Dataset\nThe dataset includes the following key columns:\n- `title`: The title and preview of the review.\n- `author`: The author of the review.\n- `rating`: The rating given by the reviewer.\n- `content`: The content of the review.\n- `timestamp`: The date and time the review was posted.\n- `profile_id`: The ID of the reviewer's profile.\n- `is_verified`: Whether the review is from a verified purchase.\n- `helpful_count`: The number of helpful votes the review received.\n- `product_attributes`: Additional attributes of the product.\n\n## Data Analysis and Visualization\n- file: `EDA_and_analysis.ipynb`\n- Descriptive statistics, data visualization, and correlation analysis were performed on various variables and subsets.\n- Sentiment analysis was performed on the review texts using the VADER sentiment analyzer from the NLTK library. This helped classify reviews into positive, negative, and neutral sentiments, providing deeper insights into consumer opinions.\n- The findings and insights are summarized at the bottom of the notebook.\n\n## Analysis Using Gemini AI\n- file: `GeminiAI_Final_Insights.ipynb`\n- We used the *Gemini AI* model to analyze and summarize the key points from the reviews. 
The model helped generate comprehensive insights for each subset of reviews.\n- API Key: https://ai.google.dev/gemini-api/docs/api-key\n\n## Installation\nTo run this project, ensure you have the following dependencies installed:\n- pandas\n- numpy\n- seaborn\n- matplotlib\n- nltk\n- google-generativeai\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhariniavula%2Fbeats-data-analysis","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fhariniavula%2Fbeats-data-analysis","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhariniavula%2Fbeats-data-analysis/lists"}