{"id":18824348,"url":"https://github.com/rijul007/diamonds-analysis-using-r","last_synced_at":"2026-01-25T13:05:49.832Z","repository":{"id":257239293,"uuid":"853334090","full_name":"rijul007/Diamonds-Analysis-using-R","owner":"rijul007","description":"Diamonds data analysis using R, exploring relationships between diamond attributes (such as carat, cut, color, and clarity) and price, with a focus on providing insights for engagement ring selection through various statistical techniques and data visualizations including histograms, boxplots, scatter plots, and bar charts.","archived":false,"fork":false,"pushed_at":"2024-09-06T13:00:05.000Z","size":5,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-05-22T11:33:54.684Z","etag":null,"topics":["data-analysis","data-science"],"latest_commit_sha":null,"homepage":"https://rpubs.com/Rijul-Grover/1217027","language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/rijul007.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-09-06T13:00:01.000Z","updated_at":"2024-09-15T10:42:33.000Z","dependencies_parsed_at":"2024-09-15T13:36:29.769Z","dependency_job_id":"979e3be5-9087-498e-a8ae-28fb94c69dd6","html_url":"https://github.com/rijul007/Diamonds-Analysis-using-R","commit_stats":null,"previous_names":["rijul007/diamonds-analysis-using-r"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/rijul007/Diamonds-Analysis-using-R","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/reposi
tories/rijul007%2FDiamonds-Analysis-using-R","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rijul007%2FDiamonds-Analysis-using-R/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rijul007%2FDiamonds-Analysis-using-R/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rijul007%2FDiamonds-Analysis-using-R/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/rijul007","download_url":"https://codeload.github.com/rijul007/Diamonds-Analysis-using-R/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rijul007%2FDiamonds-Analysis-using-R/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28753412,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-25T10:25:12.305Z","status":"ssl_error","status_checked_at":"2026-01-25T10:25:11.933Z","response_time":113,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data-analysis","data-science"],"created_at":"2024-11-08T00:56:14.623Z","updated_at":"2026-01-25T13:05:49.813Z","avatar_url":"https://github.com/rijul007.png","language":null,"funding_links":[],"categories":[],"sub_categories":[],"readme":"# Diamond Analysis Project\n\n## Objective\nTo analyze the characteristics and pricing of diamonds using 
the built-in `diamonds` dataset from the `ggplot2` library in R. The project aims to answer specific questions about diamond attributes and their relationship to price, particularly focusing on engagement ring selection.\n\n## Dataset Used\nThe analysis uses the `diamonds` dataset from the `ggplot2` library in R. This dataset contains information on 53,940 round-cut diamonds, including attributes such as carat weight, cut quality, color, clarity, and price.\n\n## Analysis Techniques\n\nThe project employs various data analysis and visualization techniques using R:\n\n1. **Data Loading and Exploration**: \n   - Utilized **`tidyverse`** and **`dplyr`** for data manipulation\n   - Used **`glimpse()`** function to review data structure\n\n2. **Data Visualization**:\n   - Employed **`ggplot2`** for creating various types of plots:\n     - Histograms and boxplots for price distribution\n     - Boxplots for comparing table values across cut grades\n     - Pie charts for clarity segment analysis\n     - Scatter plots with trend lines for price-carat correlation\n     - Bar charts for average price analysis\n   - Used **`patchwork`** library for combining multiple plots\n\n3. **Statistical Analysis**:\n   - Calculated percentages and averages using **`dplyr`** functions like `group_by()`, `summarise()`, and `mutate()`\n   - Applied sampling techniques using **`sample_n()`** function\n\n4. **Data Transformation**:\n   - Created new segments for clarity using **`mutate()`** and conditional statements\n   - Filtered data for specific analyses using **`filter()`** function\n\n5. **Advanced Visualization Techniques**:\n   - Implemented faceting with **`facet_wrap()`** for multi-panel plots\n   - Used **`geom_smooth()`** for adding trend lines to scatter plots\n   - Applied custom color palettes and themes for enhanced aesthetics\n\n6. 
**Scale and Axis Manipulation**:\n   - Utilized **`scale_x_continuous()`** and **`scale_y_continuous()`** for customizing axis scales\n   - Implemented **`scale_fill_manual()`** and **`scale_color_manual()`** for custom color schemes\n\n## Results\n\n1. The distribution of diamond prices is positively skewed with outliers.\n2. Ideal-cut diamonds have a narrower range of table values compared to other cut grades.\n3. Only 3.3% of the diamonds in the dataset are categorized as Internally Flawless (IF).\n4. There's a positive correlation between carat weight and price for colorless diamonds (D, E, F colors).\n5. For 1.00-1.25 carat Ideal-cut diamonds, prices increase significantly with higher clarity and color grades.\n\nFor the full analysis and visualizations, see the [RPubs document](https://rpubs.com/Rijul-Grover/1217027).\n\nThese insights help explain diamond characteristics and pricing, and are particularly useful for engagement ring selection.","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frijul007%2Fdiamonds-analysis-using-r","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frijul007%2Fdiamonds-analysis-using-r","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frijul007%2Fdiamonds-analysis-using-r/lists"}