Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/rijul007/diamonds-analysis-using-r
Diamonds data analysis using R, exploring relationships between diamond attributes (such as carat, cut, color, and clarity) and price, with a focus on providing insights for engagement ring selection through various statistical techniques and data visualizations including histograms, boxplots, scatter plots, and bar charts.
data-analysis data-science
Last synced: 8 days ago
- Host: GitHub
- URL: https://github.com/rijul007/diamonds-analysis-using-r
- Owner: rijul007
- Created: 2024-09-06T13:00:01.000Z (2 months ago)
- Default Branch: master
- Last Pushed: 2024-09-06T13:00:05.000Z (2 months ago)
- Last Synced: 2024-09-15T13:36:26.778Z (2 months ago)
- Topics: data-analysis, data-science
- Homepage: https://rpubs.com/Rijul-Grover/1217027
- Size: 4.88 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Diamond Analysis Project
## Objective
To analyze the characteristics and pricing of diamonds using the built-in `diamonds` dataset from the `ggplot2` library in R. The project aims to answer specific questions about diamond attributes and their relationship to price, particularly focusing on engagement ring selection.

## Dataset Used
The analysis uses the `diamonds` dataset from the `ggplot2` library in R. This dataset contains information on 53,940 round-cut diamonds, including attributes such as carat weight, cut quality, color, clarity, and price.

## Analysis Techniques
The project employs various data analysis and visualization techniques using R:
1. **Data Loading and Exploration**:
   - Utilized **`tidyverse`** and **`dplyr`** for data manipulation
   - Used **`glimpse()`** function to review data structure
2. **Data Visualization**:
   - Employed **`ggplot2`** for creating various types of plots:
     - Histograms and boxplots for price distribution
     - Boxplots for comparing table values across cut grades
     - Pie charts for clarity segment analysis
     - Scatter plots with trend lines for price-carat correlation
     - Bar charts for average price analysis
   - Used **`patchwork`** library for combining multiple plots
3. **Statistical Analysis**:
   - Calculated percentages and averages using **`dplyr`** functions like `group_by()`, `summarise()`, and `mutate()`
   - Applied sampling techniques using **`sample_n()`** function
4. **Data Transformation**:
   - Created new segments for clarity using **`mutate()`** and conditional statements
   - Filtered data for specific analyses using **`filter()`** function
5. **Advanced Visualization Techniques**:
   - Implemented faceting with **`facet_wrap()`** for multi-panel plots
   - Used **`geom_smooth()`** for adding trend lines to scatter plots
   - Applied custom color palettes and themes for enhanced aesthetics
6. **Scale and Axis Manipulation**:
   - Utilized **`scale_x_continuous()`** and **`scale_y_continuous()`** for customizing axis scales
   - Implemented **`scale_fill_manual()`** and **`scale_color_manual()`** for custom color schemes

## Results
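The data-manipulation steps listed under Analysis Techniques can be sketched as follows. This is a minimal illustration rather than the project's actual code: the two-way clarity split (`IF` versus everything else) and the names `clarity_segment` and `clarity_share` are assumptions made for the example.

```r
# Minimal sketch: load the data, inspect it, derive a clarity segment,
# and compute each segment's share of the dataset.
library(dplyr)
library(ggplot2)  # provides the built-in diamonds dataset

# Review the structure of the data (columns, types, first values)
glimpse(diamonds)

# Assumption: a simple two-way split, IF versus all other clarity grades.
# The project's actual segment definitions may differ.
clarity_share <- diamonds %>%
  mutate(clarity_segment = if_else(clarity == "IF",
                                   "Internally Flawless",
                                   "Other")) %>%
  group_by(clarity_segment) %>%
  summarise(n = n(), .groups = "drop") %>%
  mutate(pct = round(100 * n / sum(n), 1))  # percentage of all diamonds

print(clarity_share)
```

Under this assumed split, the `Internally Flawless` row should show a share of about 3.3%, in line with result 3 below.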
1. The distribution of diamond prices is positively skewed with outliers.
2. Ideal-cut diamonds have a narrower range of table values compared to other cut grades.
3. Only 3.3% of the diamonds in the dataset are categorized as Internally Flawless (IF).
4. There's a positive correlation between carat weight and price for colorless diamonds (D, E, F colors).
5. For 1.00-1.25 carat Ideal-cut diamonds, prices increase significantly with higher clarity and color grades.

For a detailed view of the analysis and visualizations, you can access the [RPubs document here](https://rpubs.com/Rijul-Grover/1217027).
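As a rough illustration of the kind of plot behind result 4, the sketch below draws price against carat for colorless diamonds with a trend line, one panel per color. The sample size of 1,000, the seed, the linear (`lm`) smoother, and the styling are assumptions chosen for the example, not details taken from the project.

```r
# Minimal sketch: price vs. carat for colorless diamonds (D, E, F),
# with a trend line and one facet per color grade.
library(dplyr)
library(ggplot2)

set.seed(42)  # arbitrary seed so the random sample is reproducible

colorless_sample <- diamonds %>%
  filter(color %in% c("D", "E", "F")) %>%
  sample_n(1000)  # assumed sample size, used to reduce overplotting

price_vs_carat <- ggplot(colorless_sample, aes(x = carat, y = price)) +
  geom_point(alpha = 0.4) +
  geom_smooth(method = "lm", se = FALSE, color = "steelblue") +  # linear trend line
  facet_wrap(~ color) +
  scale_x_continuous(breaks = seq(0, 3, by = 0.5)) +
  scale_y_continuous(labels = scales::label_dollar()) +
  labs(x = "Carat", y = "Price (USD)",
       title = "Price vs. carat for colorless diamonds") +
  theme_minimal()

print(price_vs_carat)
```

A linear smoother is only one option; leaving `method` unset lets `geom_smooth()` pick a default smoother, which may follow the curvature of the price-carat relationship more closely.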
These insights provide valuable information for understanding diamond characteristics and pricing, particularly useful for engagement ring selection.
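For readers who want to explore the comparison behind result 5, the filtering and averaging steps might look like the sketch below. The inclusive carat bounds, the choice to group by clarity only, and the names `avg_price_by_clarity` and `avg_price_plot` are assumptions for illustration.

```r
# Minimal sketch: average price by clarity for Ideal-cut diamonds
# between 1.00 and 1.25 carat, shown as a bar chart.
library(dplyr)
library(ggplot2)

avg_price_by_clarity <- diamonds %>%
  # Assumption: both carat bounds are inclusive
  filter(cut == "Ideal", carat >= 1.00, carat <= 1.25) %>%
  group_by(clarity) %>%
  summarise(avg_price = mean(price), n = n(), .groups = "drop")

avg_price_plot <- ggplot(avg_price_by_clarity,
                         aes(x = clarity, y = avg_price)) +
  geom_col(fill = "steelblue") +
  scale_y_continuous(labels = scales::label_dollar()) +
  labs(x = "Clarity", y = "Average price (USD)",
       title = "Ideal-cut diamonds, 1.00-1.25 carat") +
  theme_minimal()

print(avg_price_by_clarity)
print(avg_price_plot)
```

Grouping by `color` instead of, or in addition to, `clarity` gives the corresponding view for color grades.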