https://github.com/szapp/candyanalysis

Case study: Analyze the candy power ranking to identify and recommend popular candy characteristics

data-analysis data-visualization feature-selection interaction-terms


# Candy Analysis

Case study: Analyze the [candy power ranking](https://github.com/fivethirtyeight/data/tree/master/candy-power-ranking) to identify and recommend popular candy characteristics.

The dataset by FiveThirtyEight is distributed under the Creative Commons Attribution 4.0 International license (https://creativecommons.org/licenses/by/4.0/).

## Project

A new candy is to be produced.
Within the project team there is no consensus about the characteristics of the candy.

Based on a dataset from market analysis, the task is to give a clear recommendation for which characteristics the new product should have.

## Deliverable

The results are compiled in a [presentation](Presentation.pdf) with a clear recommendation.
The presentation is in German, but the numbers speak for themselves.






## Libraries used

- SciPy
- scikit-learn
- Seaborn

*See [requirements.txt](requirements.txt).*

## Challenges

- Small dataset (86 rows/samples)
- Data is aggregated over brands (e.g. win percentage)
- Study design might not be fair (not blind)

## Approach

Treating the problem as a classification rather than a regression, and applying statistical analysis, makes it possible to identify features that are statistically associated with popular brands.
With interaction terms, combinations of successful characteristics can be recommended as well.
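
The idea can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the code from this repository: the rows are invented toy data (only the column names follow the FiveThirtyEight dataset), "popular" is assumed to mean a win percentage above the median, and a one-sided Fisher exact test stands in for the statistical analysis. The helper names `fisher_exact_greater` and `p_value` are hypothetical.

```python
from itertools import combinations
from math import comb
from statistics import median

# Toy rows invented for illustration only (NOT the FiveThirtyEight data).
# Columns: chocolate, peanutyalmondy, fruity, winpercent
FEATURES = ("chocolate", "peanutyalmondy", "fruity")
ROWS = [
    (1, 1, 0, 84.0),
    (1, 1, 0, 77.0),
    (1, 0, 0, 73.0),
    (1, 1, 0, 70.0),
    (1, 0, 0, 66.0),
    (1, 0, 0, 60.0),
    (0, 0, 1, 55.0),
    (0, 0, 1, 52.0),
    (0, 1, 0, 47.0),
    (0, 0, 1, 42.0),
    (0, 0, 1, 39.0),
    (0, 0, 1, 35.0),
]


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows: feature present / absent. Columns: popular / not popular.
    Returns the probability of seeing at least `a` popular candies with
    the feature if feature and popularity were independent.
    """
    n = a + b + c + d
    with_feature, popular_total = a + b, a + c
    upper = min(with_feature, popular_total)
    favourable = sum(
        comb(with_feature, k) * comb(n - with_feature, popular_total - k)
        for k in range(a, upper + 1)
    )
    return favourable / comb(n, popular_total)


def p_value(flags, labels):
    """Build the contingency table for one binary column and test it."""
    a = sum(1 for f, y in zip(flags, labels) if f and y)
    b = sum(1 for f, y in zip(flags, labels) if f and not y)
    c = sum(1 for f, y in zip(flags, labels) if not f and y)
    d = len(labels) - a - b - c
    return fisher_exact_greater(a, b, c, d)


# Classification instead of regression: a candy counts as "popular" when
# its win percentage lies above the median (assumed threshold).
threshold = median(row[-1] for row in ROWS)
popular = [row[-1] > threshold for row in ROWS]

# Single binary characteristics.
columns = {name: [row[i] for row in ROWS] for i, name in enumerate(FEATURES)}

# Pairwise interaction terms: 1 only when both characteristics are present.
for left, right in combinations(FEATURES, 2):
    columns[f"{left}*{right}"] = [
        x * y for x, y in zip(columns[left], columns[right])
    ]

p_values = {name: p_value(flags, popular) for name, flags in columns.items()}

# Smallest p-value first: strongest evidence of association with popularity.
for name, p in sorted(p_values.items(), key=lambda item: item[1]):
    print(f"{name:28s} p = {p:.4f}")
```

With this few samples an exact test is a natural choice over an asymptotic one. SciPy provides the same test as `scipy.stats.fisher_exact(table, alternative="greater")`. Because several features and interactions are tested at once, the resulting p-values are better read as a ranking than as individual significance claims.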