https://github.com/jldbc/coffee-quality-database
Building the Coffee Quality Institute Database
agriculture coffee data data-science dataset
- Host: GitHub
- URL: https://github.com/jldbc/coffee-quality-database
- Owner: jldbc
- License: mit
- Created: 2018-01-20T21:02:16.000Z (almost 7 years ago)
- Default Branch: master
- Last Pushed: 2018-06-16T23:28:57.000Z (over 6 years ago)
- Last Synced: 2023-11-07T19:15:45.576Z (about 1 year ago)
- Topics: agriculture, coffee, data, data-science, dataset
- Language: R
- Size: 282 KB
- Stars: 212
- Watchers: 24
- Forks: 153
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# coffee-quality-database
Digitizing 1,340 coffee reviews
# Data
These data contain reviews of 1,312 arabica and 28 robusta coffee beans from the Coffee Quality Institute's trained reviewers. The features include:
## Quality Measures
* Aroma
* Flavor
* Aftertaste
* Acidity
* Body
* Balance
* Uniformity
* Cup Cleanliness
* Sweetness
* Moisture
* Defects
## Bean Metadata
* Processing Method
* Color
* Species (arabica / robusta)
## Farm Metadata
* Owner
* Country of Origin
* Farm Name
* Lot Number
* Mill
* Company
* Altitude
* Region

The [data](https://github.com/jldbc/coffee-quality-database/tree/master/data) folder contains both raw and cleaned data. The raw data are exactly as they appeared on the CQI site. Because these human-recorded data use a variety of encodings, abbreviations, and units of measurement for farm names, altitude, region, and other fields, I recommend using the cleaned data as a starting point.
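
To illustrate why the raw fields need cleaning, here is a minimal sketch of normalizing free-text altitude values to meters. It is not the cleaning code used in this repository: the function name `altitude_to_meters`, the accepted input formats, and the rule of averaging a range are all assumptions made for the example.

```python
import re
from typing import Optional

FEET_TO_METERS = 0.3048


def altitude_to_meters(raw: str) -> Optional[float]:
    """Parse a free-text altitude such as '1200 m', '4,000 ft' or '1500-1800 masl'.

    Hypothetical helper: returns the mean of all numbers found, in meters,
    or None when the text contains no number.
    """
    # Drop thousands separators so '4,000' is read as one number.
    text = raw.lower().replace(",", "")
    numbers = [float(n) for n in re.findall(r"\d+(?:\.\d+)?", text)]
    if not numbers:
        return None
    mean = sum(numbers) / len(numbers)
    # Convert only when a feet marker is present; otherwise assume meters.
    if re.search(r"(?:ft|feet|foot)\b", text):
        mean *= FEET_TO_METERS
    return round(mean, 1)


if __name__ == "__main__":
    for value in ["1200 m", "4,000 ft", "1500-1800 masl", "n/a"]:
        print(repr(value), "->", altitude_to_meters(value))
```

Real entries are likely messier than these four inputs, so treat the sketch as a starting point for exploring the raw files rather than a replacement for the cleaned data.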
The site was scraped using a Selenium headless browser and Beautiful Soup. To replicate this or collect updated data, create a login for the CQI site and enter your credentials in the [scraper](https://github.com/jldbc/coffee-quality-database/tree/master/scraper).
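
The parsing step can be sketched without any third-party dependencies. The example below uses Python's standard-library `html.parser` as a stand-in for Beautiful Soup, and it skips the Selenium login and page-fetching step entirely. The HTML snippet, its field values, and the names `TableCellParser` and `parse_review_table` are made up for illustration; the actual markup of the CQI review pages is not described here and may differ.

```python
from html.parser import HTMLParser


class TableCellParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # completed rows, each a list of cell strings
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text outside a cell (for example whitespace between tags) is ignored.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def parse_review_table(html: str) -> dict:
    """Interpret every two-cell row as a field/value pair."""
    parser = TableCellParser()
    parser.feed(html)
    parser.close()
    return {row[0]: row[1] for row in parser.rows if len(row) == 2}


# Hypothetical page fragment; the values are invented for the example.
SAMPLE_HTML = """
<table>
  <tr><th>Country of Origin</th><td>Ethiopia</td></tr>
  <tr><th>Aroma</th><td>8.67</td></tr>
</table>
"""

if __name__ == "__main__":
    print(parse_review_table(SAMPLE_HTML))
```

In a real run, the HTML string would come from the logged-in browser session rather than a hard-coded sample, and the credentials are better read from environment variables than committed to the script.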
# Source
These data were collected from the Coffee Quality Institute's [review pages](https://database.coffeeinstitute.org/) in January 2018.