https://github.com/coatless/raw-data
- Host: GitHub
- URL: https://github.com/coatless/raw-data
- Owner: coatless
- Created: 2022-01-19T22:37:06.000Z (over 4 years ago)
- Default Branch: main
- Last Pushed: 2024-10-16T04:26:35.000Z (over 1 year ago)
- Last Synced: 2025-12-13T09:54:47.942Z (6 months ago)
- Language: HTML
- Homepage: https://coatless.github.io/raw-data/
- Size: 43.9 MB
- Stars: 1
- Watchers: 1
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# raw-data
This repository collects datasets in "raw form" or "flat file format" so that they can be ingested into a statistical computing environment such as _R_ or _Python_.
## CSV
- [Salaries.csv](Salaries.csv) from the [`carData`](https://cran.r-project.org/package=carData) _R_ package. ([Details](https://cran.r-project.org/web/packages/carData/carData.pdf#page=43))
- [pima.csv](pima.csv) from the [`faraway`](https://cran.r-project.org/package=faraway) _R_ package. ([Details](https://cran.r-project.org/web/packages/faraway/faraway.pdf#page=74))
- [gap-every-five-years.csv](gap-every-five-years.csv) from [`gapminder`](https://github.com/jennybc/gapminder/tree/main/data-raw)
- [xbox-7-day-auctions.csv](xbox-7-day-auctions.csv) from [modelingonlineauctions.com](http://www.modelingonlineauctions.com/datasets).
- [1976-2020-senate.csv](1976-2020-senate.csv)
- [surreal-residual.csv](surreal-residual.csv) based on [Residual (Sur)Realism by Leonard A. Stefanski (2007)](https://doi.org/10.1198/000313007X190079)
- [vehicles.csv](vehicles.csv) from the [`fueleconomy`](https://cran.r-project.org/package=fueleconomy) _R_ package.
- [common.csv](common.csv) from the [`fueleconomy`](https://cran.r-project.org/package=fueleconomy) _R_ package.
- [flights.csv](flights.csv) from the [`nycflights13`](https://cran.r-project.org/package=nycflights13) _R_ package. ([Details](https://nycflights13.tidyverse.org/reference/flights.html))
- [ucla-binary-enrollment.csv](ucla-binary-enrollment.csv) from [UCLA's OARC](https://stats.oarc.ucla.edu)
- [tips.csv](tips.csv) from [`seaborn`'s data repository](https://github.com/mwaskom/seaborn-data/blob/master/tips.csv)
- [ramen-ratings-cleaned.csv](ramen-ratings-cleaned.csv) from [Kaggle's `residentmario/ramen-ratings`](https://www.kaggle.com/datasets/residentmario/ramen-ratings) (with minimal cleaning applied).
- [diamonds.csv](diamonds.csv) from [`ggplot2`'s data-raw directory](https://github.com/tidyverse/ggplot2/blob/main/data-raw/diamonds.csv)
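
As a rough sketch of ingesting one of the CSV files above in Python, the snippet below builds a download URL and parses CSV text with the standard library. The `raw.githubusercontent.com` URL pattern, the UTF-8 encoding, and the helper names `raw_url`, `parse_csv`, and `fetch_csv` are assumptions made for illustration; the repository does not define them. The sample rows are made up so the sketch runs offline.

```python
import csv
import io
from urllib.request import urlopen

# Assumed URL pattern for files on the repository's default branch (main).
BASE_URL = "https://raw.githubusercontent.com/coatless/raw-data/main/"


def raw_url(filename):
    """Build the (assumed) raw download URL for a file in the repository."""
    return BASE_URL + filename


def parse_csv(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def fetch_csv(filename):
    """Download a CSV file and parse it (needs network access; assumes UTF-8)."""
    with urlopen(raw_url(filename)) as response:
        return parse_csv(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Made-up rows, not taken from any file in the repository.
    sample = "total_bill,tip\n10.00,1.50\n20.00,3.00\n"
    for row in parse_csv(sample):
        print(row["total_bill"], row["tip"])
```

All values come back as strings, so numeric columns need an explicit conversion. If pandas is installed, `pandas.read_csv(raw_url("tips.csv"))` should perform the download and type inference in one call.
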
## Text
- [subject_heights_tab.txt](subject_heights_tab.txt)
- [subject_heights.txt](subject_heights.txt)
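
A minimal sketch of reading a delimited text file follows. It assumes `subject_heights_tab.txt` is tab-delimited, as its name suggests; the delimiter of `subject_heights.txt` is not stated here, so it is left as a parameter. The helper name `read_delimited` and the sample columns are hypothetical.

```python
import csv
import io


def read_delimited(text, delimiter="\t"):
    """Split delimited text into a header list and a list of data rows."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    rows = [row for row in reader if row]  # drop blank lines
    if not rows:
        return [], []
    return rows[0], rows[1:]


if __name__ == "__main__":
    # Hypothetical content; the real files may use different columns.
    sample = "subject\theight\nA\t170\nB\t182\n"
    header, data = read_delimited(sample)
    print(header)
    print(data)
```

Passing `delimiter=" "` or `delimiter=","` covers other common layouts once the actual separator has been checked by opening the file.
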
## SQL
- [northwind-dump.sql](northwind-dump.sql) from [**jpwhite3/northwind-SQLite3**](https://github.com/jpwhite3/northwind-SQLite3). ([EER Diagram](https://raw.githubusercontent.com/jpwhite3/northwind-SQLite3/master/Northwind_ERD.png))
- [lahman2016.sqlite](lahman2016.sqlite) from [**jknecht/baseball-archive-sqlite**](https://github.com/jknecht/baseball-archive-sqlite/blob/master/lahman2016.sqlite) ([Details](https://www.seanlahman.com/baseball-archive/statistics/))
- [lahman2019.sqlite](lahman2019.sqlite) from [**WebucatorTraining/lahman-baseball-mysql**](https://github.com/WebucatorTraining/lahman-baseball-mysql/) ([Details](https://www.seanlahman.com/baseball-archive/statistics/), [EER Diagram](https://raw.githubusercontent.com/WebucatorTraining/lahman-baseball-mysql/master/lahman-model.png))
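
The sketch below shows one way to explore these files with Python's built-in `sqlite3` module: a `.sql` dump is replayed into an in-memory database and its tables are listed. It assumes the dump is written in the SQLite dialect, which seems plausible given its source repository but is not confirmed here. The helper names and the tiny dump in the example are hypothetical.

```python
import sqlite3


def load_dump(sql_text):
    """Run a SQL dump against a fresh in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(sql_text)
    return conn


def list_tables(conn):
    """Return the names of the tables in the database, sorted alphabetically."""
    cursor = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [name for (name,) in cursor]


if __name__ == "__main__":
    # Hypothetical two-table dump standing in for a real file.
    dump = """
    CREATE TABLE players (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE teams (id INTEGER PRIMARY KEY, city TEXT);
    INSERT INTO players (name) VALUES ('Example Player');
    """
    conn = load_dump(dump)
    print(list_tables(conn))
    conn.close()
```

For the `.sqlite` files, `sqlite3.connect("lahman2016.sqlite")` on a downloaded copy opens the database directly, and `list_tables` works on that connection unchanged.
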
## Excel
- [subject_heights.xlsx](subject_heights.xlsx) built to showcase multiple sheets inside of a single workbook.
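
Because the workbook is meant to showcase multiple sheets, a first step is usually to list the sheet names. The sketch below does that with only the standard library, relying on the fact that an `.xlsx` file is a ZIP archive whose `xl/workbook.xml` entry declares the sheets. The helper name `sheet_names` is hypothetical, and the actual sheet names in `subject_heights.xlsx` are not assumed.

```python
import io
import xml.etree.ElementTree as ET
import zipfile

# XML namespace used by SpreadsheetML workbook parts.
NS = {"main": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}


def sheet_names(xlsx_bytes):
    """Return the sheet names declared in an .xlsx workbook, in order."""
    with zipfile.ZipFile(io.BytesIO(xlsx_bytes)) as zf:
        root = ET.fromstring(zf.read("xl/workbook.xml"))
    return [sheet.get("name") for sheet in root.findall("main:sheets/main:sheet", NS)]


if __name__ == "__main__":
    # Reads a local copy of the workbook, downloaded beforehand.
    with open("subject_heights.xlsx", "rb") as handle:
        print(sheet_names(handle.read()))
```

To load the cell data itself, a library is the practical route: with pandas and `openpyxl` installed, `pandas.read_excel("subject_heights.xlsx", sheet_name=None)` returns a dict of data frames keyed by sheet name.
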
## HTML
- [webpage_sample.html](webpage_sample.html)
## Parquet
- [2010-flights-summary.parquet.zip](2010-flights-summary.parquet.zip)
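
Since the Parquet data ships inside a ZIP archive, it has to be unpacked before a Parquet reader can open it. The sketch below extracts every archive entry whose name ends in `.parquet`; the internal layout of the archive is not described here, so the suffix filter is an assumption, and the helper name `extract_parquet` is hypothetical.

```python
import io
import zipfile


def extract_parquet(zip_bytes, dest_dir):
    """Extract entries ending in .parquet into dest_dir and return their paths."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        names = [name for name in zf.namelist() if name.endswith(".parquet")]
        return [zf.extract(name, path=dest_dir) for name in names]


if __name__ == "__main__":
    # Reads a local copy of the archive, downloaded beforehand.
    with open("2010-flights-summary.parquet.zip", "rb") as handle:
        for path in extract_parquet(handle.read(), "flights-data"):
            print(path)
```

Each extracted path can then be passed to a Parquet reader, for example `pandas.read_parquet(path)` when pandas and a Parquet engine such as `pyarrow` are installed.
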