https://github.com/skrub-data/datasets
skrub (previously dirty-cat) related dataset files. Includes script, raw datasets, etc.
- Host: GitHub
- URL: https://github.com/skrub-data/datasets
- Owner: skrub-data
- Created: 2018-10-03T08:58:34.000Z (over 7 years ago)
- Default Branch: master
- Last Pushed: 2024-07-12T13:53:04.000Z (almost 2 years ago)
- Last Synced: 2025-03-31T04:24:05.384Z (about 1 year ago)
- Language: Python
- Homepage:
- Size: 385 KB
- Stars: 1
- Watchers: 4
- Forks: 8
- Open Issues: 3
Metadata Files:
- Readme: README.md
# Datasets
Download and denormalization scripts for skrub datasets.
It also contains:
- Correspondence table between KEN Embeddings and their figshare download ID[[1]](#1).
- Happiness score dataset from the World Happiness Report 2022[[2]](#2).
- Bike sharing dataset from the UCI Machine Learning Repository[[3]](#3).
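
As an illustration of how a correspondence table like the one above might be consumed, here is a minimal sketch that maps a table name to a download ID. The CSV layout, the column names (`table`, `figshare_id`), and the IDs are all hypothetical placeholders; the actual file in this repository may be structured differently.

```python
import csv
import io

# Hypothetical correspondence table: the column names and IDs below are
# invented placeholders, not values taken from this repository.
CORRESPONDENCE_CSV = """\
table,figshare_id
example_embeddings_a,00000001
example_embeddings_b,00000002
"""


def load_correspondence(text):
    """Parse CSV text into a {table name: figshare ID} mapping."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["table"]: row["figshare_id"] for row in reader}


def figshare_id_for(name, mapping):
    """Return the ID recorded for a table name, failing clearly if unknown."""
    try:
        return mapping[name]
    except KeyError:
        raise KeyError(f"No figshare ID recorded for {name!r}") from None


mapping = load_correspondence(CORRESPONDENCE_CSV)
print(figshare_id_for("example_embeddings_a", mapping))
```

The IDs are kept as strings so that any leading zeros survive the round trip through CSV.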
## References
[1]
https://soda-inria.github.io/ken_embeddings/
[2]
Helliwell, J. F., Layard, R., Sachs, J. D., De Neve, J.-E., Aknin, L. B., & Wang, S. (Eds.). (2022).
[World Happiness Report 2022](https://worldhappiness.report/ed/2022/). New York: Sustainable Development Solutions Network.
[3]
Fanaee-T, Hadi. (2013). Bike Sharing. UCI Machine Learning Repository. https://doi.org/10.24432/C5W894.