Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/curran/data
A collection of public data sets
Last synced: 14 days ago
- Host: GitHub
- URL: https://github.com/curran/data
- Owner: curran
- License: mit
- Created: 2013-07-31T17:10:43.000Z (over 11 years ago)
- Default Branch: gh-pages
- Last Pushed: 2024-02-08T00:25:25.000Z (10 months ago)
- Last Synced: 2024-10-15T11:07:22.585Z (about 2 months ago)
- Language: HTML
- Size: 60.6 MB
- Stars: 506
- Watchers: 23
- Forks: 180
- Open Issues: 7
Metadata Files:
- Readme: README.md
- License: LICENSE
README
A collection of public data sets for testing out visualization methods. These data sets are at various stages of preparation: some are just raw data, some are CSV files, and some are exposed as AMD modules. This collection is messy, but with some digging you may find hidden gems.
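Since many of the data sets are plain CSV files, a first step before visualizing one is usually to parse it into records with numeric fields. The following is a minimal sketch using only the Python standard library. The sample data, column names, and the `load_rows` and `coerce` helpers are hypothetical and are not taken from this repository.

```python
import csv
import io

# Hypothetical sample standing in for one of the collection's CSV files;
# the column names and values are illustrative only.
SAMPLE_CSV = """name,population
Alpha,1200
Beta,850
Gamma,
"""


def coerce(value):
    """Return None for empty fields, a float when float() accepts the
    field, and the original string otherwise."""
    if value is None or value.strip() == "":
        return None
    try:
        return float(value)
    except ValueError:
        return value


def load_rows(text):
    """Parse CSV text into a list of dicts, one per data row."""
    reader = csv.DictReader(io.StringIO(text))
    return [{key: coerce(value) for key, value in record.items()}
            for record in reader]


rows = load_rows(SAMPLE_CSV)
for row in rows:
    print(row)
```

Missing values are kept as `None` rather than dropped, so a chart can decide how to handle gaps in messy data.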
## Interesting Datasets
Most recently added at the top.
* [Gothenburg Quality of Government Expert Survey](https://www.gu.se/en/quality-government/qog-data/data-downloads/qog-expert-survey)
* [UN Human Development Index](https://hdr.undp.org/data-center/human-development-index#/indicies/HDI)
* [Vega Datasets](https://github.com/vega/vega-datasets/tree/main)
* [Data sources by EfoxMaps](https://www.notion.so/a360dea317234868a0f7cfb1ef249843)
* [RawGraphs Sample Datasets](https://github.com/rawgraphs/rawgraphs-app/tree/master/public/sample-datasets)
* [nomis - UK Census data from 2021](https://www.nomisweb.co.uk/sources/census_2021_bulk) (via [Twitter](https://twitter.com/undertheraedar/status/1612751365343961090))
* [Cantometrics Data, from The Global Jukebox, music database](https://github.com/theglobaljukebox/cantometrics/tree/main/raw)
* [Observable Curated Datasets](https://observablehq.com/@observablehq/curated-datasets) | [Old Observable Curated Datasets](https://github.com/observablehq/datasets)
* [MFRED Electricity Usage Dataset](https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/X9MIDJ)
* [BBC Shared Data Unit](https://github.com/BBC-Data-Unit/shared-data-unit)
* [Most Popular Operating Systems](https://observablehq.com/@mbostock/most-popular-operating-systems-2003-2020)
* [Remaking Figures from Semiology of Graphics](https://github.com/nicolaskruchten/semiology_of_graphics)
* [Awesome Public Datasets](https://github.com/awesomedata/awesome-public-datasets)
* [FMA: A Dataset For Music Analysis](https://github.com/mdeff/fma)
* [Food Nutrition Data](https://fdc.nal.usda.gov/download-datasets.html)
* [Historical Weather Warnings](https://mesonet.agron.iastate.edu/request/gis/watchwarn.phtml)
* [PM2.5 Air Quality by County over Time](https://github.com/maurosc3ner/uspm25_2000_2018/blob/master/data/pm2.5byCounty.csv)
* [US Energy Information Administration Data](https://www.eia.gov/electricity/data.php#sales) (see also [Analysis & Projections](https://www.eia.gov/electricity/data/eia860M/))
* [Climate.gov Datasets](https://www.climate.gov/maps-data/datasets)
* [The Economist Graphic Detail data](https://github.com/TheEconomist/graphic-detail-data)
* [Dataset collection: Sports Data Sets for Data Modeling, Visualization, Predictions, Machine-Learning](https://sports-statistics.com/sports-statistics-datasets-for-research-modeling-predictions-machine-learning-ai/)
* [Dataset collection: Information is Beautiful - Data](https://informationisbeautiful.net/data/)
* [Dataset collection: R for Data Science Tidy Tuesdays](https://github.com/rfordatascience/tidytuesday)
* [Stranger Things Ratings](https://data.world/priyankad0993/stranger-things-episode-ratings)
* [SIPRI Arms Transfers Database](https://www.sipri.org/databases/armstransfers)
* [CWUR - World University Rankings 2019-2020](https://cwur.org/2019-2020.php)
* [TopoJSON Collection](https://bl.ocks.org/FrissAnalytics/a5b18dc15b73f34f92c7448cbb62c38e) World countries _and subdivisions_
* [Classic datasets from Petra Isenberg et al.](https://perso.telecom-paristech.fr/eagan/class/igr204/datasets)
* [Soul of the Community](http://streaming.stat.iastate.edu/dataexpo/2013/) (American Statistical Association)
* [World Population Prospects](http://esa.un.org/wpp/Excel-Data/population.htm) (United Nations)
* [Employment](http://www.bls.gov/data/) (Bureau of Labor Statistics)
* [Healthy People](http://visualizing.org/datasets/healthy-people-2010) (Centers for Disease Control)
* [Gapminder Data](http://www.gapminder.org/data/)
* [NASA Satellite-Derived Environmental Indicators](http://sedac.ciesin.columbia.edu/data/collection/sdei)
* [IMF Public Finances in Modern History Database](http://www.imf.org/external/np/fad/histdb/)
* [Executions in the US by type over time](http://www.deathpenaltyinfo.org/views-executions)
* [Datasets used in the book, An Introduction to Categorical Data Analysis](http://lib.stat.cmu.edu/datasets/agresti)
* [Energy Information Administration Open Data](http://www.eia.gov/beta/api/)
* [Data sets from FiveThirtyEight](https://github.com/fivethirtyeight/data)
* [Data sets in the Infovis Wiki](http://www.infovis-wiki.net/index.php?title=Data_Libraries)
* [Data sets from Andy Kirk's Link Archive](http://www.visualisingdata.com/2017/02/archiving-collection-places-access-data/)
* [Makeover Monday Datasets](http://www.makeovermonday.co.uk/data/)
* [SOCR Datasets](http://wiki.socr.umich.edu/index.php/SOCR_Data)
* [UCI Machine Learning Repository Datasets](https://archive.ics.uci.edu/ml/datasets)
* [BrightKite User Check-ins](http://snap.stanford.edu/data/loc-brightkite.html) (57.2 MB)
* [ACLED (Armed Conflict Location and Event Data Project)](https://www.acleddata.com/data/) (35 MB)
* [Safecast](https://blog.safecast.org/data/) (3.2 GB)
* [Statistical Computing and Statistical Graphics Data Expo: Airline on-time performance](http://stat-computing.org/dataexpo/2009/) (12 GB)
* [The GDELT Data Set](https://www.gdeltproject.org/data.html#rawdatafiles) (~100 GB)
* [The Indian Census 2011](http://censusindia.gov.in/2011-Common/CensusData2011.html)
* [Best Buy Developer API](https://developer.bestbuy.com/)
## Leads
These are "leads" to find interesting datasets. They have teasers of cool data, but it will take some work to find the data behind them.
* [Social Explorer](https://www.socialexplorer.com/product-maps)