https://github.com/asampat3090/open-datasets
Running list of Open Datasets
artificial-intelligence data data-science neural-network open-datasets open-source
- Host: GitHub
- URL: https://github.com/asampat3090/open-datasets
- Owner: asampat3090
- Created: 2017-04-25T14:52:08.000Z (almost 9 years ago)
- Default Branch: master
- Last Pushed: 2017-05-09T12:40:25.000Z (over 8 years ago)
- Last Synced: 2024-01-27T14:38:51.359Z (about 2 years ago)
- Topics: artificial-intelligence, data, data-science, neural-network, open-datasets, open-source
- Homepage:
- Size: 20.5 KB
- Stars: 23
- Watchers: 7
- Forks: 7
- Open Issues: 0
Metadata Files:
- Readme: README.md
Open Datasets
=======================
[This list of public data
sources](https://github.com/acusense/open-datasets) is
collected and tidied from blogs, answers, and user responses.
Each entry is tagged as "single" or "collection", and
as "free", "payment" (paid access), or "credentials" (free, but you need to sign in to access the data).
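The tagging convention above can be read mechanically. The following is a minimal sketch, assuming every entry follows the `- [Title](URL) "tag" "tag"` shape used throughout this list; the function name `parse_entry` and the dictionary keys are hypothetical and not part of this repository.

```python
import re

# Assumed entry shape: - [Title](URL) "tag" "tag" ...
ENTRY_RE = re.compile(r'^- \[(?P<title>.+?)\]\((?P<url>\S+?)\)(?P<rest>.*)$')
TAG_RE = re.compile(r'"([^"]+)"')


def parse_entry(text):
    """Parse one list entry (possibly wrapped over several lines).

    Returns a dict with the title, URL, and quoted tags,
    or None if the text does not look like an entry.
    """
    # Collapse line wraps and repeated spaces into single spaces.
    line = " ".join(text.split())
    match = ENTRY_RE.match(line)
    if match is None:
        return None
    # Only text after the link is scanned, so quotes inside a title are ignored.
    tags = TAG_RE.findall(match.group("rest"))
    return {
        "title": match.group("title"),
        "url": match.group("url"),
        "tags": tags,
    }


entry = parse_entry(
    '- [UCI Machine Learning Repository](http://archive.ics.uci.edu/ml/) '
    '"collection" "free"'
)
print(entry["title"], entry["tags"])
```

Entries that link to more than one URL, or that carry no tags, would need extra handling; this sketch only reads the first link and whatever quoted tags follow it.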
General
-------
- [Cornell Natural Language Visual Reasoning Dataset](http://lic.nlp.cornell.edu/nlvr/) "single" "free"
- [Structured Wikipedia Data](http://wiki.dbpedia.org/about) "collection" "free" "GNU License"
- [UCI Machine Learning Repository](http://archive.ics.uci.edu/ml/) "collection" "free"
- [Socrata Open Datasets](https://dev.socrata.com/consumers/getting-started.html) "collection" "free"
- [Datasets for Data Mining and Data Science](http://www.kdnuggets.com/datasets/index.html) "collection" "free"
- [List of datasets for machine learning research](https://en.wikipedia.org/wiki/List_of_datasets_for_machine_learning_research) "collection" "free"
- [Lexical Database for English](http://wordnet.princeton.edu/wordnet/download/) "single" "free"
- [Wolfram Data Repository](https://datarepository.wolframcloud.com/) "collection" "free"
Agriculture
-----------
- [U.S. Department of Agriculture's PLANTS
Database](http://www.plants.usda.gov/dl_all.html) "single" "free"
- [U.S. Department of Agriculture's Nutrient
Database](https://www.ars.usda.gov/northeast-area/beltsville-md/beltsville-human-nutrition-research-center/nutrient-data-laboratory/docs/sr28-download-files/) "collection" "free"
Biology
-------
- [1000 Genomes](http://www.1000genomes.org/data) "collection" "free"
- [American Gut (Microbiome
Project)](https://github.com/biocore/American-Gut) "collection" "free"
- [Broad Bioimage Benchmark Collection
(BBBC)](https://www.broadinstitute.org/bbbc) "collection" "free"
- [Broad Cancer Cell Line Encyclopedia
(CCLE)](http://www.broadinstitute.org/ccle/home) "collection" "credentials"
- [Cell Image Library](http://www.cellimagelibrary.org) "collection" "free"
- [Complete Genomics Public
Data](http://www.completegenomics.com/public-data/69-genomes/) "collection" "free"
- [EBI ArrayExpress](http://www.ebi.ac.uk/arrayexpress/) "collection" "free"
- [EBI Protein Data Bank in
Europe](http://www.ebi.ac.uk/pdbe/emdb/index.html/) "collection" "free"
- [Electron Microscopy Pilot Image Archive
(EMPIAR)](http://www.ebi.ac.uk/pdbe/emdb/empiar/) "collection" "free"
- [ENCODE project](https://www.encodeproject.org) "collection" "free"
- [Ensembl Genomes](http://ensemblgenomes.org/info/genomes) "collection" "free"
- [Gene Expression Omnibus (GEO)](http://www.ncbi.nlm.nih.gov/geo/) "collection" "free"
- [Gene Ontology
(GO)](http://geneontology.org/page/download-annotations) "collection" "free"
- [Global Biotic Interactions
(GloBI)](https://github.com/jhpoelen/eol-globi-data/wiki#accessing-species-interaction-data) "single" "free"
- [Harvard Medical School (HMS) LINCS
Project](http://lincs.hms.harvard.edu) "collection" "free"
- [Human Genome Diversity
Project](http://www.hagsc.org/hgdp/files.html) "single" "free"
- [Human Microbiome Project
(HMP)](http://www.hmpdacc.org/reference_genomes/reference_genomes.php) "collection" "free"
- [ICOS PSP Benchmark](http://ico2s.org/datasets/psp_benchmark.html) "collection" "free"
- [International HapMap
Project](http://hapmap.ncbi.nlm.nih.gov/downloads/index.html.en) "single" "free"
- [Journal of Cell Biology
DataViewer](http://jcb-dataviewer.rupress.org) "collection" "free"
- [MIT Cancer Genomics
Data](http://www.broadinstitute.org/cgi-bin/cancer/datasets.cgi) "collection" "free"
- [NCBI
Proteins](http://www.ncbi.nlm.nih.gov/guide/proteins/#databases) "collection" "credentials"
- [NCBI Taxonomy](http://www.ncbi.nlm.nih.gov/taxonomy) "single" "credentials"
- [NCI Genomic Data Commons](https://gdc-portal.nci.nih.gov) "collection" "free"
- [NIH Microarray data](http://bit.do/VVW6) or
[FTP](ftp://ftp.ncbi.nih.gov/pub/geo/DATA/supplementary/series/GSE6532/)
(see FTP link on
[RAW](https://raw.githubusercontent.com/caesar0301/awesome-public-datasets/master/README.rst)) "collection" "free"
- [OpenSNP genotypes data](https://opensnp.org/) "collection" "credentials"
- [Pathguide - Protein-Protein Interactions
Catalog](http://www.pathguide.org/) "collection" "free"
- [Protein Data Bank](http://www.rcsb.org/) "collection" "credentials"
- [Psychiatric Genomics
Consortium](https://www.med.unc.edu/pgc/downloads) "collection" "credentials"
- [PubChem Project](https://pubchem.ncbi.nlm.nih.gov/) "collection" "free"
- [PubGene (now Coremine Medical)](http://www.pubgene.org/) "collection" "credentials"
- [Sanger Catalogue of Somatic Mutations in Cancer
(COSMIC)](http://cancer.sanger.ac.uk/cosmic) "collection" "credentials"
- [Sanger Genomics of Drug Sensitivity in Cancer Project
(GDSC)](http://www.cancerrxgene.org/) "collection" "free"
- [Sequence Read
Archive (SRA)](http://www.ncbi.nlm.nih.gov/Traces/sra/) "collection" "free"
- [Stowers Institute Original Data
Repository](http://www.stowers.org/research/publications/odr) "collection" "free"
- [Systems Science of Biological Dynamics (SSBD)
Database](http://ssbd.qbic.riken.jp) "collection" "free"
- [The Cancer Genome Atlas (TCGA), available via Broad
GDAC](https://gdac.broadinstitute.org/) "collection" "free"
- [The Catalogue of
Life](http://www.catalogueoflife.org/content/annual-checklist-archive) "collection" "free"
- [The Personal Genome Project](http://www.personalgenomes.org/) or
[PGP](https://my.pgp-hms.org/public_genetic_data) "collection" "credentials"
- [UCSC Public Data](http://hgdownload.soe.ucsc.edu/downloads.html) "collection" "free"
- [UniGene](http://www.ncbi.nlm.nih.gov/unigene) "collection" "credentials"
- [Universal Protein Resource
(UniProt)](http://www.uniprot.org/downloads) "collection" "free"
Climate/Weather
---------------
- [Actuaries Climate Index](http://actuariesclimateindex.org/data/) "single" "credentials"
- [Australian Weather](http://www.bom.gov.au/climate/dwo/) "collection" "free"
- [Aviation Weather Center - Consistent, timely and accurate weather
information for the world airspace
system](https://aviationweather.gov/adds/dataserver) "collection" "credentials"
- [Brazilian Weather - Historical data (In
Portuguese)](http://sinda.crn2.inpe.br/PCD/SITE/novo/site/) "collection" "credentials"
- [Canadian Meteorological
Centre](http://weather.gc.ca/grib/index_e.html) "collection" "free"
- [Climate Data from UEA (updated
monthly)](https://crudata.uea.ac.uk/cru/data/temperature/#datter%20and%20ftp://ftp.cmdl.noaa.gov/) "collection" "free"
- [European Climate Assessment & Dataset](http://eca.knmi.nl/) "collection" "free"
- [Global Climate Data Since 1929](http://en.tutiempo.net/climate) "collection" "free"
- [NASA Global Imagery Browse
Services](https://wiki.earthdata.nasa.gov/display/GIBS) "collection" "credentials"
- [NOAA Bering Sea Climate](http://www.beringclimate.noaa.gov/) "collection" "free"
- [NOAA Climate
Datasets](http://www.ncdc.noaa.gov/data-access/quick-links) "collection" "free"
- [NOAA Realtime Weather
Models](http://www.ncdc.noaa.gov/data-access/model-data/model-datasets/numerical-weather-prediction) "collection" "free"
- [NOAA SURFRAD Meteorology and Radiation
Datasets](https://www.esrl.noaa.gov/gmd/grad/stardata.html) "collection" "free"
- [The World Bank Open Data Resources for Climate
Change](http://data.worldbank.org/developers/climate-data-api) "collection" "free"
- [UEA Climatic Research Unit](http://www.cru.uea.ac.uk/data) "website unavailable"
- [WorldClim - Global Climate Data](http://www.worldclim.org) "single" "free"
- [WU Historical Weather
Worldwide](https://www.wunderground.com/history/index.html) "collection" "credentials"
Complex Networks
----------------
- [AMiner Citation Network Dataset](http://aminer.org/citation) "single" "free"
- [CrossRef DOI URLs](https://archive.org/details/doi-urls) "single" "credentials"
- [DBLP Citation
dataset](https://kdl.cs.umass.edu/display/public/DBLP) "single" "credentials"
- [DIMACS Road Networks
Collection](http://www.dis.uniroma1.it/challenge9/download.shtml) "collection" "free"
- [NBER Patent Citations](http://nber.org/patents/) "collection" "free"
- [Network Repository with Interactive Exploratory Analysis
Tools](http://networkrepository.com/) "collection" "credentials"
- [NIST complex networks data
collection](http://math.nist.gov/~RPozo/complex_datasets.html) "collection" "free"
- [Protein-protein interaction
network](http://vlado.fmf.uni-lj.si/pub/networks/data/bio/Yeast/Yeast.htm) "collection" "free"
- [PyPI and Maven Dependency
Network](https://ogirardot.wordpress.com/2013/01/31/sharing-pypimaven-dependency-data/) "collection" "free"
- [Scopus Citation
Database](https://www.elsevier.com/solutions/scopus) "single" "payment"
- [Small Network Data](http://www-personal.umich.edu/~mejn/netdata/) "collection" "free"
- [Stanford GraphBase (Steven
Skiena)](http://www3.cs.stonybrook.edu/~algorith/implement/graphbase/implement.shtml) "collection" "free"
- [Stanford Large Network Dataset
Collection](http://snap.stanford.edu/data/) "collection" "free"
- [Stanford Longitudinal Network Data
Sources](http://stanford.edu/group/sonia/dataSources/index.html) "collection" "free"
- [The Koblenz Network Collection](http://konect.uni-koblenz.de/) "collection" "free"
- [The Laboratory for Web Algorithmics
(UNIMI)](http://law.di.unimi.it/datasets.php) "collection" "free"
- [UCI Network Data
Repository](https://networkdata.ics.uci.edu/resources.php) "collection" "free"
- [UFL sparse matrix
collection](http://www.cise.ufl.edu/research/sparse/matrices/) "collection" "free"
- [WSU Graph Database](http://www.eecs.wsu.edu/mgd/gdb.html) "collection" "free"
Computer Networks
-----------------
- [3.5B Web Pages from CommonCrawl
2012](http://www.bigdatanews.com/profiles/blogs/big-data-set-3-5-billion-web-pages-made-available-for-all-of-us) "collection" "credentials"
- [53.5B Web clicks of 100K users in Indiana
Univ.](http://cnets.indiana.edu/groups/nan/webtraffic/click-dataset/) "single" "credentials"
- [CAIDA Internet Datasets](http://www.caida.org/data/overview/) "collection" "free"
- [ClueWeb09 - 1B web pages](http://lemurproject.org/clueweb09/) "single" "credentials"
- [ClueWeb12 - 733M web pages](http://lemurproject.org/clueweb12/) "single" "credentials"
- [CommonCrawl Web Data over 7
years](http://commoncrawl.org/the-data/get-started/) "collection" "payment"
- [CRAWDAD Wireless datasets from Dartmouth
Univ.](https://crawdad.cs.dartmouth.edu/) "collection" "credentials"
- [Criteo click-through
data](http://labs.criteo.com/2015/03/criteo-releases-its-new-dataset/) "collection" "free"
- [OONI: Open Observatory of Network Interference - Internet
censorship data](https://ooni.torproject.org/data/) "collection" "free"
- [Open Mobile Data by
MobiPerf](https://console.developers.google.com/storage/openmobiledata_public/) "collection" "payment"
- [Rapid7 Sonar Internet Scans](https://sonar.labs.rapid7.com/) "single" "free"
- [UCSD Network Telescope, IPv4 /8
net](http://www.caida.org/projects/network_telescope/) "collection" "payment"
Data Challenges
---------------
- [Bruteforce
Database](https://github.com/duyetdev/bruteforce-database) "collection" "payment"
- [Challenges in Machine Learning](http://www.chalearn.org/) "collection" "free"
- [CrowdANALYTIX dataX](http://data.crowdanalytix.com) "collection" "credentials"
- [D4D Challenge of Orange](http://www.d4d.orange.com/en/home) "collection" "credentials"
- [DrivenData Competitions for Social
Good](http://www.drivendata.org/) "collection" "credentials"
- [ICWSM Data Challenge (since 2009)](http://icwsm.cs.umbc.edu/) "collection" "credentials"
- [Kaggle Competition Data](https://www.kaggle.com/) "collection" "credentials"
- [KDD Cup by Tencent 2012](http://www.kddcup2012.org/) "collection" "credentials"
- [Localytics Data Visualization
Challenge](https://github.com/localytics/data-viz-challenge) "collection" "payment"
- [Netflix Prize](http://netflixprize.com/leaderboard.html) "single" "free"
- [Space Apps Challenge](https://2015.spaceappschallenge.org) "single" "free"
- [Telecom Italia Big Data
Challenge](https://dandelion.eu/datamine/open-big-data/) "collection" "credentials"
- [TravisTorrent Dataset - MSR'2017 Mining
Challenge](https://travistorrent.testroots.org/) "collection" "free"
- [Yelp Dataset Challenge](http://www.yelp.com/dataset_challenge) "single" "credentials"
Earth Science
-------------
- [AQUASTAT - Global water resources and
uses](http://www.fao.org/nr/water/aquastat/data/query/index.html?lang=en) "collection" "free"
- [BODC - marine data of \~22K vars](https://www.bodc.ac.uk/data/) "collection" "credentials"
- [Earth Models](http://www.earthmodels.org/) "collection" "credentials"
- [EOSDIS - NASA's earth observing system
data](http://sedac.ciesin.columbia.edu/data/sets/browse) "collection" "credentials"
- [Integrated Marine Observing System (IMOS) - roughly 30TB of ocean
measurements](https://imos.aodn.org.au) or [on
S3](http://imos-data.s3-website-ap-southeast-2.amazonaws.com/) "collection" "free"
- [Marinexplore - Open Oceanographic Data](http://marinexplore.org/) "collection" "credentials"
- [Smithsonian Institution Global Volcano and Eruption
Database](http://volcano.si.edu/) "collection" "free"
- [USGS Earthquake
Archives](http://earthquake.usgs.gov/earthquakes/search/) "collection" "free"
Economics
---------
- [American Economic Association
(AEA)](https://www.aeaweb.org/resources/data) "collection" "credentials"
- [EconData from
UMD](http://inforumweb.umd.edu/econdata/econdata.html)
- [Economic Freedom of the World
Data](http://www.freetheworld.com/datasets_efw.html) "collection" "payment"
- [Historical Macroeconomic
Statistics](http://www.historicalstatistics.org/) "collection" "free"
- [International Economics Database](http://widukind.cepremap.org/)
and [various data tools](https://github.com/Widukind) "collection" "free"
- [International Trade Statistics](http://www.econostatistics.co.za/) "collection" "free"
- [Internet Product Code Database](http://www.upcdatabase.com/) "collection" "credentials"
- [Joint External Debt Data Hub](http://www.jedh.org/) "collection" "free"
- [Jon Haveman International Trade Data
Links](http://www.macalester.edu/research/economics/PAGE/HAVEMAN/Trade.Resources/TradeData.html) "collection" "free"
- [OpenCorporates Database of Companies in the
World](https://opencorporates.com/) "collection" "credentials"
- [Our World in Data](http://ourworldindata.org/) "collection" "free"
- [SciencesPo World Trade Gravity
Datasets](http://econ.sciences-po.fr/thierry-mayer/data) "collection" "free"
- [The Atlas of Economic Complexity](http://atlas.cid.harvard.edu) "collection" "free"
- [The Center for International Data](http://cid.econ.ucdavis.edu) "collection" "free"
- [The Observatory of Economic
Complexity](http://atlas.media.mit.edu/en/) "collection" "free"
- [UN Commodity Trade Statistics](http://comtrade.un.org/db/) "collection" "credentials"
- [UN Human Development Reports](http://hdr.undp.org/en) "collection" "free"
Education
---------
- [College Scorecard Data](https://collegescorecard.ed.gov/data/) "single" "free"
- [Student Data from Free Code
Camp](http://academictorrents.com/details/030b10dad0846b5aecc3905692890fb02404adbf) "single" "credentials"
Energy
------
- [AMPds](http://ampds.org/) "single" "free"
- [COMBED](http://combed.github.io/) "single" "free"
- [DRED](http://www.st.ewi.tudelft.nl/~akshay/dred/) "collection" "credentials"
- [ECO](http://www.vs.inf.ethz.ch/res/show.html?what=eco-data) "single" "free"
- [EIA](http://www.eia.gov/electricity/data/eia923/) "collection" "free"
- [HES](http://randd.defra.gov.uk/Default.aspx?Menu=Menu&Module=More&Location=None&ProjectID=17359&FromSearch=Y&Publisher=1&SearchText=EV0702&SortString=ProjectCode&SortOrder=Asc&Paging=10#Description) - Household Electricity Study, UK "single" "free"
- [HFED](http://hfed.github.io/) "collection" "free"
- [iAWE](http://iawe.github.io/) "single" "free"
- [PLAID](http://plaidplug.com/) - the Plug Load Appliance
Identification Dataset "single" "free"
- [REDD](http://redd.csail.mit.edu/) "collection" "free"
- [Tracebase](https://www.tracebase.org) "collection" "free"
- [UK-DALE](http://www.doc.ic.ac.uk/~dk3810/data/) - UK Domestic
Appliance-Level Electricity "single" "free"
- [WHITED](http://nilmworkshop.org/2016/proceedings/Poster_ID18.pdf) "single" "free"
Finance
-------
- [CBOE Futures Exchange](http://cfe.cboe.com/Data/) "collection" "credentials"
- [Google Finance](https://www.google.com/finance) "collection" "credentials"
- [Google
Trends](http://www.google.com/trends?q=google&ctab=0&geo=all&date=all&sort=0) "collection" "credentials"
- [NASDAQ](https://data.nasdaq.com/) "collection" "credentials"
- [NYSE Market Data](ftp://ftp.nyxdata.com) (see FTP link on
[RAW](https://raw.githubusercontent.com/caesar0301/awesome-public-datasets/master/README.rst)) "collection" "free"
- [OANDA](http://www.oanda.com/) "collection" "credentials"
- [OSU Financial data](http://fisher.osu.edu/fin/fdf/osudata.htm) "collection" "free"
- [Quandl](https://www.quandl.com/) "collection" "credentials"
- [St Louis Federal](https://research.stlouisfed.org/fred2/) "collection" "credentials"
- [Yahoo Finance](http://finance.yahoo.com/) "collection" "credentials"
GIS
---
- [ArcGIS Open Data portal](http://opendata.arcgis.com/) "collection" "credentials"
- [Cambridge, MA, US, GIS data on
GitHub](http://cambridgegis.github.io/gisdata.html) "collection" "credentials"
- [Factual Global Location Data](https://www.factual.com/) "collection" "credentials"
- [Geo Spatial Data from ASU](http://geodacenter.asu.edu/datalist/) "collection" "credentials"
- [Geo Wiki Project - Citizen-driven Environmental
Monitoring](http://geo-wiki.org/) "collection" "credentials"
- [GeoFabrik - OSM data extracted to a variety of formats and
areas](http://download.geofabrik.de/) "collection" "free"
- [GeoNames Worldwide](http://www.geonames.org/) "collection" "credentials"
- [Global Administrative Areas Database (GADM)](http://www.gadm.org/) "collection" "free"
- [Homeland Infrastructure Foundation-Level
Data](https://hifld-dhs-gii.opendata.arcgis.com/) "collection" "free"
- [Landsat 8 on AWS](https://aws.amazon.com/public-data-sets/landsat/) "collection" "credentials"
- [List of all countries in all
languages](https://github.com/umpirsky/country-list) "collection" "payment"
- [National Weather Service GIS Data
Portal](http://www.nws.noaa.gov/gis/) "collection" "free"
- [Natural Earth - vectors and rasters of the
world](http://www.naturalearthdata.com/) "collection" "free"
- [OpenAddresses](http://openaddresses.io/) "collection" "free"
- [OpenStreetMap
(OSM)](http://wiki.openstreetmap.org/wiki/Downloading_data) "collection" "free"
- [Pleiades - Gazetteer and graph of ancient
places](http://pleiades.stoa.org/) "collection" "credentials"
- [Reverse Geocoder using OSM
data](https://github.com/kno10/reversegeocode) "collection" "payment" & [additional
high-resolution data files](http://data.ub.uni-muenchen.de/61/) "collection" "credentials"
- [TIGER/Line - U.S. boundaries and
roads](http://www.census.gov/geo/maps-data/data/tiger-line.html) "collection" "free"
- [TwoFishes - Foursquare's coarse
geocoder](https://github.com/foursquare/twofishes) "collection" "payment"
- [TZ Timezones shapefiles](http://efele.net/maps/tz/world/) "collection" "free"
- [UN Environmental Data](http://geodata.grid.unep.ch/) "collection" "credentials"
- [World countries in multiple
formats](https://github.com/mledoze/countries) "collection" "payment"
Government
----------
- [A list of cities and countries contributed by
community](https://github.com/caesar0301/awesome-public-datasets/blob/master/Government.rst) "collection" "payment"
- [Open Data for Africa](http://opendataforafrica.org/) "collection" "free"
- [OpenDataSoft's list of 1,600 open
data](https://www.opendatasoft.com/a-comprehensive-list-of-all-open-data-portals-around-the-world/) "collection" "payment"
Healthcare
----------
- [EHDP Large Health Data
Sets](http://www.ehdp.com/vitalnet/datasets.htm) "collection" "free"
- [Gapminder World demographic
databases](http://www.gapminder.org/data/) "collection" "free"
- [Medicare Coverage Database (MCD),
U.S.](https://www.cms.gov/medicare-coverage-database/) "collection" "free"
- [Medicare Data Engine of medicare.gov
Data](https://data.medicare.gov/) "collection" "credentials"
- [Medicare Data File](http://go.cms.gov/19xxPN4) "single" "free"
- [MeSH, the vocabulary thesaurus used for indexing articles for
PubMed](https://www.nlm.nih.gov/mesh/filelist.html) "collection" "free"
- [Number of Ebola Cases and Deaths in Affected Countries
(2014)](https://data.hdx.rwlabs.org/dataset/ebola-cases-2014) "collection" "credentials"
- [Open-ODS (structure of the UK NHS)](http://www.openods.co.uk) "single" "free"
- [OpenPaymentsData, Healthcare financial relationship
data](https://openpaymentsdata.cms.gov) "collection" "credentials"
- [The Cancer Genome Atlas project
(TCGA)](https://tcga-data.nci.nih.gov/tcga/tcgaDownload.jsp) and
[BigQuery
table](http://google-genomics.readthedocs.org/en/latest/use_cases/discover_public_data/isb_cgc_data.html) "collection" "free"
- [World Health Organization Global Health
Observatory](http://www.who.int/gho/en/) "collection" "free"
Image Processing
----------------
- [10k US Adult Faces
Database](http://wilmabainbridge.com/facememorability2.html) "collection" "credentials"
- [2GB of Photos of
Cats](http://137.189.35.203/WebUI/CatDatabase/catData.html) or
[Archive
version](https://web.archive.org/web/20150520175645/http://137.189.35.203/WebUI/CatDatabase/catData.html) "collection" "free"
- [Adience Unfiltered faces for gender and age
classification](http://www.openu.ac.il/home/hassner/Adience/data.html) "collection" "credentials"
- [Affective Image Classification](http://www.imageemotion.org/) "collection" "free"
- [Animals with attributes](http://attributes.kyb.tuebingen.mpg.de/) "single" "free"
- [Caltech Pedestrian Detection
Benchmark](https://www.vision.caltech.edu/Image_Datasets/CaltechPedestrians/) "collection" "free"
- [Chars74K dataset, Character Recognition in Natural Images (both
English and Kannada are
available)](http://www.ee.surrey.ac.uk/CVSSP/demos/chars74k/) "collection" "free"
- [Face Recognition Benchmark](http://www.face-rec.org/databases/) "collection" "free"
- [GDXray: X-ray images for X-ray testing and Computer
Vision](http://dmery.ing.puc.cl/index.php/material/gdxray/) "collection" "free"
- [ImageNet (in WordNet hierarchy)](http://www.image-net.org/) "collection" "free"
- [Indoor Scene
Recognition](http://web.mit.edu/torralba/www/indoor.html) "collection" "free"
- [International Affective Picture System,
UFL](http://csea.phhp.ufl.edu/media/iapsmessage.html) "collection" "free"
- [Massive Visual Memory Stimuli,
MIT](http://cvcl.mit.edu/MM/stimuli.html) "collection" "free"
- [MNIST database of handwritten digits, 70,000
examples](http://yann.lecun.com/exdb/mnist/) "collection" "free"
- [Stanford Dogs
Dataset](http://vision.stanford.edu/aditya86/ImageNetDogs/) "single" "free"
- [SUN database,
MIT](http://groups.csail.mit.edu/vision/SUN/hierarchy.html) "collection" "free"
- [The Action Similarity Labeling (ASLAN)
Challenge](http://www.openu.ac.il/home/hassner/data/ASLAN/ASLAN.html) "collection" "free"
- [The Oxford-IIIT Pet
Dataset](http://www.robots.ox.ac.uk/~vgg/data/pets/) "single" "free"
- [Violent-Flows - Crowd Violence Non-violence Database and
benchmark](http://www.openu.ac.il/home/hassner/data/violentflows/) "collection" "credentials"
- [Visual genome](http://visualgenome.org/api/v0/api_home.html) "single" "free"
- [YouTube Faces Database](http://www.cs.tau.ac.il/~wolf/ytfaces/) "collection" "credentials"
Machine Learning
----------------
- [Context-aware data sets from five
domains](https://github.com/irecsys/CARSKit/tree/master/context-aware_data_sets) "collection" "payment"
- [Delve Datasets for classification and regression (Univ. of
Toronto)](http://www.cs.toronto.edu/~delve/data/datasets.html) "collection" "free"
- [Discogs Monthly Data](http://data.discogs.com/) "collection" "free"
- [eBay Online Auctions
(2012)](http://www.modelingonlineauctions.com/datasets) "collection" "free"
- [IMDb Database](http://www.imdb.com/interfaces) "collection" "credentials"
- [Keel Repository for classification, regression and time
series](http://sci2s.ugr.es/keel/datasets.php) "collection" "free"
- [Labeled Faces in the Wild (LFW)](http://vis-www.cs.umass.edu/lfw/) "collection" "free"
- [Lending Club Loan
Data](https://www.lendingclub.com/info/download-data.action) "collection" "credentials"
- [Machine Learning Data Set Repository](http://mldata.org/) "collection" "credentials"
- [Million Song Dataset](http://labrosa.ee.columbia.edu/millionsong/) "single" "free"
- [More Song
Datasets](http://labrosa.ee.columbia.edu/millionsong/pages/additional-datasets) "collection" "free"
- [MovieLens Data Sets](http://grouplens.org/datasets/movielens/) "collection" "free"
- [New Yorker caption contest
ratings](https://github.com/nextml/caption-contest-data) "collection" "payment"
- [RDataMining - "R and Data Mining" ebook
data](http://www.rdatamining.com/data) "collection" "free"
- [Restaurants Health Score Data in San
Francisco](http://missionlocal.org/san-francisco-restaurant-health-inspections/) "single" "free"
- [UCI Machine Learning Repository](http://archive.ics.uci.edu/ml/) "collection" "free"
- [Yahoo! Ratings and Classification
Data](http://webscope.sandbox.yahoo.com/catalog.php?datatype=r) "collection" "credentials"
- [Youtube 8m](https://research.google.com/youtube8m/download.html) "single" "free"
Museums
-------
- [Canada Science and Technology Museums Corporation's Open
Data](http://techno-science.ca/en/data.php) "collection" "free"
- [Cooper-Hewitt's Collection
Database](https://github.com/cooperhewitt/collection) "collection" "payment"
- [Minneapolis Institute of Arts
metadata](https://github.com/artsmia/collection) "collection" "payment"
- [Natural History Museum (London) Data
Portal](http://data.nhm.ac.uk/) "collection" "free"
- [Rijksmuseum Historical Art
Collection](https://www.rijksmuseum.nl/en/api) "single" "free"
- [Tate Collection
metadata](https://github.com/tategallery/collection) "collection" "payment"
- [The Getty vocabularies](http://vocab.getty.edu) "collection" "free"
Natural Language
----------------
- [Automatic Keyphrase
Extraction](https://github.com/snkim/AutomaticKeyphraseExtraction/) "collection" "payment"
- [Blogger Corpus](http://u.cs.biu.ac.il/~koppel/BlogCorpus.htm) "single" "free"
- [CLiPS Stylometry Investigation
Corpus](http://www.clips.uantwerpen.be/datasets/csi-corpus) "single" "credentials"
- [ClueWeb09 FACC](http://lemurproject.org/clueweb09/FACC1/) "single" "free"
- [ClueWeb12 FACC](http://lemurproject.org/clueweb12/FACC1/) "single" "free"
- [DBpedia - 4.58M things with 583M
facts](http://wiki.dbpedia.org/Datasets) "collection" "free"
- [Flickr Personal
Taxonomies](http://www.isi.edu/~lerman/downloads/flickr/flickr_taxonomies.html) "collection" "free"
- [Freebase.com of people, places, and
things](http://www.freebase.com/) "collection" "free"
- [Google Books Ngrams
(2.2TB)](https://aws.amazon.com/datasets/google-books-ngrams/) "single" "credentials"
- [Google MC-AFP, generated based on the publicly available Gigaword
dataset using Paragraph Vectors](https://github.com/google/mcafp) "collection" "payment"
- [Google Web 5gram (1TB,
2006)](https://catalog.ldc.upenn.edu/LDC2006T13) "collection" "credentials"
- [Gutenberg eBooks
List](http://www.gutenberg.org/wiki/Gutenberg:Offline_Catalogs) "collection" "free"
- [Hansards text chunks of Canadian
Parliament](http://www.isi.edu/natural-language/download/hansard/) "collection" "free"
- [Machine Comprehension Test (MCTest) of text from Microsoft
Research](http://research.microsoft.com/en-us/um/redmond/projects/mctest/index.html) "single" "free"
- [Machine Translation of European
languages](http://statmt.org/wmt11/translation-task.html#download) "collection" "free"
- [Microsoft MAchine Reading COmprehension Dataset (or MS
MARCO)](http://www.msmarco.org/dataset.aspx) "single" "free"
- [Multi-Domain Sentiment Dataset (version
2.0)](http://www.cs.jhu.edu/~mdredze/datasets/sentiment/) "single" "free"
- [Open Multilingual Wordnet](http://compling.hss.ntu.edu.sg/omw/) "collection" "free"
- [Personae
Corpus](http://www.clips.uantwerpen.be/datasets/personae-corpus) "single" "credentials"
- [SaudiNewsNet Collection of Saudi Newspaper Articles (Arabic, 30K
articles)](https://github.com/ParallelMazen/SaudiNewsNet) "collection" "payment"
- [SMS Spam Collection in
English](http://www.dt.fee.unicamp.br/~tiago/smsspamcollection/) "single" "free"
- [Universal Dependencies](http://universaldependencies.org) "collection" "free"
- [USENET postings corpus of
2005\~2011](http://www.psych.ualberta.ca/~westburylab/downloads/usenetcorpus.download.html) "single" "credentials"
- [Webhose - News/Blogs in multiple
languages](https://webhose.io/datasets) "collection" "credentials"
- [Wikidata - Wikipedia
databases](https://www.wikidata.org/wiki/Wikidata:Database_download) "collection" "free"
- [Wikipedia Links data - 40 Million Entities in
Context](https://code.google.com/p/wiki-links/downloads/list) "collection" "free"
- [WordNet databases and
tools](http://wordnet.princeton.edu/wordnet/download/) "collection" "free"
Neuroscience
------------
- [Allen Institute Datasets](http://www.brain-map.org/) "collection" "free"
- [Brain Catalogue](http://braincatalogue.org/) "collection" "credentials"
- [Brainomics](http://brainomics.cea.fr/localizer) "single" "credentials"
- [Collaborative Research in Computational Neuroscience
(CRCNS)](http://crcns.org/data-sets) "collection" "free"
- [FCP-INDI](http://fcon_1000.projects.nitrc.org/index.html) "collection" "credentials"
- [Human Connectome Project](http://www.humanconnectome.org/data/) "collection" "free"
- [NDAR](https://ndar.nih.gov/) "collection" "credentials"
- [NeuroData](http://neurodata.io) "collection" "free"
- [Neuroelectro](http://neuroelectro.org/) "collection" "free"
- [NIMH Data Archive](http://data-archive.nimh.nih.gov/) "collection" "free"
- [OASIS](http://www.oasis-brains.org/) "collection" "free"
- [OpenfMRI](https://openfmri.org/) "collection" "credentials"
- [Study Forrest](http://studyforrest.org) "collection" "free"
Physics
-------
- [CERN Open Data Portal](http://opendata.cern.ch/) "collection" "free"
- [Crystallography Open Database](http://www.crystallography.net/) "collection" "free"
- [NASA Exoplanet Archive](http://exoplanetarchive.ipac.caltech.edu/) "collection" "credentials"
- [NSSDC (NASA) data of 550
spacecraft](http://nssdc.gsfc.nasa.gov/nssdc/obtaining_data.html) "collection" "free"
- [Sloan Digital Sky Survey (SDSS) - Mapping the
Universe](http://www.sdss.org/) "collection" "free"
Public Domains
--------------
- [Amazon](http://aws.amazon.com/datasets/) "collection" "credentials"
- [Archive-it from Internet
Archive](https://www.archive-it.org/explore?show=Collections) "collection" "credentials"
- [Archive.org Datasets](https://archive.org/details/datasets) "collection" "credentials"
- [CMU JASA data archive](http://lib.stat.cmu.edu/jasadata/) "collection" "free"
- [CMU StatLab collections](http://lib.stat.cmu.edu/datasets/) "collection" "free"
- [Data.World](https://data.world) "collection" "credentials"
- [Data360](http://www.data360.org/index.aspx) "collection" "free"
- [Google](http://www.google.com/publicdata/directory) "collection" "credentials"
- [Infochimps](http://www.infochimps.com/) "collection" "free"
- [KDNuggets Data
Collections](http://www.kdnuggets.com/datasets/index.html) "collection" "free"
- [Microsoft Azure Data Market Free
DataSets](http://datamarket.azure.com/browse/data?price=free) "collection" "credentials"
- [Microsoft Data Science for Research](http://aka.ms/Data-Science) "collection" "free"
- [Open Library Data Dumps](https://openlibrary.org/developers/dumps) "collection" "credentials"
- [Reddit Datasets](https://www.reddit.com/r/datasets) "collection" "credentials"
- [RevolutionAnalytics
Collection](http://packages.revolutionanalytics.com/datasets/) "collection" "free"
- [Sample R data
sets](http://stat.ethz.ch/R-manual/R-patched/library/datasets/html/00Index.html) "collection" "free"
- [StatSci.org](http://www.statsci.org/datasets.html) "collection" "free"
- [The Washington Post
List](http://www.washingtonpost.com/wp-srv/metro/data/datapost.html) "collection" "credentials"
- [UCLA SOCR data
collection](http://wiki.stat.ucla.edu/socr/index.php/SOCR_Data) "collection" "credentials"
- [UFO Reports](http://www.nuforc.org/webreports.html) "collection" "free"
- [Wikileaks 911 pager
intercepts](https://911.wikileaks.org/files/index.html) "collection" "free"
- [Yahoo Webscope](http://webscope.sandbox.yahoo.com/catalog.php) "collection" "free"
Search Engines
--------------
- [Academic Torrents of data sharing from
UMB](http://academictorrents.com/) "collection" "credentials"
- [Datahub.io](https://datahub.io/dataset) "collection" "credentials"
- [DataMarket (Qlik)](https://datamarket.com/data/list/?q=all) "collection" "credentials"
- [Harvard Dataverse Network of scientific
data](https://dataverse.harvard.edu/) "collection" "credentials"
- [ICPSR (UMICH)](http://www.icpsr.umich.edu/icpsrweb/ICPSR/index.jsp) "collection" "credentials"
- [Institute of Education Sciences](http://eric.ed.gov) "collection" "free"
- [National Technical Reports
Library](http://www.ntis.gov/products/ntrl/) "collection" "free"
- [Open Data Certificates
(beta)](https://certificates.theodi.org/en/datasets) "collection" "credentials"
- [OpenDataNetwork - A search engine of all Socrata powered data
portals](http://www.opendatanetwork.com/) "collection" "free"
- [Statista.com - Statistics and Studies](http://www.statista.com/) "collection" "credentials"
- [Zenodo - An open dependable home for the long-tail of
science](https://zenodo.org/collection/datasets) "collection" "credentials"
Social Networks
---------------
- [72 hours \#gamergate Twitter
Scrape](http://waxy.org/random/misc/gamergate_tweets.csv) "collection" "credentials"
- [Ancestry.com Forum Dataset over 10
years](http://www.cs.cmu.edu/~jelsas/data/ancestry.com/) "single" "free"
- [Cheng-Caverlee-Lee September 2009 - January 2010 Twitter
Scrape](https://archive.org/details/twitter_cikm_2010) "single" "free"
- [CMU Enron Email of 150 users](http://www.cs.cmu.edu/~enron/) "single" "free"
- [EDRM Enron Email of 151 users, hosted on
S3](https://aws.amazon.com/datasets/enron-email-data/) "single" "credentials"
- [Facebook Data Scrape
(2005)](https://archive.org/details/oxford-2005-facebook-matrix) "single" "credentials"
- [Facebook Social Networks from LAW (since
2007)](http://law.di.unimi.it/datasets.php) "collection" "free"
- [Foursquare from UMN/Sarwat
(2013)](https://archive.org/details/201309_foursquare_dataset_umn) "single" "credentials"
- [GitHub Collaboration Archive](https://www.githubarchive.org/) "collection" "free"
- [Google Scholar citation
relations](http://www3.cs.stonybrook.edu/~leman/data/gscholar.db) "single" "free"
- [High-Resolution Contact Networks from Wearable
Sensors](http://www.sociopatterns.org/datasets/) "collection" "free"
- [Mobile Social Networks from
UMASS](https://kdl.cs.umass.edu/display/public/Mobile+Social+Networks) "single" "credentials"
- [Network Twitter
Data](http://snap.stanford.edu/data/higgs-twitter.html) "single" "free"
- [Reddit
Comments](https://www.reddit.com/r/datasets/comments/3bxlg7/i_have_every_publicly_available_reddit_comment/) "single" "credentials"
- [Skytrax' Air Travel Reviews
Dataset](https://github.com/quankiquanki/skytrax-reviews-dataset) "single" "paid"
- [Social Twitter
Data](http://snap.stanford.edu/data/egonets-Twitter.html) "single" "free"
- [SourceForge.net Research
Data](http://www3.nd.edu/~oss/Data/data.html) "collection" "free"
- [Twitter Data for Online Reputation
Management](http://nlp.uned.es/replab2013/) "single" "free"
- [Twitter Data for Sentiment
Analysis](http://help.sentiment140.com/for-students/) "collection" "free"
- [Twitter Graph of entire Twitter
site](http://an.kaist.ac.kr/traces/WWW2010.html) "single" "free"
- [UNIMI/LAW Social Network
Datasets](http://law.di.unimi.it/datasets.php) "collection" "free"
- [Yahoo! Graph and Social
Data](http://webscope.sandbox.yahoo.com/catalog.php?datatype=g) "collection" "credentials"
- [YouTube Video Social Graph in
2007, 2008](http://netsg.cs.sfu.ca/youtubedata/) "single" "free"
Social Sciences
---------------
- [ACLED (Armed Conflict Location & Event Data
Project)](http://www.acleddata.com/) "collection" "free"
- [Canadian Legal Information
Institute](https://www.canlii.org/en/index.php) "collection" "free"
- [Center for Systemic Peace Datasets - Conflict Trends, Polities,
State Fragility, etc](http://www.systemicpeace.org/) "collection" "free"
- [Correlates of War Project](http://www.correlatesofwar.org/) "collection" "credentials"
- [Cryptome Conspiracy Theory Items](http://cryptome.org) "collection" "paid"
- [Datacards](http://datacards.org) "collection" "credentials"
- [European Social Survey](http://www.europeansocialsurvey.org/data/) "collection" "credentials"
- [FBI Hate Crime 2013 - aggregated
data](https://github.com/emorisse/FBI-Hate-Crime-Statistics/tree/master/2013) "collection" "paid"
- [Fragile States Index](http://fsi.fundforpeace.org/data) "collection" "paid"
- [GDELT Global Events Database](http://gdeltproject.org/data.html) "collection" "free"
- [General Social Survey (GSS) since 1972](http://gss.norc.org) "collection" "free"
- [German Social Survey](http://www.gesis.org/en/home/) "collection" "free"
- [Global Religious Futures
Project](http://www.globalreligiousfutures.org/) "collection" "free"
- [Humanitarian Data Exchange](https://data.hdx.rwlabs.org/) "collection" "credentials"
- [INFORM Index for Risk
Management](http://www.inform-index.org/Results/Global) "collection" "credentials"
- [Institute for Demographic Studies](http://www.ined.fr/en/) "collection" "free"
- [International Networks Archive](http://www.princeton.edu/~ina/) "collection" "free"
- [International Social Survey Program ISSP](http://www.issp.org) "collection" "free"
- [International Studies Compendium
Project](http://www.isacompendium.com/public/) "collection" "credentials"
- [James McGuire Cross National
Data](http://jmcguire.faculty.wesleyan.edu/welcome/cross-national-data/) "collection" "free"
- [MacroData Guide by Norsk samfunnsvitenskapelig
datatjeneste](http://nsd.uib.no) "collection" "free"
- [Minnesota Population Center](https://www.ipums.org/) "collection" "paid"
- [MIT Reality Mining
Dataset](http://realitycommons.media.mit.edu/realitymining.html) "single" "free"
- [Notre Dame Global Adaptation Index
(ND-GAIN)](http://index.gain.org/about/download) "collection" "free"
- [Open Crime and Policing Data in England, Wales and Northern
Ireland](https://data.police.uk/data/) "collection" "free"
- [Paul Hensel General International Data
Page](http://www.paulhensel.org/dataintl.html) "collection" "free"
- [PewResearch Internet Survey
Project](http://www.pewinternet.org/datasets/) "collection" "free"
- [PewResearch Society Data
Collection](http://www.pewresearch.org/data/download-datasets/) "collection" "free"
- [Political Polarity
Data](http://www3.cs.stonybrook.edu/~leman/data/14-icwsm-political-polarity-data.zip) "single" "free"
- [StackExchange Data Explorer](http://data.stackexchange.com/help) "collection" "credentials"
- [Terrorism Research and Analysis
Consortium](http://www.trackingterrorism.org/) "collection" "credentials"
- [Texas Inmates Executed Since
1984](http://www.tdcj.state.tx.us/death_row/dr_executed_offenders.html) "single" "free"
- [Titanic Survival Data
Set](https://github.com/caesar0301/awesome-public-datasets/tree/master/Datasets) or [on Kaggle](https://www.kaggle.com/c/titanic/data) "single" "paid"
- [UCB's Archive of Social Science Data
(D-Lab)](http://ucdata.berkeley.edu/) "collection" "free"
- [UCLA Social Sciences Data
Archive](http://dataarchives.ss.ucla.edu/Home.DataPortals.htm) "collection" "free"
- [UN Civil Society Database](http://esango.un.org/civilsociety/) "collection" "free"
- [Universities Worldwide](http://univ.cc/) "collection" "free"
- [UPJOHN for Labor Employment
Research](http://www.upjohn.org/services/resources/employment-research-data-center) "collection" "free"
- [Uppsala Conflict Data Program](http://ucdp.uu.se/) "collection" "free"
- [World Bank Open Data](http://data.worldbank.org/) "collection" "free"
- [WorldPop project - Worldwide human population
distributions](http://www.worldpop.org.uk/data/get_data/) "collection" "free"
Software
--------
- [FLOSSmole data about free, libre, and open source software
development](http://flossdata.syr.edu/data/) "collection" "free"
Sports
------
- [Basketball (NBA/NCAA/Euro) Player Database and
Statistics](http://www.draftexpress.com/stats.php) "collection" "credentials"
- [Betfair Historical Exchange Data](http://data.betfair.com/) "collection" "credentials"
- [Cricsheet Matches (cricket)](http://cricsheet.org/) "collection" "free"
- [Ergast Formula 1, from 1950 up to date
(API)](http://ergast.com/mrd/db) "collection" "credentials"
- [Football/Soccer resources (data and
APIs)](http://www.jokecamp.com/blog/guide-to-football-and-soccer-data-and-apis/) "collection" "free"
- [Lahman's Baseball
Database](http://www.seanlahman.com/baseball-archive/statistics/) "collection" "free"
- [Pinhooker: Thoroughbred Bloodstock Sale
Data](https://github.com/phillc73/pinhooker) "collection" "paid"
- [Retrosheet Baseball Statistics](http://www.retrosheet.org/game.htm) "collection" "free"
- [Tennis database of rankings, results, and stats for
ATP](https://github.com/JeffSackmann/tennis_atp),
[WTA](https://github.com/JeffSackmann/tennis_wta), [Grand
Slams](https://github.com/JeffSackmann/tennis_slam_pointbypoint) and
[Match Charting
Project](https://github.com/JeffSackmann/tennis_MatchChartingProject) "collection" "paid"
Time Series
-----------
- [Databanks International Cross National Time Series Data
Archive](http://www.cntsdata.com) "collection" "paid"
- [Hard Drive Failure
Rates](https://www.backblaze.com/hard-drive-test-data.html) "collection" "credentials"
- [Heart Rate Time Series from MIT](http://ecg.mit.edu/time-series/) "collection" "free"
- [Time Series Data Library (TSDL) from
MU](https://datamarket.com/data/list/?q=provider:tsdl) "collection" "credentials"
- [UC Riverside Time Series
Dataset](http://www.cs.ucr.edu/~eamonn/time_series_data/) "collection" "free"
Transportation
--------------
- [Airlines OD Data
1987-2008](http://stat-computing.org/dataexpo/2009/the-data.html) "collection" "credentials"
- [Bay Area Bike Share
Data](http://www.bayareabikeshare.com/open-data) "collection" "credentials"
- [Bike Share Systems (BSS)
collection](https://github.com/BetaNYC/Bike-Share-Data-Best-Practices/wiki/Bike-Share-Data-Systems) "collection" "paid"
- [GeoLife GPS Trajectory from Microsoft
Research](http://research.microsoft.com/en-us/downloads/b16d359d-d164-469e-9fd4-daa38f2b2e13/) "collection" "credentials"
- [German train system by Deutsche
Bahn](http://data.deutschebahn.com/datasets/) "collection" "free"
- [Hubway Million Rides in
MA](http://hubwaydatachallenge.org/trip-history-data/) "single" "free"
- [Marine Traffic - ship tracks, port calls and
more](http://www.marinetraffic.com/de/ais-api-services) "collection" "credentials"
- [Montreal BIXI Bike Share](https://montreal.bixi.com/en/open-data) "single" "free"
- [NYC Taxi Trip Data
2009-](http://www.nyc.gov/html/tlc/html/about/trip_record_data.shtml) "collection" "free"
- [NYC Taxi Trip Data 2013
(FOIA/FOILed)](https://archive.org/details/nycTaxiTripData2013) "single" "credentials"
- [NYC Uber trip data April 2014 to September
2014](https://github.com/fivethirtyeight/uber-tlc-foil-response) "collection" "paid"
- [Open Traffic
collection](https://github.com/graphhopper/open-traffic-collection) "collection" "credentials"
- [OpenFlights - airport, airline and route
data](http://openflights.org/data.html) "collection" "free"
- [Philadelphia Bike Share Stations
(JSON)](https://www.rideindego.com/stations/json/) "single" "free"
- [Plane Crash Database, since
1920](http://www.planecrashinfo.com/database.htm) "collection" "free"
- [RITA Airline On-Time Performance
data](http://www.transtats.bts.gov/Tables.asp?DB_ID=120) "collection" "free"
- [RITA/BTS transport data collection
(TranStat)](http://www.transtats.bts.gov/DataIndex.asp) "collection" "free"
- [Transport for London
(TFL)](https://tfl.gov.uk/info-for/open-data-users/our-open-data) "collection" "free"
- [Travel Tracker Survey (TTS) for
Chicago](http://www.cmap.illinois.gov/data/transportation/travel-tracker-survey) "collection" "free"
- [U.S. Bureau of Transportation Statistics
(BTS)](http://www.rita.dot.gov/bts/) "collection" "free"
- [U.S. Domestic Flights 1990 to
2009](http://academictorrents.com/details/a2ccf94bbb4af222bf8e69dad60a68a29f310d9a) "single" "paid"
- [U.S. Freight Analysis Framework since
2007](http://ops.fhwa.dot.gov/freight/freight_analysis/faf/index.htm) "collection" "free"
Complementary Collections
-------------------------
- [Data Packaged Core Datasets](https://github.com/datasets/) "collection" "paid"
- [Database of Scientific Code
Contributions](https://mozillascience.org/collaborate) "collection" "free"
- A growing collection of public datasets:
[CoolDatasets](http://cooldatasets.com/) "collection" "free"
- Inside-r: [Finding Data on the
Internet](http://www.inside-r.org/howto/finding-data-internet) "collection" "free"
- OpenDataMonitor: [An overview of available open data resources in
Europe](http://opendatamonitor.eu) "collection" "free"
- Quora: [Where can I find large datasets open to the
public?](http://www.quora.com/Where-can-I-find-large-datasets-open-to-the-public) "collection" "credentials"
- RS.io: [100+ Interesting Data Sets for
Statistics](http://rs.io/100-interesting-data-sets-for-statistics/) "collection" "free"
- StaTrek: [Leveraging open data to understand urban
lives](http://xiaming.me/posts/2014/10/23/leveraging-open-data-to-understand-urban-lives/) "collection" "free"