{"id":42103564,"url":"https://github.com/asampat3090/open-datasets","last_synced_at":"2026-01-26T13:09:59.952Z","repository":{"id":150262562,"uuid":"89372893","full_name":"asampat3090/open-datasets","owner":"asampat3090","description":"Running list of Open Datasets","archived":false,"fork":false,"pushed_at":"2017-05-09T12:40:25.000Z","size":21,"stargazers_count":23,"open_issues_count":0,"forks_count":7,"subscribers_count":7,"default_branch":"master","last_synced_at":"2024-01-27T14:38:51.359Z","etag":null,"topics":["artificial-intelligence","data","data-science","neural-network","open-datasets","open-source"],"latest_commit_sha":null,"homepage":"","language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/asampat3090.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2017-04-25T14:52:08.000Z","updated_at":"2023-10-01T08:27:35.000Z","dependencies_parsed_at":"2023-04-12T17:16:51.537Z","dependency_job_id":null,"html_url":"https://github.com/asampat3090/open-datasets","commit_stats":{"total_commits":7,"total_committers":2,"mean_commits":3.5,"dds":0.1428571428571429,"last_synced_commit":"cfe6f8291f465cefccd4ecde54a7872a4e04d5a1"},"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/asampat3090/open-datasets","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/asampat3090%2Fopen-datasets","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/asampat3090%2Fopen-datasets/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/asampat3090%2Fopen-datasets/releases","manifests_url":"https://repos.ecosyste.ms/api
/v1/hosts/GitHub/repositories/asampat3090%2Fopen-datasets/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/asampat3090","download_url":"https://codeload.github.com/asampat3090/open-datasets/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/asampat3090%2Fopen-datasets/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28778932,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-26T11:46:04.308Z","status":"ssl_error","status_checked_at":"2026-01-26T11:46:02.664Z","response_time":59,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["artificial-intelligence","data","data-science","neural-network","open-datasets","open-source"],"created_at":"2026-01-26T13:09:59.363Z","updated_at":"2026-01-26T13:09:59.945Z","avatar_url":"https://github.com/asampat3090.png","language":null,"funding_links":[],"categories":[],"sub_categories":[],"readme":"Open Datasets\n=======================\n\n[This list of public data\nsources](https://github.com/acusense/open-datasets) is\ncollected and tidied from blogs, answers, and user responses.\nEach dataset is designated as \"single\" or \"collection\" and\nas \"free\", \"payment\" or \"credentials\" (if you need to sign in to access the data but it's still 
free)\n\nGeneral\n-------\n-   [Cornell Natural Language Visual Reasoning Dataset](http://lic.nlp.cornell.edu/nlvr/) \"single\" \"free\"\n-   [Structured Wikipedia Data](http://wiki.dbpedia.org/about) \"collection\" \"free\" \"GNU License\"\n-   [UCI Machine Learning Repository](http://archive.ics.uci.edu/ml/) \"collection\" \"free\"\n-   [Socrata Open Datasets](https://dev.socrata.com/consumers/getting-started.html) \"collection\" \"free\"\n-   [Datasets for Data Mining and Data Science](http://www.kdnuggets.com/datasets/index.html)   \"collection\" \"free\"\n-   [List of datasets for machine learning research](https://en.wikipedia.org/wiki/List_of_datasets_for_machine_learning_research) \"collection\" \"free\"\n-   [Lexical Database for English](http://wordnet.princeton.edu/wordnet/download/) \"single\" \"free\"\n-   [Wolfram Data Repository](https://datarepository.wolframcloud.com/) \"collection\" \"free\"\n\nAgriculture\n-----------\n\n-   [U.S. Department of Agriculture's PLANTS\n    Database](http://www.plants.usda.gov/dl_all.html) \"single\" \"free\"\n-   [U.S. 
Department of Agriculture's Nutrient\n    Database](https://www.ars.usda.gov/northeast-area/beltsville-md/beltsville-human-nutrition-research-center/nutrient-data-laboratory/docs/sr28-download-files/) \"collection\" \"free\"\n\nBiology\n-------\n\n-   [1000 Genomes](http://www.1000genomes.org/data) \"collection\" \"free\"\n-   [American Gut (Microbiome\n    Project)](https://github.com/biocore/American-Gut) \"collection\" \"free\"\n-   [Broad Bioimage Benchmark Collection\n    (BBBC)](https://www.broadinstitute.org/bbbc) \"collection\" \"free\"\n-   [Broad Cancer Cell Line Encyclopedia\n    (CCLE)](http://www.broadinstitute.org/ccle/home) \"collection\" \"credentials\"\n-   [Cell Image Library](http://www.cellimagelibrary.org) \"collection\" \"free\"\n-   [Complete Genomics Public\n    Data](http://www.completegenomics.com/public-data/69-genomes/)  \"collection\" \"free\"\n-   [EBI ArrayExpress](http://www.ebi.ac.uk/arrayexpress/) \"collection\" \"free\"\n-   [EBI Protein Data Bank in\n    Europe](http://www.ebi.ac.uk/pdbe/emdb/index.html/) \"collection\" \"free\"\n-   [Electron Microscopy Pilot Image Archive\n    (EMPIAR)](http://www.ebi.ac.uk/pdbe/emdb/empiar/) \"collection\" \"free\"\n-   [ENCODE project](https://www.encodeproject.org) \"collection\" \"free\"\n-   [Ensembl Genomes](http://ensemblgenomes.org/info/genomes) \"collection\" \"free\"\n-   [Gene Expression Omnibus (GEO)](http://www.ncbi.nlm.nih.gov/geo/) \"collection\" \"free\"\n-   [Gene Ontology\n    (GO)](http://geneontology.org/page/download-annotations) \"collection\" \"free\"\n-   [Global Biotic Interactions\n    (GloBI)](https://github.com/jhpoelen/eol-globi-data/wiki#accessing-species-interaction-data) \"single\" \"free\"\n-   [Harvard Medical School (HMS) LINCS\n    Project](http://lincs.hms.harvard.edu) \"collection\" \"free\"\n-   [Human Genome Diversity\n    Project](http://www.hagsc.org/hgdp/files.html) \"single\" \"free\"\n-   [Human Microbiome Project\n    
(HMP)](http://www.hmpdacc.org/reference_genomes/reference_genomes.php) \"collection\" \"free\"\n-   [ICOS PSP Benchmark](http://ico2s.org/datasets/psp_benchmark.html) \"collection\" \"free\"\n-   [International HapMap\n    Project](http://hapmap.ncbi.nlm.nih.gov/downloads/index.html.en) \"single\" \"free\"\n-   [Journal of Cell Biology\n    DataViewer](http://jcb-dataviewer.rupress.org) \"collection\" \"free\"\n-   [MIT Cancer Genomics\n    Data](http://www.broadinstitute.org/cgi-bin/cancer/datasets.cgi) \"collection\" \"free\"\n-   [NCBI\n    Proteins](http://www.ncbi.nlm.nih.gov/guide/proteins/#databases) \"collection\" \"credentials\"\n-   [NCBI Taxonomy](http://www.ncbi.nlm.nih.gov/taxonomy) \"single\" \"credentials\"\n-   [NCI Genomic Data Commons](https://gdc-portal.nci.nih.gov) \"collection\" \"free\"\n-   [NIH Microarray data](http://bit.do/VVW6) or\n    [FTP](ftp://ftp.ncbi.nih.gov/pub/geo/DATA/supplementary/series/GSE6532/)\n    (see FTP link on\n    [RAW](https://raw.githubusercontent.com/caesar0301/awesome-public-datasets/master/README.rst)) \"collection\" \"free\"\n-   [OpenSNP genotypes data](https://opensnp.org/) \"collection\" \"credentials\"\n-   [Pathguide - Protein-Protein Interactions\n    Catalog](http://www.pathguide.org/) \"collection\" \"free\"\n-   [Protein Data Bank](http://www.rcsb.org/) \"collection\" \"credentials\"\n-   [Psychiatric Genomics\n    Consortium](https://www.med.unc.edu/pgc/downloads) \"collection\" \"credentials\"\n-   [PubChem Project](https://pubchem.ncbi.nlm.nih.gov/) \"collection\" \"free\"\n-   [PubGene (now Coremine Medical)](http://www.pubgene.org/) \"collection\" \"credentials\"\n-   [Sanger Catalogue of Somatic Mutations in Cancer\n    (COSMIC)](http://cancer.sanger.ac.uk/cosmic) \"collection\" \"credentials\"\n-   [Sanger Genomics of Drug Sensitivity in Cancer Project\n    (GDSC)](http://www.cancerrxgene.org/) \"collection\" \"free\"\n-   [Sequence Read\n    Archive (SRA)](http://www.ncbi.nlm.nih.gov/Traces/sra/) 
\"collection\" \"free\"\n-   [Stowers Institute Original Data\n    Repository](http://www.stowers.org/research/publications/odr) \"collection\" \"free\"\n-   [Systems Science of Biological Dynamics (SSBD)\n    Database](http://ssbd.qbic.riken.jp) \"collection\" \"free\"\n-   [The Cancer Genome Atlas (TCGA), available via Broad\n    GDAC](https://gdac.broadinstitute.org/) \"collection\" \"free\"\n-   [The Catalogue of\n    Life](http://www.catalogueoflife.org/content/annual-checklist-archive) \"collection\" \"free\"\n-   [The Personal Genome Project](http://www.personalgenomes.org/) or\n    [PGP](https://my.pgp-hms.org/public_genetic_data) \"collection\" \"credentials\"\n-   [UCSC Public Data](http://hgdownload.soe.ucsc.edu/downloads.html) \"collection\" \"free\"\n-   [UniGene](http://www.ncbi.nlm.nih.gov/unigene) \"collection\" \"credentials\"\n-   [Universal Protein Resource\n    (UniProt)](http://www.uniprot.org/downloads) \"collection\" \"free\"\n\nClimate/Weather\n---------------\n\n-   [Actuaries Climate Index](http://actuariesclimateindex.org/data/) \"single\" \"credentials\"\n-   [Australian Weather](http://www.bom.gov.au/climate/dwo/) \"collection\" \"free\"\n-   [Aviation Weather Center - Consistent, timely and accurate weather\n    information for the world airspace\n    system](https://aviationweather.gov/adds/dataserver) \"collection\" \"credentials\"\n-   [Brazilian Weather - Historical data (In\n    Portuguese)](http://sinda.crn2.inpe.br/PCD/SITE/novo/site/) \"collection\" \"credentials\"\n-   [Canadian Meteorological\n    Centre](http://weather.gc.ca/grib/index_e.html) \"collection\" \"free\"\n-   [Climate Data from UEA (updated\n    monthly)](https://crudata.uea.ac.uk/cru/data/temperature/#datter%20and%20ftp://ftp.cmdl.noaa.gov/) \"collection\" \"free\"\n-   [European Climate Assessment \u0026 Dataset](http://eca.knmi.nl/) \"collection\" \"free\"\n-   [Global Climate Data Since 1929](http://en.tutiempo.net/climate) \"collection\" \"free\"\n-   
[NASA Global Imagery Browse\n    Services](https://wiki.earthdata.nasa.gov/display/GIBS) \"collection\" \"credentials\"\n-   [NOAA Bering Sea Climate](http://www.beringclimate.noaa.gov/) \"collection\" \"free\"\n-   [NOAA Climate\n    Datasets](http://www.ncdc.noaa.gov/data-access/quick-links) \"collection\" \"free\"\n-   [NOAA Realtime Weather\n    Models](http://www.ncdc.noaa.gov/data-access/model-data/model-datasets/numerical-weather-prediction) \"collection\" \"free\"\n-   [NOAA SURFRAD Meteorology and Radiation\n    Datasets](https://www.esrl.noaa.gov/gmd/grad/stardata.html) \"collection\" \"free\"\n-   [The World Bank Open Data Resources for Climate\n    Change](http://data.worldbank.org/developers/climate-data-api) \"collection\" \"free\"\n-   [UEA Climatic Research Unit](http://www.cru.uea.ac.uk/data) \"website unavailable\"\n-   [WorldClim - Global Climate Data](http://www.worldclim.org) \"single\" \"free\"\n-   [WU Historical Weather\n    Worldwide](https://www.wunderground.com/history/index.html) \"collection\" \"credentials\"\n\nComplex Networks\n----------------\n\n-   [AMiner Citation Network Dataset](http://aminer.org/citation) \"single\" \"free\"\n-   [CrossRef DOI URLs](https://archive.org/details/doi-urls) \"single\" \"credentials\"\n-   [DBLP Citation\n    dataset](https://kdl.cs.umass.edu/display/public/DBLP) \"single\" \"credentials\"\n-   [DIMACS Road Networks\n    Collection](http://www.dis.uniroma1.it/challenge9/download.shtml) \"collection\" \"free\"\n-   [NBER Patent Citations](http://nber.org/patents/) \"collection\" \"free\"\n-   [Network Repository with Interactive Exploratory Analysis\n    Tools](http://networkrepository.com/) \"collection\" \"credentials\"\n-   [NIST complex networks data\n    collection](http://math.nist.gov/~RPozo/complex_datasets.html) \"collection\" \"free\"\n-   [Protein-protein interaction\n    network](http://vlado.fmf.uni-lj.si/pub/networks/data/bio/Yeast/Yeast.htm) \"collection\" \"free\"\n-   [PyPI and Maven 
Dependency\n    Network](https://ogirardot.wordpress.com/2013/01/31/sharing-pypimaven-dependency-data/) \"collection\" \"free\"\n-   [Scopus Citation\n    Database](https://www.elsevier.com/solutions/scopus) \"single\" \"payment\"\n-   [Small Network Data](http://www-personal.umich.edu/~mejn/netdata/) \"collection\" \"free\"\n-   [Stanford GraphBase (Steven\n    Skiena)](http://www3.cs.stonybrook.edu/~algorith/implement/graphbase/implement.shtml) \"collection\" \"free\"\n-   [Stanford Large Network Dataset\n    Collection](http://snap.stanford.edu/data/) \"collection\" \"free\"\n-   [Stanford Longitudinal Network Data\n    Sources](http://stanford.edu/group/sonia/dataSources/index.html) \"collection\" \"free\"\n-   [The Koblenz Network Collection](http://konect.uni-koblenz.de/) \"collection\" \"free\"\n-   [The Laboratory for Web Algorithmics\n    (UNIMI)](http://law.di.unimi.it/datasets.php) \"collection\" \"free\"\n-   [UCI Network Data\n    Repository](https://networkdata.ics.uci.edu/resources.php) \"collection\" \"free\"\n-   [UFL sparse matrix\n    collection](http://www.cise.ufl.edu/research/sparse/matrices/) \"collection\" \"free\"\n-   [WSU Graph Database](http://www.eecs.wsu.edu/mgd/gdb.html) \"collection\" \"free\"\n\nComputer Networks\n-----------------\n\n-   [3.5B Web Pages from CommonCrawl\n    2012](http://www.bigdatanews.com/profiles/blogs/big-data-set-3-5-billion-web-pages-made-available-for-all-of-us) \"collection\" \"credentials\"\n-   [53.5B Web clicks of 100K users in Indiana\n    Univ.](http://cnets.indiana.edu/groups/nan/webtraffic/click-dataset/) \"single\" \"credentials\"\n-   [CAIDA Internet Datasets](http://www.caida.org/data/overview/) \"collection\" \"free\"\n-   [ClueWeb09 - 1B web pages](http://lemurproject.org/clueweb09/) \"single\" \"credentials\"\n-   [ClueWeb12 - 733M web pages](http://lemurproject.org/clueweb12/) \"single\" \"credentials\"\n-   [CommonCrawl Web Data over 7\n    years](http://commoncrawl.org/the-data/get-started/) 
\"collection\" \"payment\"\n-   [CRAWDAD Wireless datasets from Dartmouth\n    Univ.](https://crawdad.cs.dartmouth.edu/) \"collection\" \"credentials\"\n-   [Criteo click-through\n    data](http://labs.criteo.com/2015/03/criteo-releases-its-new-dataset/) \"collection\" \"free\"\n-   [OONI: Open Observatory of Network Interference - Internet\n    censorship data](https://ooni.torproject.org/data/) \"collection\" \"free\"\n-   [Open Mobile Data by\n    MobiPerf](https://console.developers.google.com/storage/openmobiledata_public/) \"collection\" \"payment\"\n-   [Rapid7 Sonar Internet Scans](https://sonar.labs.rapid7.com/) \"single\" \"free\"\n-   [UCSD Network Telescope, IPv4 /8\n    net](http://www.caida.org/projects/network_telescope/) \"collection\" \"payment\"\n\nData Challenges\n---------------\n\n-   [Bruteforce\n    Database](https://github.com/duyetdev/bruteforce-database) \"collection\" \"payment\"\n-   [Challenges in Machine Learning](http://www.chalearn.org/) \"collection\" \"free\"\n-   [CrowdANALYTIX dataX](http://data.crowdanalytix.com) \"collection\" \"credentials\"\n-   [D4D Challenge of Orange](http://www.d4d.orange.com/en/home) \"collection\" \"credentials\"\n-   [DrivenData Competitions for Social\n    Good](http://www.drivendata.org/) \"collection\" \"credentials\"\n-   [ICWSM Data Challenge (since 2009)](http://icwsm.cs.umbc.edu/) \"collection\" \"credentials\"\n-   [Kaggle Competition Data](https://www.kaggle.com/) \"collection\" \"credentials\"\n-   [KDD Cup by Tencent 2012](http://www.kddcup2012.org/) \"collection\" \"credentials\"\n-   [Localytics Data Visualization\n    Challenge](https://github.com/localytics/data-viz-challenge) \"collection\" \"payment\"\n-   [Netflix Prize](http://netflixprize.com/leaderboard.html) \"single\" \"free\"\n-   [Space Apps Challenge](https://2015.spaceappschallenge.org) \"single\" \"free\"\n-   [Telecom Italia Big Data\n    Challenge](https://dandelion.eu/datamine/open-big-data/) \"collection\" 
\"credentials\"\n-   [TravisTorrent Dataset - MSR'2017 Mining\n    Challenge](https://travistorrent.testroots.org/) \"collection\" \"free\"\n-   [Yelp Dataset Challenge](http://www.yelp.com/dataset_challenge) \"single\" \"credentials\"\n\nEarth Science\n-------------\n\n-   [AQUASTAT - Global water resources and\n    uses](http://www.fao.org/nr/water/aquastat/data/query/index.html?lang=en) \"collection\" \"free\"\n-   [BODC - marine data of \\~22K vars](https://www.bodc.ac.uk/data/) \"collection\" \"credentials\"\n-   [Earth Models](http://www.earthmodels.org/) \"collection\" \"credentials\"\n-   [EOSDIS - NASA's earth observing system\n    data](http://sedac.ciesin.columbia.edu/data/sets/browse) \"collection\" \"credentials\"\n-   [Integrated Marine Observing System (IMOS) - roughly 30TB of ocean\n    measurements](https://imos.aodn.org.au) or [on\n    S3](http://imos-data.s3-website-ap-southeast-2.amazonaws.com/) \"collection\" \"free\"\n-   [Marinexplore - Open Oceanographic Data](http://marinexplore.org/) \"collection\" \"credentials\"\n-   [Smithsonian Institution Global Volcano and Eruption\n    Database](http://volcano.si.edu/) \"collection\" \"free\"\n-   [USGS Earthquake\n    Archives](http://earthquake.usgs.gov/earthquakes/search/) \"collection\" \"free\"\n\nEconomics\n---------\n\n-   [American Economic Association\n    (AEA)](https://www.aeaweb.org/resources/data) \"collection\" \"credentials\"\n-   [EconData from\n    UMD](http://inforumweb.umd.edu/econdata/econdata.html)\n-   [Economic Freedom of the World\n    Data](http://www.freetheworld.com/datasets_efw.html) \"collection\" \"payment\"\n-   [Historical MacroEconomic\n    Statistics](http://www.historicalstatistics.org/) \"collection\" \"free\"\n-   [International Economics Database](http://widukind.cepremap.org/)\n    and [various data tools](https://github.com/Widukind) \"collection\" \"free\"\n-   [International Trade Statistics](http://www.econostatistics.co.za/) \"collection\" \"free\"\n-   
[Internet Product Code Database](http://www.upcdatabase.com/) \"collection\" \"credentials\"\n-   [Joint External Debt Data Hub](http://www.jedh.org/) \"collection\" \"free\"\n-   [Jon Haveman International Trade Data\n    Links](http://www.macalester.edu/research/economics/PAGE/HAVEMAN/Trade.Resources/TradeData.html) \"collection\" \"free\"\n-   [OpenCorporates Database of Companies in the\n    World](https://opencorporates.com/) \"collection\" \"credentials\"\n-   [Our World in Data](http://ourworldindata.org/) \"collection\" \"free\"\n-   [SciencesPo World Trade Gravity\n    Datasets](http://econ.sciences-po.fr/thierry-mayer/data) \"collection\" \"free\"\n-   [The Atlas of Economic Complexity](http://atlas.cid.harvard.edu) \"collection\" \"free\"\n-   [The Center for International Data](http://cid.econ.ucdavis.edu) \"collection\" \"free\"\n-   [The Observatory of Economic\n    Complexity](http://atlas.media.mit.edu/en/) \"collection\" \"free\"\n-   [UN Commodity Trade Statistics](http://comtrade.un.org/db/) \"collection\" \"credentials\"\n-   [UN Human Development Reports](http://hdr.undp.org/en) \"collection\" \"free\"\n\nEducation\n---------\n\n-   [College Scorecard Data](https://collegescorecard.ed.gov/data/) \"single\" \"free\"\n-   [Student Data from Free Code\n    Camp](http://academictorrents.com/details/030b10dad0846b5aecc3905692890fb02404adbf) \"single\" \"credentials\"\n\nEnergy\n------\n\n-   [AMPds](http://ampds.org/) \"single\" \"free\"\n-   [COMBED](http://combed.github.io/) \"single\" \"free\"\n-   [DRED](http://www.st.ewi.tudelft.nl/~akshay/dred/) \"collection\" \"credentials\"\n-   [ECO](http://www.vs.inf.ethz.ch/res/show.html?what=eco-data) \"single\" \"free\"\n-   [EIA](http://www.eia.gov/electricity/data/eia923/) \"collection\" \"free\"\n-   
[HES](http://randd.defra.gov.uk/Default.aspx?Menu=Menu\u0026Module=More\u0026Location=None\u0026ProjectID=17359\u0026FromSearch=Y\u0026Publisher=1\u0026SearchText=EV0702\u0026SortString=ProjectCode\u0026SortOrder=Asc\u0026Paging=10#Description) \"single\" \"free\"\n    - Household Electricity Study, UK\n-   [HFED](http://hfed.github.io/) \"collection\" \"free\"\n-   [iAWE](http://iawe.github.io/) \"single\" \"free\"\n-   [PLAID](http://plaidplug.com/) - the Plug Load Appliance\n    Identification Dataset \"single\" \"free\"\n-   [REDD](http://redd.csail.mit.edu/) \"collection\" \"free\"\n-   [Tracebase](https://www.tracebase.org) \"collection\" \"free\"\n-   [UK-DALE](http://www.doc.ic.ac.uk/~dk3810/data/) - UK Domestic\n    Appliance-Level Electricity \"single\" \"free\"\n-   [WHITED](http://nilmworkshop.org/2016/proceedings/Poster_ID18.pdf) \"single\" \"free\"\n\nFinance\n-------\n\n-   [CBOE Futures Exchange](http://cfe.cboe.com/Data/) \"collection\" \"credentials\"\n-   [Google Finance](https://www.google.com/finance) \"collection\" \"credentials\"\n-   [Google\n    Trends](http://www.google.com/trends?q=google\u0026ctab=0\u0026geo=all\u0026date=all\u0026sort=0) \"collection\" \"credentials\"\n-   [NASDAQ](https://data.nasdaq.com/) \"collection\" \"credentials\"\n-   [NYSE Market Data](ftp://ftp.nyxdata.com) (see FTP link on\n    [RAW](https://raw.githubusercontent.com/caesar0301/awesome-public-datasets/master/README.rst)) \"collection\" \"free\"\n-   [OANDA](http://www.oanda.com/) \"collection\" \"credentials\"\n-   [OSU Financial data](http://fisher.osu.edu/fin/fdf/osudata.htm) \"collection\" \"free\"\n-   [Quandl](https://www.quandl.com/) \"collection\" \"credentials\"\n-   [St Louis Federal](https://research.stlouisfed.org/fred2/) \"collection\" \"credentials\"\n-   [Yahoo Finance](http://finance.yahoo.com/) \"collection\" \"credentials\"\n\nGIS\n---\n\n-   [ArcGIS Open Data portal](http://opendata.arcgis.com/) \"collection\" \"credentials\"\n-   
[Cambridge, MA, US, GIS data on\n    GitHub](http://cambridgegis.github.io/gisdata.html) \"collection\" \"credentials\"\n-   [Factual Global Location Data](https://www.factual.com/) \"collection\" \"credentials\"\n-   [Geo Spatial Data from ASU](http://geodacenter.asu.edu/datalist/) \"collection\" \"credentials\"\n-   [Geo Wiki Project - Citizen-driven Environmental\n    Monitoring](http://geo-wiki.org/) \"collection\" \"credentials\"\n-   [GeoFabrik - OSM data extracted to a variety of formats and\n    areas](http://download.geofabrik.de/) \"collection\" \"free\"\n-   [GeoNames Worldwide](http://www.geonames.org/) \"collection\" \"credentials\"\n-   [Global Administrative Areas Database (GADM)](http://www.gadm.org/) \"collection\" \"free\"\n-   [Homeland Infrastructure Foundation-Level\n    Data](https://hifld-dhs-gii.opendata.arcgis.com/) \"collection\" \"free\"\n-   [Landsat 8 on AWS](https://aws.amazon.com/public-data-sets/landsat/) \"collection\" \"credentials\"\n-   [List of all countries in all\n    languages](https://github.com/umpirsky/country-list) \"collection\" \"payment\"\n-   [National Weather Service GIS Data\n    Portal](http://www.nws.noaa.gov/gis/) \"collection\" \"free\"\n-   [Natural Earth - vectors and rasters of the\n    world](http://www.naturalearthdata.com/) \"collection\" \"free\"\n-   [OpenAddresses](http://openaddresses.io/) \"collection\" \"free\"\n-   [OpenStreetMap\n    (OSM)](http://wiki.openstreetmap.org/wiki/Downloading_data) \"collection\" \"free\"\n-   [Pleiades - Gazetteer and graph of ancient\n    places](http://pleiades.stoa.org/) \"collection\" \"credentials\"\n-   [Reverse Geocoder using OSM\n    data](https://github.com/kno10/reversegeocode) \"collection\" \"payment\" \u0026 [additional\n    high-resolution data files](http://data.ub.uni-muenchen.de/61/) \"collection\" \"credentials\"\n-   [TIGER/Line - U.S. 
boundaries and\n    roads](http://www.census.gov/geo/maps-data/data/tiger-line.html) \"collection\" \"free\"\n-   [TwoFishes - Foursquare's coarse\n    geocoder](https://github.com/foursquare/twofishes) \"collection\" \"payment\"\n-   [TZ Timezones shapefiles](http://efele.net/maps/tz/world/) \"collection\" \"free\"\n-   [UN Environmental Data](http://geodata.grid.unep.ch/) \"collection\" \"credentials\"\n-   [World countries in multiple\n    formats](https://github.com/mledoze/countries) \"collection\" \"payment\"\n\nGovernment\n----------\n\n-   [A list of cities and countries contributed by\n    community](https://github.com/caesar0301/awesome-public-datasets/blob/master/Government.rst) \"collection\" \"payment\"\n-   [Open Data for Africa](http://opendataforafrica.org/) \"collection\" \"free\"\n-   [OpenDataSoft's list of 1,600 open\n    data](https://www.opendatasoft.com/a-comprehensive-list-of-all-open-data-portals-around-the-world/) \"collection\" \"payment\"\n\nHealthcare\n----------\n\n-   [EHDP Large Health Data\n    Sets](http://www.ehdp.com/vitalnet/datasets.htm) \"collection\" \"free\"\n-   [Gapminder World demographic\n    databases](http://www.gapminder.org/data/) \"collection\" \"free\"\n-   [Medicare Coverage Database (MCD),\n    U.S.](https://www.cms.gov/medicare-coverage-database/) \"collection\" \"free\"\n-   [Medicare Data Engine of medicare.gov\n    Data](https://data.medicare.gov/) \"collection\" \"credentials\"\n-   [Medicare Data File](http://go.cms.gov/19xxPN4) \"single\" \"free\"\n-   [MeSH, the vocabulary thesaurus used for indexing articles for\n    PubMed](https://www.nlm.nih.gov/mesh/filelist.html) \"collection\" \"free\"\n-   [Number of Ebola Cases and Deaths in Affected Countries\n    (2014)](https://data.hdx.rwlabs.org/dataset/ebola-cases-2014) \"collection\" \"credentials\"\n-   [Open-ODS (structure of the UK NHS)](http://www.openods.co.uk) \"single\" \"free\"\n-   [OpenPaymentsData, Healthcare financial relationship\n    
data](https://openpaymentsdata.cms.gov) \"collection\" \"credentials\"\n-   [The Cancer Genome Atlas project\n    (TCGA)](https://tcga-data.nci.nih.gov/tcga/tcgaDownload.jsp) and\n    [BigQuery\n    table](http://google-genomics.readthedocs.org/en/latest/use_cases/discover_public_data/isb_cgc_data.html) \"collection\" \"free\"\n-   [World Health Organization Global Health\n    Observatory](http://www.who.int/gho/en/) \"collection\" \"free\"\n\nImage Processing\n----------------\n\n-   [10k US Adult Faces\n    Database](http://wilmabainbridge.com/facememorability2.html) \"collection\" \"credentials\"\n-   [2GB of Photos of\n    Cats](http://137.189.35.203/WebUI/CatDatabase/catData.html) or\n    [Archive\n    version](https://web.archive.org/web/20150520175645/http://137.189.35.203/WebUI/CatDatabase/catData.html) \"collection\" \"free\"\n-   [Adience Unfiltered faces for gender and age\n    classification](http://www.openu.ac.il/home/hassner/Adience/data.html) \"collection\" \"credentials\"\n-   [Affective Image Classification](http://www.imageemotion.org/) \"collection\" \"free\"\n-   [Animals with attributes](http://attributes.kyb.tuebingen.mpg.de/) \"single\" \"free\"\n-   [Caltech Pedestrian Detection\n    Benchmark](https://www.vision.caltech.edu/Image_Datasets/CaltechPedestrians/) \"collection\" \"free\"\n-   [Chars74K dataset, Character Recognition in Natural Images (both\n    English and Kannada are\n    available)](http://www.ee.surrey.ac.uk/CVSSP/demos/chars74k/) \"collection\" \"free\"\n-   [Face Recognition Benchmark](http://www.face-rec.org/databases/) \"collection\" \"free\"\n-   [GDXray: X-ray images for X-ray testing and Computer\n    Vision](http://dmery.ing.puc.cl/index.php/material/gdxray/) \"collection\" \"free\"\n-   [ImageNet (in WordNet hierarchy)](http://www.image-net.org/) \"collection\" \"free\"\n-   [Indoor Scene\n    Recognition](http://web.mit.edu/torralba/www/indoor.html) \"collection\" \"free\"\n-   [International Affective Picture 
System,\n    UFL](http://csea.phhp.ufl.edu/media/iapsmessage.html) \"collection\" \"free\"\n-   [Massive Visual Memory Stimuli,\n    MIT](http://cvcl.mit.edu/MM/stimuli.html) \"collection\" \"free\"\n-   [MNIST database of handwritten digits, 70,000\n    examples](http://yann.lecun.com/exdb/mnist/) \"collection\" \"free\"\n-   [Stanford Dogs\n    Dataset](http://vision.stanford.edu/aditya86/ImageNetDogs/) \"single\" \"free\"\n-   [SUN database,\n    MIT](http://groups.csail.mit.edu/vision/SUN/hierarchy.html) \"collection\" \"free\"\n-   [The Action Similarity Labeling (ASLAN)\n    Challenge](http://www.openu.ac.il/home/hassner/data/ASLAN/ASLAN.html) \"collection\" \"free\"\n-   [The Oxford-IIIT Pet\n    Dataset](http://www.robots.ox.ac.uk/~vgg/data/pets/) \"single\" \"free\"\n-   [Violent-Flows - Crowd Violence Non-violence Database and\n    benchmark](http://www.openu.ac.il/home/hassner/data/violentflows/) \"collection\" \"credentials\"\n-   [Visual genome](http://visualgenome.org/api/v0/api_home.html) \"single\" \"free\"\n-   [YouTube Faces Database](http://www.cs.tau.ac.il/~wolf/ytfaces/) \"collection\" \"credentials\"\n\nMachine Learning\n----------------\n\n-   [Context-aware data sets from five\n    domains](https://github.com/irecsys/CARSKit/tree/master/context-aware_data_sets) \"collection\" \"payment\"\n-   [Delve Datasets for classification and regression (Univ. 
of\n    Toronto)](http://www.cs.toronto.edu/~delve/data/datasets.html) \"collection\" \"free\"\n-   [Discogs Monthly Data](http://data.discogs.com/) \"collection\" \"free\"\n-   [eBay Online Auctions\n    (2012)](http://www.modelingonlineauctions.com/datasets) \"collection\" \"free\"\n-   [IMDb Database](http://www.imdb.com/interfaces) \"collection\" \"credentials\"\n-   [Keel Repository for classification, regression and time\n    series](http://sci2s.ugr.es/keel/datasets.php) \"collection\" \"free\"\n-   [Labeled Faces in the Wild (LFW)](http://vis-www.cs.umass.edu/lfw/) \"collection\" \"free\"\n-   [Lending Club Loan\n    Data](https://www.lendingclub.com/info/download-data.action) \"collection\" \"credentials\"\n-   [Machine Learning Data Set Repository](http://mldata.org/) \"collection\" \"credentials\"\n-   [Million Song Dataset](http://labrosa.ee.columbia.edu/millionsong/) \"single\" \"free\"\n-   [More Song\n    Datasets](http://labrosa.ee.columbia.edu/millionsong/pages/additional-datasets) \"collection\" \"free\"\n-   [MovieLens Data Sets](http://grouplens.org/datasets/movielens/) \"collection\" \"free\"\n-   [New Yorker caption contest\n    ratings](https://github.com/nextml/caption-contest-data) \"collection\" \"payment\"\n-   [RDataMining - \"R and Data Mining\" ebook\n    data](http://www.rdatamining.com/data) \"collection\" \"free\"\n-   [Restaurants Health Score Data in San\n    Francisco](http://missionlocal.org/san-francisco-restaurant-health-inspections/) \"single\" \"free\"\n-   [UCI Machine Learning Repository](http://archive.ics.uci.edu/ml/) \"collection\" \"free\"\n-   [Yahoo! 
Ratings and Classification\n    Data](http://webscope.sandbox.yahoo.com/catalog.php?datatype=r) \"collection\" \"credentials\"\n-   [YouTube-8M](https://research.google.com/youtube8m/download.html) \"single\" \"free\"\n\nMuseums\n-------\n\n-   [Canada Science and Technology Museums Corporation's Open\n    Data](http://techno-science.ca/en/data.php) \"collection\" \"free\"\n-   [Cooper-Hewitt's Collection\n    Database](https://github.com/cooperhewitt/collection) \"collection\" \"payment\"\n-   [Minneapolis Institute of Arts\n    metadata](https://github.com/artsmia/collection) \"collection\" \"payment\"\n-   [Natural History Museum (London) Data\n    Portal](http://data.nhm.ac.uk/) \"collection\" \"free\"\n-   [Rijksmuseum Historical Art\n    Collection](https://www.rijksmuseum.nl/en/api) \"single\" \"free\"\n-   [Tate Collection\n    metadata](https://github.com/tategallery/collection) \"collection\" \"payment\"\n-   [The Getty vocabularies](http://vocab.getty.edu) \"collection\" \"free\"\n\nNatural Language\n----------------\n\n-   [Automatic Keyphrase\n    Extraction](https://github.com/snkim/AutomaticKeyphraseExtraction/) \"collection\" \"payment\"\n-   [Blogger Corpus](http://u.cs.biu.ac.il/~koppel/BlogCorpus.htm) \"single\" \"free\"\n-   [CLiPS Stylometry Investigation\n    Corpus](http://www.clips.uantwerpen.be/datasets/csi-corpus) \"single\" \"credentials\"\n-   [ClueWeb09 FACC](http://lemurproject.org/clueweb09/FACC1/) \"single\" \"free\"\n-   [ClueWeb12 FACC](http://lemurproject.org/clueweb12/FACC1/) \"single\" \"free\"\n-   [DBpedia - 4.58M things with 583M\n    facts](http://wiki.dbpedia.org/Datasets) \"collection\" \"free\"\n-   [Flickr Personal\n    Taxonomies](http://www.isi.edu/~lerman/downloads/flickr/flickr_taxonomies.html) \"collection\" \"free\"\n-   [Freebase.com of people, places, and\n    things](http://www.freebase.com/) \"collection\" \"free\"\n-   [Google Books Ngrams\n    (2.2TB)](https://aws.amazon.com/datasets/google-books-ngrams/) 
\"single\" \"credentials\"\n-   [Google MC-AFP, generated based on the publicly available Gigaword\n    dataset using Paragraph Vectors](https://github.com/google/mcafp) \"collection\" \"payment\"\n-   [Google Web 5gram (1TB,\n    2006)](https://catalog.ldc.upenn.edu/LDC2006T13) \"collection\" \"credentials\"\n-   [Gutenberg eBooks\n    List](http://www.gutenberg.org/wiki/Gutenberg:Offline_Catalogs) \"collection\" \"free\"\n-   [Hansards text chunks of Canadian\n    Parliament](http://www.isi.edu/natural-language/download/hansard/) \"collection\" \"free\"\n-   [Machine Comprehension Test (MCTest) of text from Microsoft\n    Research](http://research.microsoft.com/en-us/um/redmond/projects/mctest/index.html) \"single\" \"free\"\n-   [Machine Translation of European\n    languages](http://statmt.org/wmt11/translation-task.html#download) \"collection\" \"free\"\n-   [Microsoft MAchine Reading COmprehension Dataset (or MS\n    MARCO)](http://www.msmarco.org/dataset.aspx) \"single\" \"free\"\n-   [Multi-Domain Sentiment Dataset (version\n    2.0)](http://www.cs.jhu.edu/~mdredze/datasets/sentiment/) \"single\" \"free\"\n-   [Open Multilingual Wordnet](http://compling.hss.ntu.edu.sg/omw/) \"collection\" \"free\"\n-   [Personae\n    Corpus](http://www.clips.uantwerpen.be/datasets/personae-corpus) \"single\" \"credentials\"\n-   [SaudiNewsNet Collection of Saudi Newspaper Articles (Arabic, 30K\n    articles)](https://github.com/ParallelMazen/SaudiNewsNet) \"collection\" \"payment\"\n-   [SMS Spam Collection in\n    English](http://www.dt.fee.unicamp.br/~tiago/smsspamcollection/) \"single\" \"free\"\n-   [Universal Dependencies](http://universaldependencies.org) \"collection\" \"free\"\n-   [USENET postings corpus of\n    2005\\~2011](http://www.psych.ualberta.ca/~westburylab/downloads/usenetcorpus.download.html) \"single\" \"credentials\"\n-   [Webhose - News/Blogs in multiple\n    languages](https://webhose.io/datasets) \"collection\" \"credentials\"\n-   [Wikidata - 
Wikipedia\n    databases](https://www.wikidata.org/wiki/Wikidata:Database_download) \"collection\" \"free\"\n-   [Wikipedia Links data - 40 Million Entities in\n    Context](https://code.google.com/p/wiki-links/downloads/list) \"collection\" \"free\"\n-   [WordNet databases and\n    tools](http://wordnet.princeton.edu/wordnet/download/) \"collection\" \"free\"\n\nNeuroscience\n------------\n\n-   [Allen Institute Datasets](http://www.brain-map.org/) \"collection\" \"free\"\n-   [Brain Catalogue](http://braincatalogue.org/) \"collection\" \"credentials\"\n-   [Brainomics](http://brainomics.cea.fr/localizer) \"single\" \"credentials\"\n-   [Collaborative Research in Computational Neuroscience\n    (CRCNS)](http://crcns.org/data-sets) \"collection\" \"free\"\n-   [FCP-INDI](http://fcon_1000.projects.nitrc.org/index.html) \"collection\" \"credentials\"\n-   [Human Connectome Project](http://www.humanconnectome.org/data/) \"collection\" \"free\"\n-   [NDAR](https://ndar.nih.gov/) \"collection\" \"credentials\"\n-   [NeuroData](http://neurodata.io) \"collection\" \"free\"\n-   [Neuroelectro](http://neuroelectro.org/) \"collection\" \"free\"\n-   [NIMH Data Archive](http://data-archive.nimh.nih.gov/) \"collection\" \"free\"\n-   [OASIS](http://www.oasis-brains.org/) \"collection\" \"free\"\n-   [OpenfMRI](https://openfmri.org/) \"collection\" \"credentials\"\n-   [Study Forrest](http://studyforrest.org) \"collection\" \"free\"\n\nPhysics\n-------\n\n-   [CERN Open Data Portal](http://opendata.cern.ch/) \"collection\" \"free\"\n-   [Crystallography Open Database](http://www.crystallography.net/) \"collection\" \"free\"\n-   [NASA Exoplanet Archive](http://exoplanetarchive.ipac.caltech.edu/) \"collection\" \"credentials\"\n-   [NSSDC (NASA) data of 550\n    spacecraft](http://nssdc.gsfc.nasa.gov/nssdc/obtaining_data.html) \"collection\" \"free\"\n-   [Sloan Digital Sky Survey (SDSS) - Mapping the\n    Universe](http://www.sdss.org/) \"collection\" \"free\"\n\n\nPublic 
Domains\n--------------\n\n-   [Amazon](http://aws.amazon.com/datasets/) \"collection\" \"credentials\"\n-   [Archive-it from Internet\n    Archive](https://www.archive-it.org/explore?show=Collections) \"collection\" \"credentials\"\n-   [Archive.org Datasets](https://archive.org/details/datasets) \"collection\" \"credentials\"\n-   [CMU JASA data archive](http://lib.stat.cmu.edu/jasadata/) \"collection\" \"free\"\n-   [CMU StatLab collections](http://lib.stat.cmu.edu/datasets/) \"collection\" \"free\"\n-   [Data.World](https://data.world) \"collection\" \"credentials\"\n-   [Data360](http://www.data360.org/index.aspx) \"collection\" \"free\"\n-   [Google](http://www.google.com/publicdata/directory) \"collection\" \"credentials\"\n-   [Infochimps](http://www.infochimps.com/) \"collection\" \"free\"\n-   [KDNuggets Data\n    Collections](http://www.kdnuggets.com/datasets/index.html) \"collection\" \"free\"\n-   [Microsoft Azure Data Market Free\n    DataSets](http://datamarket.azure.com/browse/data?price=free) \"collection\" \"credentials\"\n-   [Microsoft Data Science for Research](http://aka.ms/Data-Science) \"collection\" \"free\"\n-   [Open Library Data Dumps](https://openlibrary.org/developers/dumps) \"collection\" \"credentials\"\n-   [Reddit Datasets](https://www.reddit.com/r/datasets) \"collection\" \"credentials\"\n-   [RevolutionAnalytics\n    Collection](http://packages.revolutionanalytics.com/datasets/) \"collection\" \"free\"\n-   [Sample R data\n    sets](http://stat.ethz.ch/R-manual/R-patched/library/datasets/html/00Index.html) \"collection\" \"free\"\n-   [StatSci.org](http://www.statsci.org/datasets.html) \"collection\" \"free\"\n-   [The Washington Post\n    List](http://www.washingtonpost.com/wp-srv/metro/data/datapost.html) \"collection\" \"credentials\"\n-   [UCLA SOCR data\n    collection](http://wiki.stat.ucla.edu/socr/index.php/SOCR_Data) \"collection\" \"credentials\"\n-   [UFO Reports](http://www.nuforc.org/webreports.html) \"collection\" 
\"free\"\n-   [Wikileaks 911 pager\n    intercepts](https://911.wikileaks.org/files/index.html) \"collection\" \"free\"\n-   [Yahoo Webscope](http://webscope.sandbox.yahoo.com/catalog.php) \"collection\" \"free\"\n\nSearch Engines\n--------------\n\n-   [Academic Torrents of data sharing from\n    UMB](http://academictorrents.com/) \"collection\" \"credentials\"\n-   [Datahub.io](https://datahub.io/dataset) \"collection\" \"credentials\"\n-   [DataMarket (Qlik)](https://datamarket.com/data/list/?q=all) \"collection\" \"credentials\"\n-   [Harvard Dataverse Network of scientific\n    data](https://dataverse.harvard.edu/) \"collection\" \"credentials\"\n-   [ICPSR (UMICH)](http://www.icpsr.umich.edu/icpsrweb/ICPSR/index.jsp) \"collection\" \"credentials\"\n-   [Institute of Education Sciences](http://eric.ed.gov) \"collection\" \"free\"\n-   [National Technical Reports\n    Library](http://www.ntis.gov/products/ntrl/) \"collection\" \"free\"\n-   [Open Data Certificates\n    (beta)](https://certificates.theodi.org/en/datasets) \"collection\" \"credentials\"\n-   [OpenDataNetwork - A search engine of all Socrata powered data\n    portals](http://www.opendatanetwork.com/) \"collection\" \"free\"\n-   [Statista.com - statistics and Studies](http://www.statista.com/) \"collection\" \"credentials\"\n-   [Zenodo - An open dependable home for the long-tail of\n    science](https://zenodo.org/collection/datasets) \"collection\" \"credentials\"\n\nSocial Networks\n---------------\n\n-   [72 hours \\#gamergate Twitter\n    Scrape](http://waxy.org/random/misc/gamergate_tweets.csv) \"collection\" \"credentials\"\n-   [Ancestry.com Forum Dataset over 10\n    years](http://www.cs.cmu.edu/~jelsas/data/ancestry.com/) \"single\" \"free\"\n-   [Cheng-Caverlee-Lee September 2009 - January 2010 Twitter\n    Scrape](https://archive.org/details/twitter_cikm_2010) \"single\" \"free\"\n-   [CMU Enron Email of 150 users](http://www.cs.cmu.edu/~enron/) \"single\" \"free\"\n-   [EDRM Enron 
EMail of 151 users, hosted on\n    S3](https://aws.amazon.com/datasets/enron-email-data/) \"single\" \"credentials\"\n-   [Facebook Data Scrape\n    (2005)](https://archive.org/details/oxford-2005-facebook-matrix) \"single\" \"credentials\"\n-   [Facebook Social Networks from LAW (since\n    2007)](http://law.di.unimi.it/datasets.php) \"collection\" \"free\"\n-   [Foursquare from UMN/Sarwat\n    (2013)](https://archive.org/details/201309_foursquare_dataset_umn) \"single\" \"credentials\"\n-   [GitHub Collaboration Archive](https://www.githubarchive.org/) \"collection\" \"free\"\n-   [Google Scholar citation\n    relations](http://www3.cs.stonybrook.edu/~leman/data/gscholar.db) \"single\" \"free\"\n-   [High-Resolution Contact Networks from Wearable\n    Sensors](http://www.sociopatterns.org/datasets/) \"collection\" \"free\"\n-   [Mobile Social Networks from\n    UMASS](https://kdl.cs.umass.edu/display/public/Mobile+Social+Networks) \"single\" \"credentials\"\n-   [Network Twitter\n    Data](http://snap.stanford.edu/data/higgs-twitter.html) \"single\" \"free\"\n-   [Reddit\n    Comments](https://www.reddit.com/r/datasets/comments/3bxlg7/i_have_every_publicly_available_reddit_comment/) \"single\" \"credentials\"\n-   [Skytrax' Air Travel Reviews\n    Dataset](https://github.com/quankiquanki/skytrax-reviews-dataset) \"single\" \"payment\"\n-   [Social Twitter\n    Data](http://snap.stanford.edu/data/egonets-Twitter.html) \"single\" \"free\"\n-   [SourceForge.net Research\n    Data](http://www3.nd.edu/~oss/Data/data.html) \"collection\" \"free\"\n-   [Twitter Data for Online Reputation\n    Management](http://nlp.uned.es/replab2013/) \"single\" \"free\"\n-   [Twitter Data for Sentiment\n    Analysis](http://help.sentiment140.com/for-students/) \"collection\" \"free\"\n-   [Twitter Graph of entire Twitter\n    site](http://an.kaist.ac.kr/traces/WWW2010.html) \"single\" \"free\"\n-   [UNIMI/LAW Social Network\n    Datasets](http://law.di.unimi.it/datasets.php) 
\"collection\" \"free\"\n-   [Yahoo! Graph and Social\n    Data](http://webscope.sandbox.yahoo.com/catalog.php?datatype=g) \"collection\" \"credentials\"\n-   [Youtube Video Social Graph in\n    2007,2008](http://netsg.cs.sfu.ca/youtubedata/) \"single\" \"free\"\n\nSocial Sciences\n---------------\n\n-   [ACLED (Armed Conflict Location \u0026 Event Data\n    Project)](http://www.acleddata.com/) \"collection\" \"free\"\n-   [Canadian Legal Information\n    Institute](https://www.canlii.org/en/index.php) \"collection\" \"free\"\n-   [Center for Systemic Peace Datasets - Conflict Trends, Polities,\n    State Fragility, etc](http://www.systemicpeace.org/) \"collection\" \"free\"\n-   [Correlates of War Project](http://www.correlatesofwar.org/) \"collection\" \"credentials\"\n-   [Cryptome Conspiracy Theory Items](http://cryptome.org) \"collection\" \"payment\"\n-   [Datacards](http://datacards.org) \"collection\" \"credentials\"\n-   [European Social Survey](http://www.europeansocialsurvey.org/data/) \"collection\" \"credentials\"\n-   [FBI Hate Crime 2013 - aggregated\n    data](https://github.com/emorisse/FBI-Hate-Crime-Statistics/tree/master/2013) \"collection\" \"payment\"\n-   [Fragile States Index](http://fsi.fundforpeace.org/data) \"collection\" \"payment\"\n-   [GDELT Global Events Database](http://gdeltproject.org/data.html) \"collection\" \"free\"\n-   [General Social Survey (GSS) since 1972](http://gss.norc.org) \"collection\" \"free\"\n-   [German Social Survey](http://www.gesis.org/en/home/) \"collection\" \"free\"\n-   [Global Religious Futures\n    Project](http://www.globalreligiousfutures.org/) \"collection\" \"free\"\n-   [Humanitarian Data Exchange](https://data.hdx.rwlabs.org/) \"collection\" \"credentials\"\n-   [INFORM Index for Risk\n    Management](http://www.inform-index.org/Results/Global) \"collection\" \"credentials\"\n-   [Institute for Demographic Studies](http://www.ined.fr/en/) \"collection\" \"free\"\n-   [International Networks 
Archive](http://www.princeton.edu/~ina/) \"collection\" \"free\"\n-   [International Social Survey Program ISSP](http://www.issp.org) \"collection\" \"free\"\n-   [International Studies Compendium\n    Project](http://www.isacompendium.com/public/) \"collection\" \"credentials\"\n-   [James McGuire Cross National\n    Data](http://jmcguire.faculty.wesleyan.edu/welcome/cross-national-data/) \"collection\" \"free\"\n-   [MacroData Guide by Norsk samfunnsvitenskapelig\n    datatjeneste](http://nsd.uib.no) \"collection\" \"free\"\n-   [Minnesota Population Center](https://www.ipums.org/) \"collection\" \"payment\"\n-   [MIT Reality Mining\n    Dataset](http://realitycommons.media.mit.edu/realitymining.html) \"single\" \"free\"\n-   [Notre Dame Global Adaptation Index\n    (ND-GAIN)](http://index.gain.org/about/download) \"collection\" \"free\"\n-   [Open Crime and Policing Data in England, Wales and Northern\n    Ireland](https://data.police.uk/data/) \"collection\" \"free\"\n-   [Paul Hensel General International Data\n    Page](http://www.paulhensel.org/dataintl.html) \"collection\" \"free\"\n-   [PewResearch Internet Survey\n    Project](http://www.pewinternet.org/datasets/) \"collection\" \"free\"\n-   [PewResearch Society Data\n    Collection](http://www.pewresearch.org/data/download-datasets/) \"collection\" \"free\"\n-   [Political Polarity\n    Data](http://www3.cs.stonybrook.edu/~leman/data/14-icwsm-political-polarity-data.zip) \"single\" \"free\"\n-   [StackExchange Data Explorer](http://data.stackexchange.com/help) \"collection\" \"credentials\"\n-   [Terrorism Research and Analysis\n    Consortium](http://www.trackingterrorism.org/) \"collection\" \"credentials\"\n-   [Texas Inmates Executed Since\n    1984](http://www.tdcj.state.tx.us/death_row/dr_executed_offenders.html) \"single\" \"free\"\n-   [Titanic Survival Data\n    Set](https://github.com/caesar0301/awesome-public-datasets/tree/master/Datasets) or [on Kaggle](https://www.kaggle.com/c/titanic/data) 
\"single\" \"payment\"\n-   [UCB's Archive of Social Science Data\n    (D-Lab)](http://ucdata.berkeley.edu/) \"collection\" \"free\"\n-   [UCLA Social Sciences Data\n    Archive](http://dataarchives.ss.ucla.edu/Home.DataPortals.htm) \"collection\" \"free\"\n-   [UN Civil Society Database](http://esango.un.org/civilsociety/) \"collection\" \"free\"\n-   [Universities Worldwide](http://univ.cc/) \"collection\" \"free\"\n-   [UPJOHN for Labor Employment\n    Research](http://www.upjohn.org/services/resources/employment-research-data-center) \"collection\" \"free\"\n-   [Uppsala Conflict Data Program](http://ucdp.uu.se/) \"collection\" \"free\"\n-   [World Bank Open Data](http://data.worldbank.org/) \"collection\" \"free\"\n-   [WorldPop project - Worldwide human population\n    distributions](http://www.worldpop.org.uk/data/get_data/) \"collection\" \"free\"\n\nSoftware\n--------\n\n-   [FLOSSmole data about free, libre, and open source software\n    development](http://flossdata.syr.edu/data/) \"collection\" \"free\"\n\nSports\n------\n\n-   [Basketball (NBA/NCAA/Euro) Player Database and\n    Statistics](http://www.draftexpress.com/stats.php) \"collection\" \"credentials\"\n-   [Betfair Historical Exchange Data](http://data.betfair.com/) \"collection\" \"credentials\"\n-   [Cricsheet Matches (cricket)](http://cricsheet.org/) \"collection\" \"free\"\n-   [Ergast Formula 1, from 1950 up to date\n    (API)](http://ergast.com/mrd/db) \"collection\" \"credentials\"\n-   [Football/Soccer resources (data and\n    APIs)](http://www.jokecamp.com/blog/guide-to-football-and-soccer-data-and-apis/) \"collection\" \"free\"\n-   [Lahman's Baseball\n    Database](http://www.seanlahman.com/baseball-archive/statistics/) \"collection\" \"free\"\n-   [Pinhooker: Thoroughbred Bloodstock Sale\n    Data](https://github.com/phillc73/pinhooker) \"collection\" \"payment\"\n-   [Retrosheet Baseball Statistics](http://www.retrosheet.org/game.htm) \"collection\" \"free\"\n-   [Tennis database of 
rankings, results, and stats for\n    ATP](https://github.com/JeffSackmann/tennis_atp),\n    [WTA](https://github.com/JeffSackmann/tennis_wta), [Grand\n    Slams](https://github.com/JeffSackmann/tennis_slam_pointbypoint) and\n    [Match Charting\n    Project](https://github.com/JeffSackmann/tennis_MatchChartingProject) \"collection\" \"payment\"\n\nTime Series\n-----------\n\n-   [Databanks International Cross National Time Series Data\n    Archive](http://www.cntsdata.com) \"collection\" \"payment\"\n-   [Hard Drive Failure\n    Rates](https://www.backblaze.com/hard-drive-test-data.html) \"collection\" \"credentials\"\n-   [Heart Rate Time Series from MIT](http://ecg.mit.edu/time-series/) \"collection\" \"free\"\n-   [Time Series Data Library (TSDL) from\n    MU](https://datamarket.com/data/list/?q=provider:tsdl) \"collection\" \"credentials\"\n-   [UC Riverside Time Series\n    Dataset](http://www.cs.ucr.edu/~eamonn/time_series_data/) \"collection\" \"free\"\n\nTransportation\n--------------\n\n-   [Airlines OD Data\n    1987-2008](http://stat-computing.org/dataexpo/2009/the-data.html) \"collection\" \"credentials\"\n-   [Bay Area Bike Share\n    Data](http://www.bayareabikeshare.com/open-data) \"collection\" \"credentials\"\n-   [Bike Share Systems (BSS)\n    collection](https://github.com/BetaNYC/Bike-Share-Data-Best-Practices/wiki/Bike-Share-Data-Systems) \"collection\" \"payment\"\n-   [GeoLife GPS Trajectory from Microsoft\n    Research](http://research.microsoft.com/en-us/downloads/b16d359d-d164-469e-9fd4-daa38f2b2e13/) \"collection\" \"credentials\"\n-   [German train system by Deutsche\n    Bahn](http://data.deutschebahn.com/datasets/) \"collection\" \"free\"\n-   [Hubway Million Rides in\n    MA](http://hubwaydatachallenge.org/trip-history-data/) \"single\" \"free\"\n-   [Marine Traffic - ship tracks, port calls and\n    more](http://www.marinetraffic.com/de/ais-api-services) \"collection\" \"credentials\"\n-   [Montreal BIXI Bike 
Share](https://montreal.bixi.com/en/open-data) \"single\" \"free\"\n-   [NYC Taxi Trip Data\n    2009-](http://www.nyc.gov/html/tlc/html/about/trip_record_data.shtml) \"collection\" \"free\"\n-   [NYC Taxi Trip Data 2013\n    (FOIA/FOILed)](https://archive.org/details/nycTaxiTripData2013) \"single\" \"credentials\"\n-   [NYC Uber trip data April 2014 to September\n    2014](https://github.com/fivethirtyeight/uber-tlc-foil-response) \"collection\" \"payment\"\n-   [Open Traffic\n    collection](https://github.com/graphhopper/open-traffic-collection) \"collection\" \"credentials\"\n-   [OpenFlights - airport, airline and route\n    data](http://openflights.org/data.html) \"collection\" \"free\"\n-   [Philadelphia Bike Share Stations\n    (JSON)](https://www.rideindego.com/stations/json/) \"single\" \"free\"\n-   [Plane Crash Database, since\n    1920](http://www.planecrashinfo.com/database.htm) \"collection\" \"free\"\n-   [RITA Airline On-Time Performance\n    data](http://www.transtats.bts.gov/Tables.asp?DB_ID=120) \"collection\" \"free\"\n-   [RITA/BTS transport data collection\n    (TranStat)](http://www.transtats.bts.gov/DataIndex.asp) \"collection\" \"free\"\n-   [Transport for London\n    (TFL)](https://tfl.gov.uk/info-for/open-data-users/our-open-data) \"collection\" \"free\"\n-   [Travel Tracker Survey (TTS) for\n    Chicago](http://www.cmap.illinois.gov/data/transportation/travel-tracker-survey) \"collection\" \"free\"\n-   [U.S. Bureau of Transportation Statistics\n    (BTS)](http://www.rita.dot.gov/bts/) \"collection\" \"free\"\n-   [U.S. Domestic Flights 1990 to\n    2009](http://academictorrents.com/details/a2ccf94bbb4af222bf8e69dad60a68a29f310d9a) \"single\" \"payment\"\n-   [U.S. 
Freight Analysis Framework since\n    2007](http://ops.fhwa.dot.gov/freight/freight_analysis/faf/index.htm) \"collection\" \"free\"\n\nComplementary Collections\n-------------------------\n\n-   [Data Packaged Core Datasets](https://github.com/datasets/) \"collection\" \"payment\"\n-   [Database of Scientific Code\n    Contributions](https://mozillascience.org/collaborate) \"collection\" \"free\"\n-   A growing collection of public datasets:\n    [CoolDatasets.](http://cooldatasets.com/) \"collection\" \"free\"\n-   Inside-r: [Finding Data on the\n    Internet](http://www.inside-r.org/howto/finding-data-internet) \"collection\" \"free\"\n-   OpenDataMonitor: [An overview of available open data resources in\n    Europe](http://opendatamonitor.eu) \"collection\" \"free\"\n-   Quora: [Where can I find large datasets open to the\n    public?](http://www.quora.com/Where-can-I-find-large-datasets-open-to-the-public) \"collection\" \"credentials\"\n-   RS.io: [100+ Interesting Data Sets for\n    Statistics](http://rs.io/100-interesting-data-sets-for-statistics/) \"collection\" \"free\"\n-   StaTrek: [Leveraging open data to understand urban\n    lives](http://xiaming.me/posts/2014/10/23/leveraging-open-data-to-understand-urban-lives/) \"collection\" \"free\"\n\n\n\n\n\n\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fasampat3090%2Fopen-datasets","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fasampat3090%2Fopen-datasets","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fasampat3090%2Fopen-datasets/lists"}