{"id":13416066,"url":"https://github.com/datasciencemasters/data","last_synced_at":"2026-01-27T11:02:06.104Z","repository":{"id":19772813,"uuid":"23031054","full_name":"datasciencemasters/data","owner":"datasciencemasters","description":"Open Data Sources","archived":false,"fork":false,"pushed_at":"2018-05-08T15:39:42.000Z","size":10,"stargazers_count":502,"open_issues_count":3,"forks_count":189,"subscribers_count":69,"default_branch":"master","last_synced_at":"2024-07-31T21:55:13.771Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/datasciencemasters.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2014-08-17T01:52:26.000Z","updated_at":"2024-07-23T16:46:53.000Z","dependencies_parsed_at":"2022-08-21T00:20:06.370Z","dependency_job_id":null,"html_url":"https://github.com/datasciencemasters/data","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/datasciencemasters/data","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/datasciencemasters%2Fdata","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/datasciencemasters%2Fdata/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/datasciencemasters%2Fdata/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/datasciencemasters%2Fdata/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/datasciencemasters","download_url":"https://codeload.github.com/datasciencemasters/data/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/datasciencemasters%2Fdata/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28812367,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-27T07:41:26.337Z","status":"ssl_error","status_checked_at":"2026-01-27T07:41:08.776Z","response_time":168,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-07-30T21:00:54.008Z","updated_at":"2026-01-27T11:02:06.082Z","avatar_url":"https://github.com/datasciencemasters.png","language":null,"readme":"## Open Data Sources\n\n* _**Availability and access**: the data must be available as a whole and at no more than a reasonable reproduction cost, preferably by downloading over the internet. The data must also be available in a convenient and modifiable form._\n* _**Reuse and redistribution**: the data must be provided under terms that permit reuse and redistribution including the intermixing with other datasets. The data must be machine-readable._\n* _**Universal participation**: everyone must be able to use, reuse and redistribute — there should be no discrimination against fields of endeavour or against persons or groups. For example, ‘non-commercial’ restrictions that would prevent ‘commercial’ use, or restrictions of use for certain purposes (e.g. only in education), are not allowed._\n\n-- _Definition by the [Open Knowledge Foundation](https://okfn.org/opendata/)_\n\n### Lists of Data Sets\n* [Interesting Data Sets for Statisticians](http://rs.io/100-interesting-data-sets-for-statistics/) - editorialized, entertaining set of open data\n\n### Open Data\n\n* [List of Public Datasets](https://github.com/caesar0301/awesome-public-datasets) - user-curated\n* [DBpedia](http://wiki.dbpedia.org/Datasets) - utilizing a large multi-domain ontology\n* [Public Data Sets on AWS](https://aws.amazon.com/datasets?_encoding=UTF8\u0026jiveRedirect=1) - common web crawl corpus, NASA satellite imagery, Human Genome, Google Book NGrams, Wikipedia Traffic, Million Song Dataset, Federal Reserve Economic Data, PubChem, more.\n\n### Private Opened Data\n* [New York Times](http://data.nytimes.com/) - vocabulary as linked open data; linked vocabulary of people, places, companies, etc.\n\n### Governmental Data\n\n[Compendium of Governmental Open Data Sources](http://datacatalogs.org/)\n\n* [Data.gov (USA)](http://www.data.gov/)\n* [Africa Open Data](http://africaopendata.org/dataset)\n* [US Census](http://www.census.gov/data/developers/data-sets.html) - Population Estimates and Projections, Nonemployer Statistics and County Business Patterns, Economic Indicators Time Series, more.\n\n### Non-Governmental Org Data\n\n* [The World Bank](http://data.worldbank.org/topic/private-sector) - business regulation measures, company-level data in emerging markets, household consumption patterns, World Development Indicators, World Bank finances\n* ^[Pew Research Center's Internet Project](http://www.pewinternet.org/datasets/pages/3/)\n\n### Academic Data\n\n[Inter-university Consortium for Political and Social Research Data Portal](http://www.icpsr.umich.edu/icpsrweb/ICPSR/access/subject.jsp)\n\n* [Surveys of Economic Attitudes and Behavior](http://www.icpsr.umich.edu/icpsrweb/ICPSR/studies?classification=ICPSR.IV.B.)\n* [Continuing Series of Consumer Surveys](http://www.icpsr.umich.edu/icpsrweb/ICPSR/studies?classification=ICPSR.IV.A.)\n* [Historical and Contemporary Economic Processes and Indicators](http://www.icpsr.umich.edu/icpsrweb/ICPSR/studies?classification=ICPSR.IV.C.)\n\n### Truly Random Data\n\n* [200,000+ Jeopardy! Questions in a JSON file](http://www.reddit.com/r/datasets/comments/1uyd0t/200000_jeopardy_questions_in_a_json_file/)\n* [10,000 annotated images of cats](http://137.189.35.203/WebUI/CatDatabase/catData.html)\n\n## Open Data Resources\n\n* reddit [r/datasets](http://www.reddit.com/r/datasets/)\n* [Open Data - Stack Exchange](http://opendata.stackexchange.com/) (discussion)\n\n^ _license is not truly open, involves some limitations_\n","funding_links":[],"categories":["Others","Fun","📚 Skill Development \u0026 Career","Data Sets"],"sub_categories":["Datasets","Data Sources \u0026 Datasets"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdatasciencemasters%2Fdata","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdatasciencemasters%2Fdata","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdatasciencemasters%2Fdata/lists"}