{"id":21396177,"url":"https://github.com/secdr/research-database","last_synced_at":"2026-02-12T21:03:08.829Z","repository":{"id":92450088,"uuid":"43729169","full_name":"secdr/research-database","owner":"secdr","description":"public database for research","archived":false,"fork":false,"pushed_at":"2017-04-22T16:07:12.000Z","size":6,"stargazers_count":17,"open_issues_count":0,"forks_count":9,"subscribers_count":4,"default_branch":"master","last_synced_at":"2025-09-22T09:08:16.266Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/secdr.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2015-10-06T04:07:34.000Z","updated_at":"2023-07-17T08:44:17.000Z","dependencies_parsed_at":"2023-06-01T03:30:39.200Z","dependency_job_id":null,"html_url":"https://github.com/secdr/research-database","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/secdr/research-database","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/secdr%2Fresearch-database","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/secdr%2Fresearch-database/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/secdr%2Fresearch-database/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/secdr%2Fresearch-database/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/secdr","download_url":"https://codeload.github.com/secdr/
research-database/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/secdr%2Fresearch-database/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29381043,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-12T20:34:40.886Z","status":"ssl_error","status_checked_at":"2026-02-12T20:23:00.490Z","response_time":55,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-22T14:25:25.623Z","updated_at":"2026-02-12T21:03:08.814Z","avatar_url":"https://github.com/secdr.png","language":null,"funding_links":[],"categories":[],"sub_categories":[],"readme":"# research-database\nA collection of public databases for research. 
If you have any links, please contact me or push them to the repository.\n\n\n### Phishing\n+ [PhishTank](https://www.phishtank.com/developer_info.php);\n+ [OpenPhish](https://www.openphish.com/);\n+ [315online](http://www.315online.com.cn/list.php?catid=33);\n+ [China Mobile spam SMS](http://www.wid.org.cn/project/2015ccf/comp_detail.php?cid=227);\n+ [360 list of recent malicious websites](http://webscan.360.cn/url)\n\n### Social data\n+ [Reddit Comments Corpus](https://archive.org/details/2015_reddit_comments_corpus);\n+ [Full Reddit Submission Corpus](https://www.reddit.com/r/datasets/comments/3mg812/full_reddit_submission_corpus_now_available_2006/);\n+ [City Record Online](https://nycopendata.socrata.com/);\n+ [TLC Trip Record Data](http://www.nyc.gov/html/tlc/html/about/trip_record_data.shtml);\n+ [Frequency Word Lists](https://invokeit.wordpress.com/frequency-word-lists/);\n+ [Amazon product data](http://jmcauley.ucsd.edu/data/amazon/);\n+ [Wikimedia database](https://dumps.wikimedia.org/);\n+ [Airbnb database](http://insideairbnb.com/get-the-data.html);\n\n### Network data\n+ [KDD Cup 1999 Data](http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html);\n\n### Security Data\n+ [Driving in the Cloud Dataset](http://malicia-project.com/dataset.html);\n+ [Nothink Malware samples](http://www.nothink.org/honeypots/malware-archives/)\n+ [SecRepo.com - Samples of Security Related Data](http://www.secrepo.com/) ****\n+ [lanl.gov Open Data Sets](http://csr.lanl.gov/data/);\n+ [Crime data from the St. 
Louis Metropolitan Police Departments](https://github.com/kylesykes/stl-crime-data);\n+ [Chronology of Data Breaches Security Breaches 2005 - Present](https://www.privacyrights.org/data-breach);\n+ [Malware Sample Sources for Researchers](https://zeltser.com/malware-sample-sources/);\n+ [Microsoft Malware Classification Challenge (BIG 2015)](https://www.kaggle.com/c/malware-classification/forums);\n+ [Android Malware-The Drebin Dataset](http://user.informatik.uni-goettingen.de/~darp/drebin/);\n\n\n### Others\n+ [Beijing data](http://www.beijingcitylab.com/data-released-1/)\n\n### [Stanford Large Network Dataset Collection](http://snap.stanford.edu/data)\n+ Social networks : online social networks, edges represent interactions between people\n+ Networks with ground-truth communities : ground-truth network communities in social and information networks\n+ Communication networks : email communication networks with edges representing communication\n+ Citation networks : nodes represent papers, edges represent citations\n+ Collaboration networks : nodes represent scientists, edges represent collaborations (co-authoring a paper)\n+ Web graphs : nodes represent webpages and edges are hyperlinks\n+ Amazon networks : nodes represent products and edges link commonly co-purchased products\n+ Internet networks : nodes represent computers and edges represent communication\n+ Road networks : nodes represent intersections and edges represent roads connecting the intersections\n+ Autonomous systems : graphs of the internet\n+ Signed networks : networks with positive and negative edges (friend/foe, trust/distrust)\n+ Location-based online social networks : social networks with geographic check-ins\n+ Wikipedia networks and metadata : Talk, editing and voting data from Wikipedia\n+ Twitter and Memetracker : Memetracker phrases, links and 467 million Tweets\n+ Online communities : Data from online communities such as Reddit and Flickr\n+ Online reviews : Data from online review systems such as 
BeerAdvocate and Amazon\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsecdr%2Fresearch-database","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsecdr%2Fresearch-database","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsecdr%2Fresearch-database/lists"}