https://github.com/secdr/research-database
public database for research
- Host: GitHub
- URL: https://github.com/secdr/research-database
- Owner: secdr
- Created: 2015-10-06T04:07:34.000Z (over 10 years ago)
- Default Branch: master
- Last Pushed: 2017-04-22T16:07:12.000Z (about 9 years ago)
- Last Synced: 2025-09-22T09:08:16.266Z (9 months ago)
- Size: 5.86 KB
- Stars: 17
- Watchers: 4
- Forks: 9
- Open Issues: 0
Metadata Files:
- Readme: README.md
# research-database
A collection of public databases for research. If you have any links to add, please contact me or push them to the repository.
### Phishing
+ [PhishTank](https://www.phishtank.com/developer_info.php);
+ [OpenPhish](https://www.openphish.com/);
+ [315online](http://www.315online.com.cn/list.php?catid=33);
+ [China Mobile spam SMS](http://www.wid.org.cn/project/2015ccf/comp_detail.php?cid=227);
+ [360 recent malicious website list](http://webscan.360.cn/url)
### Social data
+ [Reddit Comments Corpus](https://archive.org/details/2015_reddit_comments_corpus);
+ [Full Reddit Submission Corpus](https://www.reddit.com/r/datasets/comments/3mg812/full_reddit_submission_corpus_now_available_2006/);
+ [City Record Online](https://nycopendata.socrata.com/);
+ [TLC Trip Record Data](http://www.nyc.gov/html/tlc/html/about/trip_record_data.shtml);
+ [Frequency Word Lists](https://invokeit.wordpress.com/frequency-word-lists/);
+ [Amazon product data](http://jmcauley.ucsd.edu/data/amazon/);
+ [Wikimedia database](https://dumps.wikimedia.org/);
+ [Airbnb database](http://insideairbnb.com/get-the-data.html);
### Network data
+ [KDD Cup 1999 Data](http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html);
### Security Data
+ [Driving in the Cloud Dataset](http://malicia-project.com/dataset.html);
+ [Nothink Malware samples](http://www.nothink.org/honeypots/malware-archives/)
+ [SecRepo.com - Samples of Security Related Data](http://www.secrepo.com/)
+ [lanl.gov Open Data Sets](http://csr.lanl.gov/data/);
+ [Crime data from the St. Louis Metropolitan Police Departments](https://github.com/kylesykes/stl-crime-data);
+ [Chronology of Data Breaches Security Breaches 2005 - Present](https://www.privacyrights.org/data-breach);
+ [Malware Sample Sources for Researchers](https://zeltser.com/malware-sample-sources/);
+ [Microsoft Malware Classification Challenge (BIG 2015)](https://www.kaggle.com/c/malware-classification/forums);
+ [Android Malware-The Drebin Dataset](http://user.informatik.uni-goettingen.de/~darp/drebin/);
### Others
+ [Beijing data](http://www.beijingcitylab.com/data-released-1/)
### [Stanford Large Network Dataset Collection](http://snap.stanford.edu/data)
+ Social networks : online social networks, edges represent interactions between people
+ Networks with ground-truth communities : ground-truth network communities in social and information networks
+ Communication networks : email communication networks with edges representing communication
+ Citation networks : nodes represent papers, edges represent citations
+ Collaboration networks : nodes represent scientists, edges represent collaborations (co-authoring a paper)
+ Web graphs : nodes represent webpages and edges are hyperlinks
+ Amazon networks : nodes represent products and edges link commonly co-purchased products
+ Internet networks : nodes represent computers and edges communication
+ Road networks : nodes represent intersections and edges roads connecting the intersections
+ Autonomous systems : graphs of the internet
+ Signed networks : networks with positive and negative edges (friend/foe, trust/distrust)
+ Location-based online social networks : social networks with geographic check-ins
+ Wikipedia networks and metadata : Talk, editing and voting data from Wikipedia
+ Twitter and Memetracker : Memetracker phrases, links and 467 million Tweets
+ Online communities : Data from online communities such as Reddit and Flickr
+ Online reviews : Data from online review systems such as BeerAdvocate and Amazon
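
Most of the categories above describe graphs as nodes joined by edges. As a minimal sketch, assuming a dataset is distributed as a plain-text edge list (one whitespace-separated `source target` pair per line, with `#` marking comment lines), it could be loaded into an adjacency map like this. The function name `parse_edge_list` and the sample lines are hypothetical and are not taken from any specific dataset; check the format notes on each dataset's page before relying on this layout.

```python
from typing import Dict, Iterable, Set


def parse_edge_list(lines: Iterable[str], directed: bool = True) -> Dict[int, Set[int]]:
    """Build an adjacency map from whitespace-separated 'source target' lines."""
    adjacency: Dict[int, Set[int]] = {}
    for line in lines:
        line = line.strip()
        # Skip blank lines and '#' comment/header lines (assumed convention).
        if not line or line.startswith("#"):
            continue
        source, target = (int(field) for field in line.split()[:2])
        adjacency.setdefault(source, set()).add(target)
        # Register the target as a node even if it has no outgoing edges.
        adjacency.setdefault(target, set())
        if not directed:
            adjacency[target].add(source)
    return adjacency


# Hypothetical sample lines, not real dataset content.
sample = [
    "# FromNodeId\tToNodeId",
    "0\t1",
    "0\t2",
    "1\t2",
]
graph = parse_edge_list(sample)
num_nodes = len(graph)
num_edges = sum(len(neighbors) for neighbors in graph.values())
print(num_nodes, num_edges)  # prints "3 3"
```

For a compressed download, the same function can read lines from `gzip.open(path, "rt")`, where `path` is whatever local filename the file was saved under. Pass `directed=False` for categories such as collaboration or road networks, where an edge has no direction.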